ftrac

Descripción del contenido de la página

Herramienta en Forth para traducir caracteres dentro de una cadena de texto en subcadenas.

Etiquetas:

Escribí este programa en Forth para traducir caracteres, dentro de una cadena de texto, en cadenas de texto; es una versión más especializada y más rápida de su predecesor ftra.

Código fuente

ftrac.fs

.( ftrac.fs) cr

\ Copyright (C) 2005,2006,2007,2008,2013 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Character translation tool
\ Program written in Forth; developed with Forth 5mx and Gforth.

\ --------------------------------------------------------------
\ History

\ 2005-11-13
\ Start. First working version.

\ 2006-06-04
\ Bug fixed: 'trac' translated an empty string into an arbitrary one
\ character string. To fix it, '?do' has replaced 'do'.

\ 2006-06-23
\ '?do' does not work yet in Forth 5mx. Meanwhile, 'ftrac' uses an
\ 'if' instead.

\ 2006-09-25
\ '?do' used in 'ftrac'.
  
\ 2006-09-26
\ A bit faster version of 'ftrac', also with '?do'.
\ Second version of 'c>s', 33% faster.
\ Third version of 'c>s', almost 1% faster than the second one.

\ 2007-03-30
\ 'trac-array:' improved to activate the array just when created.

\ 2008-04-27
\ String length implemented.
\ This way the final translation can be longer than 255.

\ 2013-12-11
\ Source code style and stack notation updated; fixed some comments.

\ 2013-12-12
\ 'bounds' used instead of the idiom 'over + swap';
\ '?trac@' renamed to 'trac@?'.

\ --------------------------------------------------------------

marker ftrac

true constant long-translation?

\ String length management:
defer sl!  defer sl+  defer sl@
long-translation?
[if]    ' ! is sl!  ' @ is sl@  ' cell+ is sl+
[else]  ' c! is sl!  ' c@ is sl@  ' char+ is sl+
[then]
: +count  ( ca1 -- ca2 u )
  dup sl+ swap sl@
  ;
variable trac-array  \ address of the current translation table
: trac-array:  ( "<spaces>name" -- )
  \ Create a translation array.
  create  here 256 cells  dup allot
  over trac-array !  \ activate the just created array
  erase
  does>  ( -- ) ( dfa ) trac-array !  \ activate it
  ;
: trac!  ( c ca len -- )
  \ Update the current translation array
  \ to translate a char to a string.
  here >r s, align  cells trac-array @ + r> swap !
  ;  
: (trac@)  ( n -- ca | 0 )
  \ Fetch the content of an element of the current translation array.
  cells trac-array @ + @
  ;
: trac@?  ( c -- ca len true | false )
  \ Fetch the translation of a char
  \ according to the current translation array.
  (trac@) dup if  count true  then
  ;
false [if]
: c>s  ( c -- ca )
  \ Create a one char counted string.
  \ First version.
  1 pad c! pad 1+ c! pad
  ;
[then]
create str1  1 c, 0 c, align
str1 1+ constant (str1)
false [if]
: c>s  ( c -- ca )
  \ Create a one char counted string.
  \ Second version, 33% faster.
  (str1) c! str1
  ;
[then]
: c>s  ( c -- ca len )
  \ Create a one char counted string.
  \ Third version, almost 1% faster than the second one
  \ (considering the 'count' no more needed after calling).
   (str1) c! (str1) 1
   ;
: trac@  ( c -- ca len )
  \ Fetch the translation of a char according to the current translation array.
  [ false ] [if]
    dup (trac@) dup if  nip  else  drop c>s  then
  [else]  \ Alternative code
    dup trac@? if  rot drop  else  c>s  then
  [then]
  ;
variable translation  \ address of the translation
variable >translation  \ pointer to the next char in the translation
: translation+!  ( ca len -- )
  \ Add a string to the current translation.
  dup >r >translation @ swap cmove  \ move chars
  r> dup >translation +!  \ update char pointer
  translation @ dup sl@ rot + swap sl!  \ update string count
  ;
: translation0  ( -- )
  \ Init the translation.
  here
  dup translation !  \ address
  0 over sl!  \ nul lenght
  sl+ >translation !  \ init the pointer to the first char
  ;
: trac  ( ca1 len1 -- ca2 len2 )
  \ Translate a string using the current translation array.
  [ false ] [if]  
    \ Old version without ?do
    translation0  0 2dup =
    if    2drop
    else  do  dup i + c@ trac@ translation+!  loop
    then  drop  translation @ count
  [then]
  [ false ] [if]  
    translation0
    0 ?do  dup i + c@ trac@ translation+!  loop
    drop  translation @ count
  [then]
  \ Benchmark (building a whole website with fhp on Forth 5mx):
  \   3560 seconds with the previous version
  \   3447 seconds with this version (3% faster)
  translation0  bounds
  ?do  i c@ trac@ translation+!  loop
  translation @ +count
  ;
: .trac-array  ( -- )
  \ Print the content of the current translation array.
  255 0 do
    i trac@? if
      i dup 3 .r emit space type cr
      \ more 0= if  leave  then
    then
  loop
  ;

\ --------------------------------------------------------------
\ Benchmark

false [if]

: benchmark  ( loops -- )
  >r time&date
  r@ 0 do bl c>s count 2drop loop
  -seconds cr . ." seconds the old one"
  time&date
  r@ 0 do bl c>s1 2drop loop
  -seconds cr .  ." seconds the new one"
  r> drop
  ;
 
[then]

.( ftrac.fs compiled) cr

ftrac-utf8.fs

\ ****************************
CR .( ftrac-utf8 )
\ ****************************

\ Copyright (C) 2005 Marcos Cruz (http://programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftra:
\ Some translations from the Windows character set (windows-1252;  Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.

\ Special translations provided by Jukka "Yucca" Korpela in his web page.

\ Note: code positions 128·-·159 have been reserved	for eventual use as control codes.

\ 2005 Written
\ 2013-12-11 conditional 'include ftrac.f' to 'require ftrac.fs'.

require ./ftrac.fs

trac-array: trac-utf8 

\ CHAR € S" ..." trac!  \
CHAR ‚ S" ‚" trac!  \ Jukka
CHAR ƒ S" ƒ" trac!  \ Jukka
CHAR „ S" „" trac!  \ Jukka
CHAR … S" …" trac!  \ Jukka
CHAR † S" †" trac!  \ Jukka
CHAR ‡ S" ‡" trac!  \ Jukka
CHAR ˆ S" ˆ" trac!  \ Jukka
CHAR ‰ S" ‰" trac!  \ Jukka
CHAR Š S" Š" trac!  \ Jukka
CHAR ‹ S" ‹" trac!  \ Jukka
CHAR Œ S" Œ" trac!  \ Jukka
\ CHAR Ž S" ..." trac!
CHAR ‘ S" ‘" trac!  \ Jukka
CHAR ’ S" ߣ" trac!  \ Jukka
CHAR “ S" “" trac!  \ Jukka
CHAR ” S" ”" trac!  \ Jukka
CHAR • S" ߦ" trac!  \ Jukka
CHAR – S" –" trac!  \ Jukka
CHAR — S" —" trac!  \ Jukka
CHAR ˜ S" ˜" trac!  \ Jukka
CHAR ™ S" ™" trac!  \ Jukka
CHAR š S" š" trac!  \ Jukka
CHAR › S" ›" trac!  \ Jukka
CHAR œ S" œ" trac!  \ Jukka
\ CHAR ž S" ..." trac!
CHAR Ÿ S" Ÿ" trac!  \ Jukka
CHAR ¡ S" ¡" trac! 
CHAR ª S" ª" trac! 
CHAR « S" «" trac! 
\ CHAR ¬ S" ..." trac!
CHAR · S" ·" trac! 
CHAR º S" º" trac! 
CHAR » S" »" trac! 
CHAR ¿ S" ¿" trac! 
CHAR À S" À" trac! 
CHAR Á S" Á" trac! 
CHAR Ç S" Ç" trac! 
CHAR È S" È" trac! 
CHAR É S" É" trac! 
CHAR Í S" Í" trac! 
CHAR Ñ S" Ñ" trac! 
CHAR Ó S" Ó" trac! 
CHAR Ú S" Ú" trac! 
CHAR Ü S" Ü" trac! 
CHAR à S" à" trac! 
CHAR á S" á" trac! 
CHAR ä S" ä" trac! 
CHAR ç S" ç" trac! 
CHAR è S" è" trac! 
CHAR é S" é" trac! 
CHAR í S" í" trac! 
CHAR ñ S" ñ" trac! 
CHAR ó S" ó" trac! 
CHAR ô S" ô" trac! 
CHAR ú S" ú" trac! 
CHAR ü S" ü" trac!

.(  ftrac-utf8 ok!)

ftrac-utf8es.fs

\ ****************************
CR .( ftrac-utf8es )
\ ****************************

\ Copyright (C) 2005 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftra:
\ Translation from the Windows character set (windows-1252;  Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.
\ Sorted to translate Spanish texts more quickly.

\ Special translations provided by Jukka "Yucca" Korpela in his web page.

[undefined] ftrac [IF]
	include ftrac.f
[THEN]

trac-array: trac-utf8es 

\ The most used, Spanish chars

CHAR á S" á" trac!
CHAR é S" é" trac!
CHAR í S" í" trac!
CHAR ó S" ó" trac!
CHAR ú S" ú" trac!
CHAR ñ S" ñ" trac!
CHAR Á S" Á" trac!
CHAR É S" É" trac!
CHAR Í S" Í" trac!
CHAR Ó S" Ó" trac!
CHAR Ú S" Ú" trac!
CHAR Ñ S" Ñ" trac!
CHAR ¿ S" ¿" trac!
CHAR ¡ S" ¡" trac!
CHAR ü S" ü" trac!
CHAR Ü S" Ü" trac!

\ Some tipographical chars

CHAR « S" «" trac!
CHAR · S" ·" trac!
CHAR ª S" ª" trac!
CHAR º S" º" trac!
CHAR » S" »" trac!

\ Chars from other languages

CHAR à S" à" trac!
CHAR ä S" ä" trac!
CHAR ç S" ç" trac!
CHAR À S" À" trac!
CHAR Ç S" Ç" trac!
CHAR È S" È" trac!
CHAR è S" è" trac! 
CHAR ô S" ô" trac!
CHAR ò S" ò" trac!
CHAR Ò S" Ò" trac!

\ Special chars

160 S"  " trac!  \ 160

\ Problem chars (128-159)

false [IF]

\ Original translations for chars 128-159

CHAR ‚ S" ‚" trac!  \ Jukka
CHAR ƒ S" ƒ" trac!  \ Jukka
CHAR „ S" „" trac!  \ Jukka
CHAR … S" …" trac!  \ Jukka
CHAR † S" †" trac!  \ Jukka
CHAR ‡ S" ‡" trac!  \ Jukka
CHAR ˆ S" ˆ" trac!  \ Jukka
CHAR ‰ S" ‰" trac!  \ Jukka
CHAR Š S" Š" trac!  \ Jukka
CHAR ‹ S" ‹" trac!  \ Jukka
CHAR Œ S" Œ" trac!  \ Jukka
CHAR ‘ S" ‘" trac!  \ Jukka
CHAR ’ S" ߣ" trac!  \ Jukka
CHAR “ S" “" trac!  \ Jukka
CHAR ” S" ”" trac!  \ Jukka
CHAR • S" ߦ" trac!  \ Jukka
CHAR – S" –" trac!  \ Jukka
CHAR — S" —" trac!  \ Jukka
CHAR ˜ S" ˜" trac!  \ Jukka
CHAR ™ S" ™" trac!  \ Jukka
CHAR š S" š" trac!  \ Jukka
CHAR › S" ›" trac!  \ Jukka
CHAR œ S" œ" trac!  \ Jukka
CHAR Ÿ S" Ÿ" trac!  \ Jukka

[ELSE]

\ 2006 04 03

\ Actual secure translations for chars 128-159

CHAR € S" EUR" trac!  \ 128
CHAR ‚ S" '" trac!  \ 130
CHAR „ S" "" trac!  \ 132
CHAR ‹ S" <" trac!  \ 139
CHAR ΠS" OE" trac!  \ 140
CHAR ‘ S" '" trac!  \ 145
CHAR ’ S" '" trac!  \ 146
CHAR “ S" "" trac!  \ 147
CHAR ” S" "" trac!  \ 148
CHAR • S" ·" trac!  \ 149
CHAR – S" -" trac!  \ 150
CHAR — S"  - " trac!  \ 151
CHAR ˜ S" ~" trac!  \ 152
CHAR ™ S" (TM)" trac!  \ 153
CHAR › S" >" trac!  \ 155
CHAR œ S" oe" trac!  \ 156

[THEN]

.(  ftrac-utf8es ok!)

ftrac-uri.fs

\ ****************************
CR .( ftrac-uri )
\ ****************************

\ Copyright (C) 2008 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftrac:
\ Some translations to use URI as link parameteres,
\ for example in social bookmarking.

[undefined] ftrac [IF]
	include ftrac.f
[THEN]

\ Table to translate URIs

trac-array: trac-uri-EX

BL S" %20" trac!
CHAR " S" %22" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!

\ Table to translate URIs passed as parameters inside other URIs

trac-array: trac-uri

BL S" %20" trac!
CHAR " S" %22" trac!
CHAR & S" %26" trac!
CHAR + S" %2B" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!
CHAR = S" %3D" trac!
CHAR ? S" %3F" trac!

.(  ftrac-uri ok!)

Descargas

Páginas relacionadas

ftra
Herramienta en Forth para traducir subcadenas de texto.