ftrac
Descripción del contenido de la página
Herramienta en Forth para traducir caracteres dentro de una cadena de texto en subcadenas.
Escribí este programa en Forth para traducir caracteres, dentro de una cadena de texto, en cadenas de texto; es una versión más especializada y más rápida de su predecesor ftra.
Código fuente
ftrac.fs
.( ftrac.fs) cr
\ Copyright (C) 2005,2006,2007,2008,2013 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license
\ Character translation tool
\ Program written in Forth; developed with Forth 5mx and Gforth.
\ --------------------------------------------------------------
\ History
\ 2005-11-13
\ Start. First working version.
\ 2006-06-04
\ Bug fixed: 'trac' translated an empty string into an arbitrary one
\ character string. To fix it, '?do' has replaced 'do'.
\ 2006-06-23
\ '?do' does not work yet in Forth 5mx. Meanwhile, 'ftrac' uses an
\ 'if' instead.
\ 2006-09-25
\ '?do' used in 'ftrac'.
\ 2006-09-26
\ A bit faster version of 'ftrac', also with '?do'.
\ Second version of 'c>s', 33% faster.
\ Third version of 'c>s', almost 1% faster than the second one.
\ 2007-03-30
\ 'trac-array:' improved to activate the array just when created.
\ 2008-04-27
\ String length implemented.
\ This way the final translation can be longer than 255.
\ 2013-12-11
\ Source code style and stack notation updated; fixed some comments.
\ 2013-12-12
\ 'bounds' used instead of the idiom 'over + swap';
\ '?trac@' renamed to 'trac@?'.
\ --------------------------------------------------------------
marker ftrac
true constant long-translation?
\ String length management:
defer sl! defer sl+ defer sl@
long-translation?
[if] ' ! is sl! ' @ is sl@ ' cell+ is sl+
[else] ' c! is sl! ' c@ is sl@ ' char+ is sl+
[then]
: +count ( ca1 -- ca2 u )
dup sl+ swap sl@
;
variable trac-array \ address of the current translation table
: trac-array: ( "<spaces>name" -- )
\ Create a translation array.
create here 256 cells dup allot
over trac-array ! \ activate the just created array
erase
does> ( -- ) ( dfa ) trac-array ! \ activate it
;
: trac! ( c ca len -- )
\ Update the current translation array
\ to translate a char to a string.
here >r s, align cells trac-array @ + r> swap !
;
: (trac@) ( n -- ca | 0 )
\ Fetch the content of an element of the current translation array.
cells trac-array @ + @
;
: trac@? ( c -- ca len true | false )
\ Fetch the translation of a char
\ according to the current translation array.
(trac@) dup if count true then
;
false [if]
: c>s ( c -- ca )
\ Create a one char counted string.
\ First version.
1 pad c! pad 1+ c! pad
;
[then]
create str1 1 c, 0 c, align
str1 1+ constant (str1)
false [if]
: c>s ( c -- ca )
\ Create a one char counted string.
\ Second version, 33% faster.
(str1) c! str1
;
[then]
: c>s ( c -- ca len )
\ Create a one char counted string.
\ Third version, almost 1% faster than the second one
\ (considering the 'count' no more needed after calling).
(str1) c! (str1) 1
;
: trac@ ( c -- ca len )
\ Fetch the translation of a char according to the current translation array.
[ false ] [if]
dup (trac@) dup if nip else drop c>s then
[else] \ Alternative code
dup trac@? if rot drop else c>s then
[then]
;
variable translation \ address of the translation
variable >translation \ pointer to the next char in the translation
: translation+! ( ca len -- )
\ Add a string to the current translation.
dup >r >translation @ swap cmove \ move chars
r> dup >translation +! \ update char pointer
translation @ dup sl@ rot + swap sl! \ update string count
;
: translation0 ( -- )
\ Init the translation.
here
dup translation ! \ address
0 over sl! \ nul lenght
sl+ >translation ! \ init the pointer to the first char
;
: trac ( ca1 len1 -- ca2 len2 )
\ Translate a string using the current translation array.
[ false ] [if]
\ Old version without ?do
translation0 0 2dup =
if 2drop
else do dup i + c@ trac@ translation+! loop
then drop translation @ count
[then]
[ false ] [if]
translation0
0 ?do dup i + c@ trac@ translation+! loop
drop translation @ count
[then]
\ Benchmark (building a whole website with fhp on Forth 5mx):
\ 3560 seconds with the previous version
\ 3447 seconds with this version (3% faster)
translation0 bounds
?do i c@ trac@ translation+! loop
translation @ +count
;
: .trac-array ( -- )
\ Print the content of the current translation array.
255 0 do
i trac@? if
i dup 3 .r emit space type cr
\ more 0= if leave then
then
loop
;
\ --------------------------------------------------------------
\ Benchmark
false [if]
: benchmark ( loops -- )
>r time&date
r@ 0 do bl c>s count 2drop loop
-seconds cr . ." seconds the old one"
time&date
r@ 0 do bl c>s1 2drop loop
-seconds cr . ." seconds the new one"
r> drop
;
[then]
.( ftrac.fs compiled) cr
ftrac-utf8.fs
\ ****************************
CR .( ftrac-utf8 )
\ ****************************
\ Copyright (C) 2005 Marcos Cruz (http://programandala.net)
\ License: http://programandala.net/license
\ Translation table for ftra:
\ Some translations from the Windows character set (windows-1252; Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.
\ Special translations provided by Jukka "Yucca" Korpela in his web page.
\ Note: code positions 128·-·159 have been reserved for eventual use as control codes.
\ 2005 Written
\ 2013-12-11 conditional 'include ftrac.f' to 'require ftrac.fs'.
require ./ftrac.fs
trac-array: trac-utf8
\ CHAR € S" ..." trac! \
CHAR ‚ S" ‚" trac! \ Jukka
CHAR ƒ S" ƒ" trac! \ Jukka
CHAR „ S" „" trac! \ Jukka
CHAR … S" …" trac! \ Jukka
CHAR † S" †" trac! \ Jukka
CHAR ‡ S" ‡" trac! \ Jukka
CHAR ˆ S" ˆ" trac! \ Jukka
CHAR ‰ S" ‰" trac! \ Jukka
CHAR Š S" Š" trac! \ Jukka
CHAR ‹ S" ‹" trac! \ Jukka
CHAR Œ S" Œ" trac! \ Jukka
\ CHAR Ž S" ..." trac!
CHAR ‘ S" ‘" trac! \ Jukka
CHAR ’ S" ߣ" trac! \ Jukka
CHAR “ S" “" trac! \ Jukka
CHAR ” S" ”" trac! \ Jukka
CHAR • S" ߦ" trac! \ Jukka
CHAR – S" –" trac! \ Jukka
CHAR — S" —" trac! \ Jukka
CHAR ˜ S" ˜" trac! \ Jukka
CHAR ™ S" ™" trac! \ Jukka
CHAR š S" š" trac! \ Jukka
CHAR › S" ›" trac! \ Jukka
CHAR œ S" œ" trac! \ Jukka
\ CHAR ž S" ..." trac!
CHAR Ÿ S" Ÿ" trac! \ Jukka
CHAR ¡ S" ¡" trac!
CHAR ª S" ª" trac!
CHAR « S" «" trac!
\ CHAR ¬ S" ..." trac!
CHAR · S" ·" trac!
CHAR º S" º" trac!
CHAR » S" »" trac!
CHAR ¿ S" ¿" trac!
CHAR À S" À" trac!
CHAR Á S" Á" trac!
CHAR Ç S" Ç" trac!
CHAR È S" È" trac!
CHAR É S" É" trac!
CHAR Í S" Í" trac!
CHAR Ñ S" Ñ" trac!
CHAR Ó S" Ó" trac!
CHAR Ú S" Ú" trac!
CHAR Ü S" Ü" trac!
CHAR à S" à" trac!
CHAR á S" á" trac!
CHAR ä S" ä" trac!
CHAR ç S" ç" trac!
CHAR è S" è" trac!
CHAR é S" é" trac!
CHAR í S" í" trac!
CHAR ñ S" ñ" trac!
CHAR ó S" ó" trac!
CHAR ô S" ô" trac!
CHAR ú S" ú" trac!
CHAR ü S" ü" trac!
.( ftrac-utf8 ok!)
ftrac-utf8es.fs
\ ****************************
CR .( ftrac-utf8es )
\ ****************************
\ Copyright (C) 2005 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license
\ Translation table for ftra:
\ Translation from the Windows character set (windows-1252; Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.
\ Sorted to translate Spanish texts more quickly.
\ Special translations provided by Jukka "Yucca" Korpela in his web page.
[undefined] ftrac [IF]
include ftrac.f
[THEN]
trac-array: trac-utf8es
\ The most used, Spanish chars
CHAR á S" á" trac!
CHAR é S" é" trac!
CHAR í S" í" trac!
CHAR ó S" ó" trac!
CHAR ú S" ú" trac!
CHAR ñ S" ñ" trac!
CHAR Á S" Á" trac!
CHAR É S" É" trac!
CHAR Í S" Í" trac!
CHAR Ó S" Ó" trac!
CHAR Ú S" Ú" trac!
CHAR Ñ S" Ñ" trac!
CHAR ¿ S" ¿" trac!
CHAR ¡ S" ¡" trac!
CHAR ü S" ü" trac!
CHAR Ü S" Ü" trac!
\ Some tipographical chars
CHAR « S" «" trac!
CHAR · S" ·" trac!
CHAR ª S" ª" trac!
CHAR º S" º" trac!
CHAR » S" »" trac!
\ Chars from other languages
CHAR à S" à" trac!
CHAR ä S" ä" trac!
CHAR ç S" ç" trac!
CHAR À S" À" trac!
CHAR Ç S" Ç" trac!
CHAR È S" È" trac!
CHAR è S" è" trac!
CHAR ô S" ô" trac!
CHAR ò S" ò" trac!
CHAR Ò S" Ò" trac!
\ Special chars
160 S" " trac! \ 160
\ Problem chars (128-159)
false [IF]
\ Original translations for chars 128-159
CHAR ‚ S" ‚" trac! \ Jukka
CHAR ƒ S" ƒ" trac! \ Jukka
CHAR „ S" „" trac! \ Jukka
CHAR … S" …" trac! \ Jukka
CHAR † S" †" trac! \ Jukka
CHAR ‡ S" ‡" trac! \ Jukka
CHAR ˆ S" ˆ" trac! \ Jukka
CHAR ‰ S" ‰" trac! \ Jukka
CHAR Š S" Š" trac! \ Jukka
CHAR ‹ S" ‹" trac! \ Jukka
CHAR Œ S" Œ" trac! \ Jukka
CHAR ‘ S" ‘" trac! \ Jukka
CHAR ’ S" ߣ" trac! \ Jukka
CHAR “ S" “" trac! \ Jukka
CHAR ” S" ”" trac! \ Jukka
CHAR • S" ߦ" trac! \ Jukka
CHAR – S" –" trac! \ Jukka
CHAR — S" —" trac! \ Jukka
CHAR ˜ S" ˜" trac! \ Jukka
CHAR ™ S" ™" trac! \ Jukka
CHAR š S" š" trac! \ Jukka
CHAR › S" ›" trac! \ Jukka
CHAR œ S" œ" trac! \ Jukka
CHAR Ÿ S" Ÿ" trac! \ Jukka
[ELSE]
\ 2006 04 03
\ Actual secure translations for chars 128-159
CHAR € S" EUR" trac! \ 128
CHAR ‚ S" '" trac! \ 130
CHAR „ S" "" trac! \ 132
CHAR ‹ S" <" trac! \ 139
CHAR Œ S" OE" trac! \ 140
CHAR ‘ S" '" trac! \ 145
CHAR ’ S" '" trac! \ 146
CHAR “ S" "" trac! \ 147
CHAR ” S" "" trac! \ 148
CHAR • S" ·" trac! \ 149
CHAR – S" -" trac! \ 150
CHAR — S" - " trac! \ 151
CHAR ˜ S" ~" trac! \ 152
CHAR ™ S" (TM)" trac! \ 153
CHAR › S" >" trac! \ 155
CHAR œ S" oe" trac! \ 156
[THEN]
.( ftrac-utf8es ok!)
ftrac-uri.fs
\ ****************************
CR .( ftrac-uri )
\ ****************************
\ Copyright (C) 2008 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license
\ Translation table for ftrac:
\ Some translations to use URI as link parameteres,
\ for example in social bookmarking.
[undefined] ftrac [IF]
include ftrac.f
[THEN]
\ Table to translate URIs
trac-array: trac-uri-EX
BL S" %20" trac!
CHAR " S" %22" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!
\ Table to translate URIs passed as parameters inside other URIs
trac-array: trac-uri
BL S" %20" trac!
CHAR " S" %22" trac!
CHAR & S" %26" trac!
CHAR + S" %2B" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!
CHAR = S" %3D" trac!
CHAR ? S" %3F" trac!
.( ftrac-uri ok!)