ftrac

Description of the page content

Forth tool that translates characters in a string into substrings.

Tags:

I wrote this program in Forth in order to translate characters of a string into substrings. it's a faster and more specilized version of its predecessor, ftra.

Source code

ftrac.fs

.( ftrac.fs) cr

\ Copyright (C) 2005,2006,2007,2008,2013 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Character translation tool
\ Program written in Forth; developed with Forth 5mx and Gforth.

\ --------------------------------------------------------------
\ History

\ 2005-11-13
\ Start. First working version.

\ 2006-06-04
\ Bug fixed: 'trac' translated an empty string into an arbitrary one
\ character string. To fix it, '?do' has replaced 'do'.

\ 2006-06-23
\ '?do' does not work yet in Forth 5mx. Meanwhile, 'ftrac' uses an
\ 'if' instead.

\ 2006-09-25
\ '?do' used in 'ftrac'.

\ 2006-09-26
\ A bit faster version of 'ftrac', also with '?do'.
\ Second version of 'c>s', 33% faster.
\ Third version of 'c>s', almost 1% faster than the second one.

\ 2007-03-30
\ 'trac-array:' improved to activate the array just when created.

\ 2008-04-27
\ String length implemented.
\ This way the final translation can be longer than 255.

\ 2013-12-11
\ Source code style and stack notation updated; fixed some comments.

\ 2013-12-12
\ 'bounds' used instead of the idiom 'over + swap';
\ '?trac@' renamed to 'trac@?'.

\ --------------------------------------------------------------

marker ftrac

true constant long-translation?

\ String length management:
defer sl!  defer sl+  defer sl@
long-translation?
[if]    ' ! is sl!  ' @ is sl@  ' cell+ is sl+
[else]  ' c! is sl!  ' c@ is sl@  ' char+ is sl+
[then]
: +count  ( ca1 -- ca2 u )
  dup sl+ swap sl@
  ;
variable trac-array  \ address of the current translation table
: trac-array:  ( "<spaces>name" -- )
  \ Create a translation array.
  create  here 256 cells  dup allot
  over trac-array !  \ activate the just created array
  erase
  does>  ( -- ) ( dfa ) trac-array !  \ activate it
  ;
: trac!  ( c ca len -- )
  \ Update the current translation array
  \ to translate a char to a string.
  here >r s, align  cells trac-array @ + r> swap !
  ;
: (trac@)  ( n -- ca | 0 )
  \ Fetch the content of an element of the current translation array.
  cells trac-array @ + @
  ;
: trac@?  ( c -- ca len true | false )
  \ Fetch the translation of a char
  \ according to the current translation array.
  (trac@) dup if  count true  then
  ;
false [if]
: c>s  ( c -- ca )
  \ Create a one char counted string.
  \ First version.
  1 pad c! pad 1+ c! pad
  ;
[then]
create str1  1 c, 0 c, align
str1 1+ constant (str1)
false [if]
: c>s  ( c -- ca )
  \ Create a one char counted string.
  \ Second version, 33% faster.
  (str1) c! str1
  ;
[then]
: c>s  ( c -- ca len )
  \ Create a one char counted string.
  \ Third version, almost 1% faster than the second one
  \ (considering the 'count' no more needed after calling).
   (str1) c! (str1) 1
   ;
: trac@  ( c -- ca len )
  \ Fetch the translation of a char according to the current translation array.
  [ false ] [if]
    dup (trac@) dup if  nip  else  drop c>s  then
  [else]  \ Alternative code
    dup trac@? if  rot drop  else  c>s  then
  [then]
  ;
variable translation  \ address of the translation
variable >translation  \ pointer to the next char in the translation
: translation+!  ( ca len -- )
  \ Add a string to the current translation.
  dup >r >translation @ swap cmove  \ move chars
  r> dup >translation +!  \ update char pointer
  translation @ dup sl@ rot + swap sl!  \ update string count
  ;
: translation0  ( -- )
  \ Init the translation.
  here
  dup translation !  \ address
  0 over sl!  \ nul lenght
  sl+ >translation !  \ init the pointer to the first char
  ;
: trac  ( ca1 len1 -- ca2 len2 )
  \ Translate a string using the current translation array.
  [ false ] [if]
    \ Old version without ?do
    translation0  0 2dup =
    if    2drop
    else  do  dup i + c@ trac@ translation+!  loop
    then  drop  translation @ count
  [then]
  [ false ] [if]
    translation0
    0 ?do  dup i + c@ trac@ translation+!  loop
    drop  translation @ count
  [then]
  \ Benchmark (building a whole website with fhp on Forth 5mx):
  \   3560 seconds with the previous version
  \   3447 seconds with this version (3% faster)
  translation0  bounds
  ?do  i c@ trac@ translation+!  loop
  translation @ +count
  ;
: .trac-array  ( -- )
  \ Print the content of the current translation array.
  255 0 do
    i trac@? if
      i dup 3 .r emit space type cr
      \ more 0= if  leave  then
    then
  loop
  ;

\ --------------------------------------------------------------
\ Benchmark

false [if]

: benchmark  ( loops -- )
  >r time&date
  r@ 0 do bl c>s count 2drop loop
  -seconds cr . ." seconds the old one"
  time&date
  r@ 0 do bl c>s1 2drop loop
  -seconds cr .  ." seconds the new one"
  r> drop
  ;

[then]

.( ftrac.fs compiled) cr

ftrac-utf8.fs

\ ****************************
CR .( ftrac-utf8 )
\ ****************************

\ Copyright (C) 2005 Marcos Cruz (http://programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftra:
\ Some translations from the Windows character set (windows-1252;  Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.

\ Special translations provided by Jukka "Yucca" Korpela in his web page.

\ Note: code positions 128·-·159 have been reserved for eventual use as control codes.

\ 2005 Written
\ 2013-12-11 conditional 'include ftrac.f' to 'require ftrac.fs'.

require ./ftrac.fs

trac-array: trac-utf8

\ CHAR € S" ..." trac!  \
CHARS" &#8218;" trac!  \ Jukka
CHAR ƒ S" &#402;" trac!  \ Jukka
CHARS" &#8222;" trac!  \ Jukka
CHARS" &#8230;" trac!  \ Jukka
CHARS" &#8224;" trac!  \ Jukka
CHARS" &#8225;" trac!  \ Jukka
CHAR ˆ S" &#710;" trac!  \ Jukka
CHARS" &#8240;" trac!  \ Jukka
CHAR Š S" &#352;" trac!  \ Jukka
CHARS" &#8249;" trac!  \ Jukka
CHAR ΠS" &#338;" trac!  \ Jukka
\ CHAR Ž S" ..." trac!
CHARS" &#8216;" trac!  \ Jukka
CHARS" &#2019;" trac!  \ Jukka
CHARS" &#8220;" trac!  \ Jukka
CHARS" &#8221;" trac!  \ Jukka
CHARS" &#2022;" trac!  \ Jukka
CHARS" &#8211;" trac!  \ Jukka
CHARS" &#8212;" trac!  \ Jukka
CHAR ˜ S" &#732;" trac!  \ Jukka
CHARS" &#8482;" trac!  \ Jukka
CHAR š S" &#353;" trac!  \ Jukka
CHARS" &#8250;" trac!  \ Jukka
CHAR œ S" &#339;" trac!  \ Jukka
\ CHAR ž S" ..." trac!
CHAR Ÿ S" &#376;" trac!  \ Jukka
CHAR ¡ S" &#161;" trac!
CHAR ª S" &#170;" trac!
CHAR « S" &#171;" trac!
\ CHAR ¬ S" ..." trac!
CHAR · S" &#183;" trac!
CHAR º S" &#186;" trac!
CHAR » S" &#187;" trac!
CHAR ¿ S" &#191;" trac!
CHAR À S" &#192;" trac!
CHAR Á S" &#193;" trac!
CHAR Ç S" &#199;" trac!
CHAR È S" &#200;" trac!
CHAR É S" &#201;" trac!
CHAR Í S" &#205;" trac!
CHAR Ñ S" &#209;" trac!
CHAR Ó S" &#211;" trac!
CHAR Ú S" &#218;" trac!
CHAR Ü S" &#220;" trac!
CHAR à S" &#224;" trac!
CHAR á S" &#225;" trac!
CHAR ä S" &#228;" trac!
CHAR ç S" &#231;" trac!
CHAR è S" &#232;" trac!
CHAR é S" &#233;" trac!
CHAR í S" &#237;" trac!
CHAR ñ S" &#241;" trac!
CHAR ó S" &#243;" trac!
CHAR ô S" &#244;" trac!
CHAR ú S" &#250;" trac!
CHAR ü S" &#252;" trac!

.(  ftrac-utf8 ok!)

ftrac-utf8es.fs

\ ****************************
CR .( ftrac-utf8es )
\ ****************************

\ Copyright (C) 2005 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftra:
\ Translation from the Windows character set (windows-1252;  Unofficial synonyms include cp-1252 and WinLatin1.) into numeric HTML Unicode.
\ Sorted to translate Spanish texts more quickly.

\ Special translations provided by Jukka "Yucca" Korpela in his web page.

[undefined] ftrac [IF]
  include ftrac.f
[THEN]

trac-array: trac-utf8es

\ The most used, Spanish chars

CHAR á S" &#225;" trac!
CHAR é S" &#233;" trac!
CHAR í S" &#237;" trac!
CHAR ó S" &#243;" trac!
CHAR ú S" &#250;" trac!
CHAR ñ S" &#241;" trac!
CHAR Á S" &#193;" trac!
CHAR É S" &#201;" trac!
CHAR Í S" &#205;" trac!
CHAR Ó S" &#211;" trac!
CHAR Ú S" &#218;" trac!
CHAR Ñ S" &#209;" trac!
CHAR ¿ S" &#191;" trac!
CHAR ¡ S" &#161;" trac!
CHAR ü S" &#252;" trac!
CHAR Ü S" &#220;" trac!

\ Some tipographical chars

CHAR « S" &#171;" trac!
CHAR · S" &#183;" trac!
CHAR ª S" &#170;" trac!
CHAR º S" &#186;" trac!
CHAR » S" &#187;" trac!

\ Chars from other languages

CHAR à S" &#224;" trac!
CHAR ä S" &#228;" trac!
CHAR ç S" &#231;" trac!
CHAR À S" &#192;" trac!
CHAR Ç S" &#199;" trac!
CHAR È S" &#200;" trac!
CHAR è S" &#232;" trac!
CHAR ô S" &#244;" trac!
CHAR ò S" &#242;" trac!
CHAR Ò S" &#210;" trac!

\ Special chars

160 S" &nbsp;" trac!  \ 160

\ Problem chars (128-159)

false [IF]

\ Original translations for chars 128-159

CHARS" &#8218;" trac!  \ Jukka
CHAR ƒ S" &#402;" trac!  \ Jukka
CHARS" &#8222;" trac!  \ Jukka
CHARS" &#8230;" trac!  \ Jukka
CHARS" &#8224;" trac!  \ Jukka
CHARS" &#8225;" trac!  \ Jukka
CHAR ˆ S" &#710;" trac!  \ Jukka
CHARS" &#8240;" trac!  \ Jukka
CHAR Š S" &#352;" trac!  \ Jukka
CHARS" &#8249;" trac!  \ Jukka
CHAR ΠS" &#338;" trac!  \ Jukka
CHARS" &#8216;" trac!  \ Jukka
CHARS" &#2019;" trac!  \ Jukka
CHARS" &#8220;" trac!  \ Jukka
CHARS" &#8221;" trac!  \ Jukka
CHARS" &#2022;" trac!  \ Jukka
CHARS" &#8211;" trac!  \ Jukka
CHARS" &#8212;" trac!  \ Jukka
CHAR ˜ S" &#732;" trac!  \ Jukka
CHARS" &#8482;" trac!  \ Jukka
CHAR š S" &#353;" trac!  \ Jukka
CHARS" &#8250;" trac!  \ Jukka
CHAR œ S" &#339;" trac!  \ Jukka
CHAR Ÿ S" &#376;" trac!  \ Jukka

[ELSE]

\ 2006 04 03

\ Actual secure translations for chars 128-159

CHARS" EUR" trac!  \ 128
CHARS" &#39;" trac!  \ 130
CHARS" &#34;" trac!  \ 132
CHARS" &#60;" trac!  \ 139
CHAR ΠS" OE" trac!  \ 140
CHARS" &#39;" trac!  \ 145
CHARS" &#39;" trac!  \ 146
CHARS" &#34;" trac!  \ 147
CHARS" &#34;" trac!  \ 148
CHARS" &#183;" trac!  \ 149
CHARS" -" trac!  \ 150
CHARS"  - " trac!  \ 151
CHAR ˜ S" &#126;" trac!  \ 152
CHARS" (TM)" trac!  \ 153
CHARS" &#62;" trac!  \ 155
CHAR œ S" oe" trac!  \ 156

[THEN]

.(  ftrac-utf8es ok!)

ftrac-uri.fs

\ ****************************
CR .( ftrac-uri )
\ ****************************

\ Copyright (C) 2008 Marcos Cruz (programandala.net)
\ License: http://programandala.net/license

\ Translation table for ftrac:
\ Some translations to use URI as link parameteres,
\ for example in social bookmarking.

[undefined] ftrac [IF]
  include ftrac.f
[THEN]

\ Table to translate URIs

trac-array: trac-uri-EX

BL S" %20" trac!
CHAR " S" %22" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!

\ Table to translate URIs passed as parameters inside other URIs

trac-array: trac-uri

BL S" %20" trac!
CHAR " S" %22" trac!
CHAR & S" %26" trac!
CHAR + S" %2B" trac!
CHAR / S" %2F" trac!
CHAR : S" %3A" trac!
CHAR = S" %3D" trac!
CHAR ? S" %3F" trac!

.(  ftrac-uri ok!)

Downloads

Related pages

ftra
Forth tool that translates substrings.