caseness
Description of the page content
Forth program to implement some words to change the case of characters and text strings.
The objective of this little program I wrote in Forth with Gforth is explained in its source code.
I wrote these Forth words as primitives in my Forth 5mx, and I needed them in high level version to run some programs of mine under other Forth systems.
Source code
CR .( caseness )
\ Copyright (C) 2007 Marcos Cruz (http://programandala.net)
\ License/Licencia/Permesilo: http://programandala.net/license
\ This Forth program implements the following common use words:
\ tolower toupper lower upper
\ My own Forth system, Forth 5mx, has those words as native,
\ but I needed a high level version
\ to use my Forth programs in other Forth systems that don't have them.
\ My approach is not perfect or complete (in fact it's too complex)
\ but it suited my needs:
\ a lookup array stores the "caseness" of every letter
\ and the change is done simply changing the 6th bit of the char.
\ In ISO-8859-1 and Windows-1252 this works for all letters
\ but the following pairs:
\ Y/y with diaeresis.
\ S/s with caron.
\ Z/z with caron.
\ OE/oe.
\ A much simpler approach would be to store the pair of every char
\ (or zero if it doesn't have one); then all chars could be treated.
\ This program works only with single byte encodings.
\ Written in ANS Forth.
\ 2007-12-08 First version.
\ 2007-12-15 Improved: two IF structures removed.
char u constant uppercase-id \ caseness id
char l constant lowercase-id \ caseness id
256 constant #chars
create caseness-array #chars chars allot
caseness-array #chars chars erase
: caseness@ ( c -- u ) caseness-array + c@ ;
: caseness! ( u c -- ) caseness-array + c! ;
: all-caseness! ( u1 u2 u3 -- )
\ Store a caseness id into several char positions in the caseness array.
\ u1 = caseness id
\ u2 = first char
\ u3 = last char
\ 2007-12-08
1+ swap do
dup i caseness!
loop drop
;
\ Init all the caseness:
uppercase-id char A char Z all-caseness!
lowercase-id char a char z all-caseness!
uppercase-id 192 222 all-caseness! \ from "A grave" to uppercase thorn
lowercase-id 224 254 all-caseness! \ from "a grave" to lowercase thorn
false [if] \ obsolete, not used
: change-case ( c1 -- c2 )
\ Change the case of a char.
\ 2007-12-08
32 xor
;
[then]
: ?change-case ( c1 f -- c2 )
\ Change the case of a char if needed.
\ Only some chars can not be translated this way.
\ c1 = char to change
\ f = change char?
\ c2 = char changed if needed
\ 2007-12-15
32 and xor
;
[undefined] tolower [if]
: uppercase? ( u -- flag ) caseness@ uppercase-id = ;
: tolower ( c1 -- c2 | c1 )
\ Change a char to lower case if possible.
\ 2007-12-08 First version.
\ dup uppercase? if change-case then
\ 2007-12-08 Second version, without control structures.
dup uppercase? ?change-case
;
[then]
[undefined] toupper [if]
: lowercase? ( u -- flag ) caseness@ lowercase-id = ;
: toupper ( c1 -- c2 | c1 )
\ Change a char to upper case if possible.
\ 2007-12-08 First version.
\ dup lowercase? if change-case then
\ 2007-12-15 Second version, without control structures.
dup lowercase? ?change-case
;
[then]
[undefined] lower [if]
: lower ( c-addr u -- )
\ Change the string to lower case.
\ 2007-12-08
over + swap do
i c@ tolower i c!
loop
;
[then]
[undefined] upper [if]
: upper ( c-addr u -- )
\ Change the string to upper case.
\ 2007-12-08
over + swap do
i c@ toupper i c!
loop
;
[then]
true [if] \ used for debugging
: .caseness-array ( -- )
\ Show the content of the caseness array.
\ Just for debugging.
\ 2007-12-08 First version.
\ 2010-01-25 Better printout.
#chars 0 do
i caseness@ ?dup if
i dup . [char] ( emit emit [char] ) emit [char] : emit emit [char] , emit space
then
loop
;
[then]
.( caseness ok!)