caseness

Description of the page content

Forth program to implement some words to change the case of characters and text strings.

Tags:

The objective of this little program I wrote in Forth with Gforth is explained in its source code.

I wrote these Forth words as primitives in my Forth 5mx, and I needed them in high level version to run some programs of mine under other Forth systems.

Source code

CR .( caseness )

\ Copyright (C) 2007 Marcos Cruz (http://programandala.net)
\ License/Licencia/Permesilo: http://programandala.net/license

\ This Forth program implements the following common use words:
\   tolower toupper lower upper

\ My own Forth system, Forth 5mx, has those words as native,
\ but I needed a high level version
\ to use my Forth programs in other Forth systems that don't have them.

\ My approach is not perfect or complete (in fact it's too complex) 
\ but it suited my needs:
\ a lookup array stores the "caseness" of every letter
\ and the change is done simply changing the 6th bit of the char.
\ In ISO-8859-1 and Windows-1252 this works for all letters 
\ but the following pairs:
\   Y/y with diaeresis.
\   S/s with caron.
\   Z/z with caron.
\   OE/oe.
\ A much simpler approach would be to store the pair of every char 
\ (or zero if it doesn't have one); then all chars could be treated.
\ This program works only with single byte encodings.

\ Written in ANS Forth. 

\ 2007-12-08 First version.
\ 2007-12-15 Improved: two IF structures removed.

char u constant uppercase-id  \ caseness id
char l constant lowercase-id  \ caseness id
256 constant #chars

create caseness-array #chars chars allot
caseness-array #chars chars erase

: caseness@  ( c -- u ) caseness-array + c@  ;
: caseness!  ( u c -- )  caseness-array + c!  ;

: all-caseness!  ( u1 u2 u3 -- )

	\ Store a caseness id into several char positions in the caseness array.

	\ u1 = caseness id
	\ u2 = first char
	\ u3 = last char

	\ 2007-12-08

	1+ swap  do
		dup i caseness!
	loop  drop

	;

\ Init all the caseness:
uppercase-id char A char Z all-caseness!
lowercase-id char a char z all-caseness!
uppercase-id 192 222 all-caseness!  \ from "A grave" to uppercase thorn
lowercase-id 224 254 all-caseness!  \ from "a grave" to lowercase thorn

false [if]  \ obsolete, not used

: change-case  ( c1 -- c2 )

	\ Change the case of a char.
	
	\ 2007-12-08

	32 xor
	
	;
	
[then]

: ?change-case  ( c1 f -- c2 )

	\ Change the case of a char if needed.
	\ Only some chars can not be translated this way.
	
	\ c1 = char to change
	\ f = change char?
	\ c2 = char changed if needed
	
	\ 2007-12-15

	32 and xor
	
	;

[undefined] tolower [if]

: uppercase?  ( u -- flag )  caseness@ uppercase-id =  ;

: tolower  ( c1 -- c2 | c1 )

	\ Change a char to lower case if possible.
	
	\ 2007-12-08 First version.
	\ dup uppercase? if  change-case  then
	
	\ 2007-12-08 Second version, without control structures.
	dup uppercase? ?change-case

	;

[then]

[undefined] toupper [if]

: lowercase?  ( u -- flag )  caseness@ lowercase-id =  ;

: toupper  ( c1 -- c2 | c1 )

	\ Change a char to upper case if possible.

	\ 2007-12-08 First version.
	\ dup lowercase? if  change-case  then

	\ 2007-12-15 Second version, without control structures.
	dup lowercase? ?change-case

	;

[then]

[undefined] lower [if]

: lower  ( c-addr u -- )

	\ Change the string to lower case.
	
	\ 2007-12-08

	over + swap  do
		i c@ tolower i c!
	loop

	;

[then]

[undefined] upper [if]

: upper  ( c-addr u -- )

	\ Change the string to upper case.

	\ 2007-12-08
	
	over + swap  do
		i c@ toupper i c!
	loop

	;

[then]

true [if]  \ used for debugging

: .caseness-array  ( -- )

	\ Show the content of the caseness array.
	\ Just for debugging.
	
	\ 2007-12-08 First version.
	\ 2010-01-25 Better printout.

	#chars 0  do
		i caseness@ ?dup  if
			i dup . [char] ( emit emit [char] ) emit [char] : emit emit [char] , emit space 
		then
	loop
	
	;

[then]

.( caseness ok!)

Downloads