caseness

Description of the page content

Forth program to implement some words to change the case of characters and text strings.

Tags:

The objective of this little program I wrote in Forth with Gforth is explained in its source code.

I wrote these Forth words as primitives in my Forth 5mx, and I needed them in high level version to run some programs of mine under other Forth systems.

Source code

CR .( caseness )

\ Copyright (C) 2007 Marcos Cruz (http://programandala.net)
\ License/Licencia/Permesilo: http://programandala.net/license

\ This Forth program implements the following common use words:
\   tolower toupper lower upper

\ My own Forth system, Forth 5mx, has those words as native,
\ but I needed a high level version
\ to use my Forth programs in other Forth systems that don't have them.

\ My approach is not perfect or complete (in fact it's too complex) 
\ but it suited my needs:
\ a lookup array stores the "caseness" of every letter
\ and the change is done simply changing the 6th bit of the char.
\ In ISO-8859-1 and Windows-1252 this works for all letters 
\ but the following pairs:
\   Y/y with diaeresis.
\   S/s with caron.
\   Z/z with caron.
\   OE/oe.
\ A much simpler approach would be to store the pair of every char 
\ (or zero if it doesn't have one); then all chars could be treated.
\ This program works only with single byte encodings.

\ Written in ANS Forth. 

\ 2007-12-08 First version.
\ 2007-12-15 Improved: two IF structures removed.

char u constant uppercase-id  \ caseness id
char l constant lowercase-id  \ caseness id
256 constant #chars

create caseness-array #chars chars allot
caseness-array #chars chars erase

: caseness@  ( c -- u ) caseness-array + c@  ;
: caseness!  ( u c -- )  caseness-array + c!  ;

: all-caseness!  ( u1 u2 u3 -- )

  \ Store a caseness id into several char positions in the caseness array.

  \ u1 = caseness id
  \ u2 = first char
  \ u3 = last char

  \ 2007-12-08

  1+ swap  do
    dup i caseness!
  loop  drop

  ;

\ Init all the caseness:
uppercase-id char A char Z all-caseness!
lowercase-id char a char z all-caseness!
uppercase-id 192 222 all-caseness!  \ from "A grave" to uppercase thorn
lowercase-id 224 254 all-caseness!  \ from "a grave" to lowercase thorn

false [if]  \ obsolete, not used

: change-case  ( c1 -- c2 )

  \ Change the case of a char.

  \ 2007-12-08

  32 xor

  ;

[then]

: ?change-case  ( c1 f -- c2 )

  \ Change the case of a char if needed.
  \ Only some chars can not be translated this way.

  \ c1 = char to change
  \ f = change char?
  \ c2 = char changed if needed

  \ 2007-12-15

  32 and xor

  ;

[undefined] tolower [if]

: uppercase?  ( u -- flag )  caseness@ uppercase-id =  ;

: tolower  ( c1 -- c2 | c1 )

  \ Change a char to lower case if possible.

  \ 2007-12-08 First version.
  \ dup uppercase? if  change-case  then

  \ 2007-12-08 Second version, without control structures.
  dup uppercase? ?change-case

  ;

[then]

[undefined] toupper [if]

: lowercase?  ( u -- flag )  caseness@ lowercase-id =  ;

: toupper  ( c1 -- c2 | c1 )

  \ Change a char to upper case if possible.

  \ 2007-12-08 First version.
  \ dup lowercase? if  change-case  then

  \ 2007-12-15 Second version, without control structures.
  dup lowercase? ?change-case

  ;

[then]

[undefined] lower [if]

: lower  ( c-addr u -- )

  \ Change the string to lower case.

  \ 2007-12-08

  over + swap  do
    i c@ tolower i c!
  loop

  ;

[then]

[undefined] upper [if]

: upper  ( c-addr u -- )

  \ Change the string to upper case.

  \ 2007-12-08

  over + swap  do
    i c@ toupper i c!
  loop

  ;

[then]

true [if]  \ used for debugging

: .caseness-array  ( -- )

  \ Show the content of the caseness array.
  \ Just for debugging.

  \ 2007-12-08 First version.
  \ 2010-01-25 Better printout.

  #chars 0  do
    i caseness@ ?dup  if
      i dup . [char] ( emit emit [char] ) emit [char] : emit emit [char] , emit space
    then
  loop

  ;

[then]

.( caseness ok!)

Downloads