ILF Callable Unicode Engine Functions

- Overview
- Test Plan
- Bugs

Overview

UC_STRCASECMP

Performs a case-insensitive comparison of two UTF-32 strings, returning 0 if the strings should be considered equal, a negative number if the first string is "less than" the second, and a positive number if the first string is "greater than" the second.

Arguments:

result - shared uint32, set to comparison result as described above
left - UTF-32 string, left string value
right - UTF-32 string, right string value
status - shared uint32, set to ICU error code if comparison fails

Example:

      SET     TST WORK UCODE 64 X 16    001 =     ABC123
      SET     TST WORK UCODE 64 X 16    002 =     Abc1234
      *
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    TST WORK UCODE 64 X 16    001 FIELD           SHARE? N
      PASS    TST WORK UCODE 64 X 16    002 FIELD           SHARE? N
      PASS    --- BI                        FIELD           SHARE? Y
      CALL        .UC_STRCASECMP            RESIDENT? N END? N FAIL 0
      *
      *       --- AI contains a negative number
      *
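
The sign convention described above can be modeled outside ILF. The Python sketch below is only an illustration: `strcasecmp_model` is a hypothetical helper, and it relies on Python's `str.casefold()` and code point ordering rather than the ICU comparison the engine performs, so results may differ for some characters.

```python
def strcasecmp_model(left: str, right: str) -> int:
    """Return a negative, zero, or positive int, mirroring the documented sign convention."""
    # Fold case first so that "ABC" and "abc" compare as equal.
    a, b = left.casefold(), right.casefold()
    # (a > b) - (a < b) yields 1, 0, or -1 using Python's code point ordering.
    return (a > b) - (a < b)


# Same values as the ILF example: the left string is a case-insensitive
# prefix of the right string, so it sorts first.
print(strcasecmp_model("ABC123", "Abc1234"))
```

A shorter string that is a prefix of a longer one compares as "less than", which is why the ILF example reports a negative number.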

UC_LEN

Returns the number of characters (not bytes) in the given string, not counting trailing spaces.

Arguments:

string - UTF-32 string, value to measure
length - length of string (performance enhancement to reduce time to determine string length)

NOTE: the measured length is returned in --- RETURN CODE; the T/F flag returned by this call is meaningless.

Example:

      SET     TST WORK UCODE 4096           =     ABC123XYZ
      PASS    TST WORK UCODE 4096           FIELD           SHARE? N
      PASS    --- XI                        FIELD           SHARE? N
      CALL        .UC_LEN                   RESIDENT? N END? N FAIL 0
      *
      *       --- RETURN CODE is set to 9
      *
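
The counting rule can be sketched in Python. `uc_len_model` is a hypothetical name, and the sketch assumes the field is padded on the right with ordinary spaces (U+0020), which the source does not state explicitly.

```python
def uc_len_model(field: str) -> int:
    """Count characters (code points), ignoring trailing spaces only."""
    # len() on a Python str counts code points, which corresponds to
    # "characters, not bytes" for UTF-32 data.
    return len(field.rstrip(" "))


# Emulate a 4096-character field holding the value from the ILF example.
padded = "ABC123XYZ".ljust(4096)
print(uc_len_model(padded))
```

Leading and embedded spaces still count in this model; only the trailing padding is dropped.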

UC_UCASE

Converts given UTF-32 string to upper-case according to the specified locale.

Arguments:

target - shared UTF-32 string, upper-case version is written here
source - UTF-32 string, string to convert to upper-case
locale - RAW string, locale name
status - shared uint32, set to ICU error code if conversion fails
length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_UCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains ABC123XYZMNOP
      *

UC_LCASE

Converts given UTF-32 string to lower-case according to the specified locale.

Arguments:

target - shared UTF-32 string, lower-case version is written here
source - UTF-32 string, string to convert to lower-case
locale - RAW string, locale name
status - shared uint32, set to ICU error code if conversion fails
length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_LCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains abc123xyzmnop
      *

UC_TCASE

Converts given UTF-32 string to title-case according to the specified locale.

Arguments:

target - shared UTF-32 string, title-case version is written here
source - UTF-32 string, string to convert to title-case
locale - RAW string, locale name
status - shared uint32, set to ICU error code if conversion fails
length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     This is a test
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_TCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains This Is A Test
      *
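
The expected results of the three case-conversion examples (UC_UCASE, UC_LCASE, UC_TCASE) can be cross-checked with a few lines of Python. This is a rough model only: Python's string methods ignore locale, whereas the ILF functions take a locale argument, so the two can disagree for locale-sensitive letters. The variable names are illustrative.

```python
source = "abc123XYZmnop"

# Upper- and lower-case versions of the value used in the ILF examples.
upper_result = source.upper()
lower_result = source.lower()

# Title case capitalizes the first letter of each word.
title_result = "This is a test".title()

print(upper_result)
print(lower_result)
print(title_result)
```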

UC_FROM_UCODE

Converts the given string from UTF-32 encoding to the specified encoding.

Arguments:

target - shared RAW string, transcoded value is written here
source - UTF-32 string, specifies the UTF-32 string to transcode
encoding - RAW string, desired encoding
action - RAW string, action to take on conversion error (see below)
option - UTF-32 string, escape type or substitution string (see below)
status - shared uint32, set to ICU error code if conversion fails
length - shared uint32, returns the length of the target field.

You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a UTF-32 character cannot be transcoded into the specified encoding). "action" can be one of the following:

STOP - conversion stops on first error
SKIP - conversion skips the offending character
SUBS - UC_FROM_UCODE() substitutes the string specified by "option" in place of the offending character (note: "option" must specify a UTF-32 string which can be converted into the specified encoding)
ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
- C - specifies C-style escaping (\uXXXX or \UXXXXXXXX)
- STYLE - specifies CSS2 escaping (\XXXXXX )
- JAVA - specifies Java escaping (\uXXXX)
- UNICODE - specifies Unicode escaping {U+XXXXX}
- DECIMAL - specifies XML decimal escaping (&#DDDD;)
- X - specifies XML hex escaping (&#xXXXX;)
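
To make the escape styles concrete, the Python sketch below produces one escape per style for a single code point. `escape_model` is a hypothetical helper, not part of the engine. The digit counts, the upper-case hex digits, the XML forms (`&#DDDD;` and `&#xXXXX;`), and the use of a surrogate pair for Java escapes above U+FFFF are assumptions based on common ICU behavior, not on this page.

```python
def escape_model(code_point: int, option: str) -> str:
    """Return the escape text for one code point, keyed on the first letter of option."""
    style = option[:1].upper()  # only the first character of "option" matters
    if style == "C":
        # \uXXXX inside the BMP, \UXXXXXXXX above it
        if code_point <= 0xFFFF:
            return f"\\u{code_point:04X}"
        return f"\\U{code_point:08X}"
    if style == "S":
        # CSS2 escape: backslash, hex digits, then a terminating space
        return f"\\{code_point:X} "
    if style == "J":
        # Assumption: Java escapes are emitted per UTF-16 code unit
        units = chr(code_point).encode("utf-16-be")
        pairs = [units[i:i + 2] for i in range(0, len(units), 2)]
        return "".join(f"\\u{int.from_bytes(p, 'big'):04X}" for p in pairs)
    if style == "U":
        return f"{{U+{code_point:04X}}}"
    if style == "D":
        return f"&#{code_point};"
    if style == "X":
        return f"&#x{code_point:X};"
    raise ValueError(f"unknown escape style: {option!r}")


# Show every style for the euro sign (U+20AC).
for name in ("C", "STYLE", "JAVA", "UNICODE", "DECIMAL", "X"):
    print(name, escape_model(0x20AC, name))
```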

Example:

      SET     TST WORK UCODE 32             =     This is a test
      *
      PASS    --- TEMP 32                   FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? Y
      PASS        UTF-8                     FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        C                         FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      CALL        .UC_FROM_UCODE            RESIDENT? N END? N FAIL 0
      *
      *       --- TEMP 32 contains "This is a test" in UTF-8 encoding
      *

Note that your target field should be a RAW alpha 4 times the size (in characters) of your UTF-32 field.
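
The factor of 4 matches the worst case for UTF-8, where a single character never needs more than 4 bytes. The Python sketch below checks that for a few sample code points and applies the rule to the 32-character field from the example. `raw_target_bytes` is a hypothetical helper, and the sketch covers UTF-8 output only, not other encodings or ESCAPE output.

```python
def raw_target_bytes(utf32_chars: int) -> int:
    """Size of the RAW target, in bytes, under the 4x rule from the note above."""
    return 4 * utf32_chars


# Widest UTF-8 encoding among a few sample code points
# (ASCII, euro sign, an emoji, the last valid code point).
samples = (0x41, 0x20AC, 0x1F600, 0x10FFFF)
worst_case = max(len(chr(cp).encode("utf-8")) for cp in samples)

encoded = "This is a test".encode("utf-8")

print(worst_case)
print(len(encoded), "bytes used of", raw_target_bytes(32))
```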

UC_TO_UCODE

Converts a string from the specified encoding to UTF-32.

Arguments:

target - shared UTF-32 string, transcoded value is written here
source - RAW string, specifies the string to transcode
encoding - RAW string, encoding of the source string
action - RAW string, action to take on conversion error (see below)
option - UTF-32 string, escape type or substitution string (see below)
status - shared uint32, set to ICU error code if conversion fails
length - shared uint32, returns the length of the target field.
source length - the length of the source text (performance boost)

You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a codepage character cannot be transcoded into UTF-32). "action" can be one of the following:

STOP - conversion stops on first error
SKIP - conversion skips the offending character
SUBS - UC_TO_UCODE() substitutes the string specified by "option" in place of the offending character
ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
- C - specifies C-style escaping (\xXXXX)
- STYLE - specifies CSS2 escaping (\XXXXXX )
- JAVA - specifies Java escaping (\uXXXX)
- UNICODE - specifies Unicode escaping (U+XXXXX)
- DECIMAL - specifies XML decimal escaping (&#DDDD;)
- X - specifies XML hex escaping (&#xXXXX;)
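
The four actions have rough counterparts in Python's decoding error handlers, which gives a way to experiment with their effect. In the sketch below, `decode_with_action` and the handler name `uc_subs_model` are hypothetical. STOP is modeled by an exception rather than a status code, and ESCAPE uses Python's only built-in decoding escape style (`\xNN`), so this is an approximation of the behavior described above, not the engine's implementation.

```python
import codecs


def decode_with_action(data: bytes, encoding: str, action: str, option: str = "") -> str:
    """Decode data, handling undecodable bytes according to action."""
    if action == "STOP":
        # Raises UnicodeDecodeError at the first bad byte sequence.
        return data.decode(encoding, errors="strict")
    if action == "SKIP":
        # Drops the offending bytes and continues.
        return data.decode(encoding, errors="ignore")
    if action == "SUBS":
        # Insert the caller's substitution string and resume after the bad bytes.
        codecs.register_error("uc_subs_model", lambda exc: (option, exc.end))
        return data.decode(encoding, errors="uc_subs_model")
    if action == "ESCAPE":
        # Replace each offending byte with a \xNN escape.
        return data.decode(encoding, errors="backslashreplace")
    raise ValueError(f"unknown action: {action!r}")


bad = b"abc\xffdef"  # the byte 0xFF never appears in valid UTF-8

print(decode_with_action(bad, "utf-8", "SKIP"))
print(decode_with_action(bad, "utf-8", "SUBS", "?"))
print(decode_with_action(bad, "utf-8", "ESCAPE"))
```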

Example:

      PASS    --- TEMP 32K                      FIELD           SHARE? Y
      PASS    TST WORK RAW 32K                  FIELD           SHARE? Y
      PASS        UTF-16LE                  FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        DECIMAL                   FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      PASS    --- II                        FIELD           SHARE? N
      CALL        .UC_TO_UCODE              RESIDENT? N END? N FAIL 0
      *
      *      TEMP 32K contains UTF-32 version of TST WORK RAW 32K 
      *      (which was encoded in UTF-16LE form)

Note that your target field should be a UTF-32 field with the same number of characters as your source field. You can use a RAW alpha, but it must be 4 times the number of characters in the source field, plus 1, in bytes. For example, to convert TEMP 8k (8192 characters), your target RAW alpha field must be (4*8192)+1 = 32769 bytes or larger.
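
The sizing rule for a RAW alpha target is simple arithmetic, sketched here in Python. `raw_alpha_bytes_for_utf32` is a hypothetical helper name.

```python
def raw_alpha_bytes_for_utf32(source_chars: int) -> int:
    """Minimum RAW alpha size in bytes for a UTF-32 result, per the note above."""
    # 4 bytes per UTF-32 character, plus the 1 extra byte the note calls for.
    return 4 * source_chars + 1


# TEMP 8k holds 8192 characters.
print(raw_alpha_bytes_for_utf32(8192))
```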

UC_ENUMERATE_CNV

NOTE: the description below is wrong; see the 0-app routine .ENV GET ENCODINGS for current usage.

Returns the name of an encoding that can be specified when calling UC_TO_UCODE or UC_FROM_UCODE. To obtain the name of each encoding supported by the ICU library, initialize a uint32 (--- AI, for example) to 0, call UC_ENUMERATE_CNV, save the name returned in the target argument, increment the uint32, and repeat until the CALL statement sets the next T/F flag to F.

Arguments:

target - shared RAW string, the name of an encoding is written here
iterator - uint32, a number indicating which encoding to enumerate

Example:

      LABEL   :GET NEXT
      PASS    --- WORK RAW 132              FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? N
      CALL        .UC_ENUMERATE_CNV         RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 132 contains the name of an encoding
      *
      COMPUTE --- AI                        +     1
T     GOTO    :GET NEXT

UC_CHAR_NAME

Returns the name of the first character in a UTF-32 string.

Arguments:

target - shared RAW string, the name of the character is written here
source - UTF-32 string, specifies the character of interest
status - shared uint32, set to ICU error code if an error occurs

Example:

      SET     --- CI                        =     8364
      PASS    --- WORK RAW 30               FIELD           SHARE? Y
      PASS    --- CI                        FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_NAME             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 30 contains "EURO SIGN"
      *

UC_CHAR_BY_NAME

Returns the character whose name is specified.

Arguments:

target - shared UTF-32 string, the requested character is written into the first character position (the rest of string is unchanged)
name - name of the character of interest
status - shared uint32, set to ICU error code if an error occurs.

Example:

      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        COMMERCIAL AT             FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 1 contains "@"
      *
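
Character names come from the Unicode Character Database, which Python's standard `unicodedata` module also exposes, so the expected values in the UC_CHAR_NAME and UC_CHAR_BY_NAME examples can be cross-checked there. The sketch assumes the engine and Python agree on these names. Python signals an unknown name with `KeyError`, whereas the ILF functions report an ICU error code through the status argument.

```python
import unicodedata

# Code point 8364 (U+20AC), as set in --- CI in the UC_CHAR_NAME example.
euro_name = unicodedata.name(chr(8364))

# Reverse lookup, as in the UC_CHAR_BY_NAME example.
at_char = unicodedata.lookup("COMMERCIAL AT")

# A name that does not exist raises KeyError.
try:
    unicodedata.lookup("SILLY NAME")
    unknown_name_rejected = False
except KeyError:
    unknown_name_rejected = True

print(euro_name)
print(at_char)
print(unknown_name_rejected)
```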

UC_ERRORCODE

Returns a programmer-friendly interpretation of an ICU error code (note: this does not return a message suitable for an end-user).

Arguments:

target - shared RAW string, set to the text form of the given error code (U_ZERO_ERROR, U_BUFFER_OVERFLOW, ...)
status - uint32, a numeric error code returned by some other uc_xxx function

Example:

      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        SILLY NAME                FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
F     PASS    --- WORK RAW 32               FIELD           SHARE? Y
F     PASS    --- AI                        FIELD           SHARE? N
F     CALL        .UC_ERRORCODE             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 32 contains "U_ILLEGAL_CHAR_FOUND"
      *

Test Plan

Do the above callable functions produce the results stated?

Start by running the example code and checking the results.
Then try different values and/or data types to see whether the correct result is produced in each case.

Bugs

Topic revision: r10 - 2018-01-03 - JeanNeron