ILF Callable Unicode Engine Functions
Overview
UC_STRCASECMP
Performs a case-insensitive comparison of two UTF-32 values, returning 0 if the strings should be considered equal, a negative number if the first value is "less than" the second value, and a positive number if the first value is "greater than" the second value.
Arguments:
- result - shared uint32, set to comparison result as described above
- left - UTF-32 string, left string value
- right - UTF-32 string, right string value
- status - shared uint32, set to ICU error code if comparison fails
Example:
SET TST WORK UCODE 64 X 16 001 = ABC123
SET TST WORK UCODE 64 X 16 002 = Abc1234
*
PASS --- AI FIELD SHARE? Y
PASS TST WORK UCODE 64 X 16 001 FIELD SHARE? N
PASS TST WORK UCODE 64 X 16 002 FIELD SHARE? N
PASS --- BI FIELD SHARE? Y
CALL .UC_STRCASECMP RESIDENT? N END? N FAIL 0
*
* --- AI contains a negative number
*
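The return convention can be illustrated outside ILF. The following is a minimal Python sketch, not the engine's implementation: `uc_strcasecmp` is a hypothetical stand-in, and it assumes that Python's `str.casefold()` approximates the case folding ICU applies.

```python
def uc_strcasecmp(left: str, right: str) -> int:
    """Hypothetical stand-in for UC_STRCASECMP.

    Returns a negative number, 0, or a positive number, mirroring the
    convention described above. Case folding via str.casefold() is an
    assumption; the engine delegates to ICU.
    """
    folded_left = left.casefold()
    folded_right = right.casefold()
    # (a > b) - (a < b) yields -1, 0, or 1
    return (folded_left > folded_right) - (folded_left < folded_right)


# Same values as the ILF example: the left string is a prefix of the right
print(uc_strcasecmp("ABC123", "Abc1234"))  # -1
```

As in the ILF example, the left value sorts first because it is a shorter prefix of the right value once case is ignored.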
UC_LEN
Returns the number of characters (not bytes) in the given string, excluding trailing spaces.
Arguments:
- string - UTF-32 string, value to measure
- length - length of string (performance enhancement to reduce time to determine string length)
NOTE: the length is returned in --- RETURN CODE; the T/F flag returned by this call is meaningless.
Example:
SET TST WORK UCODE 4096 = ABC123XYZ
PASS TST WORK UCODE 4096 FIELD SHARE? N
PASS --- XI FIELD SHARE? N
CALL .UC_LEN RESIDENT? N END? N FAIL 0
*
* --- RETURN CODE is set to 9
*
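To make "characters, not bytes, excluding trailing spaces" concrete, here is a small Python sketch. `uc_len` is a hypothetical helper, and the space-padded string only imitates a fixed-width UCODE field.

```python
def uc_len(field: str) -> int:
    """Hypothetical stand-in for UC_LEN: count characters, ignoring trailing spaces."""
    return len(field.rstrip(" "))


# Imitate a 4096-character field that holds a 9-character value
padded = "ABC123XYZ".ljust(4096)
print(uc_len(padded))  # 9

# Characters versus bytes: four characters occupy 16 bytes in UTF-32
print(uc_len("€uro"), len("€uro".encode("utf-32-le")))  # 4 16
```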
UC_UCASE
Converts given UTF-32 string to upper-case according to the specified locale.
Arguments:
- target - shared UTF-32 string, upper-case version is written here
- source - UTF-32 string, string to convert to upper-case
- locale - RAW string, locale name
- status - shared uint32, set to ICU error code if conversion fails
- length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.
Example:
SET TST WORK UCODE 32 = abc123XYZmnop
PASS TST WORK UCODE 4096 FIELD SHARE? Y
PASS TST WORK UCODE 32 FIELD SHARE? N
PASS en FIELD SHARE? N
PASS --- AI FIELD SHARE? Y
PASS --- LI FIELD SHARE? N
CALL .UC_UCASE RESIDENT? N END? N FAIL 0
*
* WORK UCODE 4096 contains ABC123XYZMNOP
*
UC_LCASE
Converts given UTF-32 string to lower-case according to the specified locale.
Arguments:
- target - shared UTF-32 string, lower-case version is written here
- source - UTF-32 string, string to convert to lower-case
- locale - RAW string, locale name
- status - shared uint32, set to ICU error code if conversion fails
- length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.
Example:
SET TST WORK UCODE 32 = abc123XYZmnop
PASS TST WORK UCODE 4096 FIELD SHARE? Y
PASS TST WORK UCODE 32 FIELD SHARE? N
PASS en FIELD SHARE? N
PASS --- AI FIELD SHARE? Y
PASS --- LI FIELD SHARE? N
CALL .UC_LCASE RESIDENT? N END? N FAIL 0
*
* WORK UCODE 4096 contains abc123xyzmnop
*
UC_TCASE
Converts given UTF-32 string to title-case according to the specified locale.
Arguments:
- target - shared UTF-32 string, title-case version is written here
- source - UTF-32 string, string to convert to title-case
- locale - RAW string, locale name
- status - shared uint32, set to ICU error code if conversion fails
- length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.
Example:
SET TST WORK UCODE 32 = This is a test
PASS TST WORK UCODE 4096 FIELD SHARE? Y
PASS TST WORK UCODE 32 FIELD SHARE? N
PASS en FIELD SHARE? N
PASS --- AI FIELD SHARE? Y
PASS --- LI FIELD SHARE? N
CALL .UC_TCASE RESIDENT? N END? N FAIL 0
*
* WORK UCODE 4096 contains This Is A Test
*
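The expected results of the three case-conversion examples (UC_UCASE, UC_LCASE, UC_TCASE) can be cross-checked with a short Python sketch. This is only an approximation: Python's built-in string methods ignore the locale argument, so locale-specific rules that ICU applies (the Turkish dotted and dotless i, for example) are not reproduced here.

```python
# Inputs taken from the ILF examples above
source = "abc123XYZmnop"
sentence = "This is a test"

upper = source.upper()    # analogous to UC_UCASE with locale "en"
lower = source.lower()    # analogous to UC_LCASE with locale "en"
title = sentence.title()  # analogous to UC_TCASE with locale "en"

print(upper)  # ABC123XYZMNOP
print(lower)  # abc123xyzmnop
print(title)  # This Is A Test
```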
UC_FROM_UCODE
Converts the given string from UTF-32 encoding to the specified encoding.
Arguments:
- target - shared RAW string, transcoded value is written here
- source - UTF-32 string, specifies the UTF-32 string to transcode
- encoding - RAW string, desired encoding
- action - RAW string, action to take on conversion error (see below)
- option - UTF-32 string, escape type or substitution string (see below)
- status - shared uint32, set to ICU error code if conversion fails
- length - shared uint32, returns the length of the target field.
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a UTF-32 character cannot be transcoded into the specified encoding). "action" can be one of the following:
- STOP - conversion stops at the first error
- SKIP - conversion skips the offending character
- SUBS - UC_FROM_UCODE() substitutes the string specified by "option" in place of the offending character (note: "option" must specify a UTF-32 string that can be converted into the specified encoding)
- ESCAPE - the offending character is escaped according to the (first character of the) value of "option", as shown below:
  - C - specifies C-style escaping (\uXXXX or \UXXXXXXXX)
  - STYLE - specifies CSS2 escaping (\XXXXXX )
  - JAVA - specifies Java escaping (\uXXXX)
  - UNICODE - specifies Unicode escaping {U+XXXXX}
  - DECIMAL - specifies XML decimal escaping (&#DDDD;)
  - X - specifies XML hex escaping (&#xXXXX;)
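The escape styles are easier to compare side by side. The Python sketch below formats one code point in each style, selecting the style from the first character of "option" as described above. `escape_code_point` is a hypothetical helper. The digit widths, the surrogate-pair form for Java, and the XML forms (standard numeric character references) are assumptions, so verify the exact output against the engine.

```python
def escape_code_point(cp: int, option: str) -> str:
    """Hypothetical illustration of the ESCAPE styles for one code point."""
    style = option[:1]  # only the first character of "option" matters
    if style == "C":
        # C style: \uXXXX inside the BMP, \UXXXXXXXX above it
        return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"
    if style == "S":
        # CSS2 style: backslash, hex digits, trailing space (width assumed)
        return f"\\{cp:06X} "
    if style == "J":
        # Java style: \uXXXX; a surrogate pair above the BMP (assumed)
        if cp <= 0xFFFF:
            return f"\\u{cp:04X}"
        offset = cp - 0x10000
        high = 0xD800 + (offset >> 10)
        low = 0xDC00 + (offset & 0x3FF)
        return f"\\u{high:04X}\\u{low:04X}"
    if style == "U":
        return f"{{U+{cp:04X}}}"
    if style == "D":
        return f"&#{cp};"     # XML decimal character reference
    if style == "X":
        return f"&#x{cp:X};"  # XML hexadecimal character reference
    raise ValueError(f"unknown escape style: {option!r}")


for option in ("C", "STYLE", "JAVA", "UNICODE", "DECIMAL", "X"):
    print(option, escape_code_point(0x20AC, option))
```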
Example:
SET TST WORK UCODE 32 = This is a test
*
PASS --- TEMP 32 FIELD SHARE? Y
PASS TST WORK UCODE 32 FIELD SHARE? Y
PASS UTF-8 FIELD SHARE? N
PASS ESCAPE FIELD SHARE? N
PASS C FIELD SHARE? N
PASS --- SI FIELD SHARE? Y
PASS --- LI FIELD SHARE? Y
CALL .UC_FROM_UCODE RESIDENT? N END? N FAIL 0
*
* --- TEMP 32 contains "This is a test" in UTF-8 encoding
*
Note that your target field should be a RAW alpha 4 times the size (in characters) of your UTF-32 field.
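The factor of 4 can be checked for UTF-8, where a single code point occupies at most four bytes. The Python sketch below uses a hypothetical helper, `raw_bytes_for_from_ucode`, and assumes a UTF-8 target. Other encodings and the ESCAPE action may have different space requirements.

```python
def raw_bytes_for_from_ucode(ucode_chars: int) -> int:
    """Hypothetical helper: RAW field size per the sizing note (4 bytes per character)."""
    return 4 * ucode_chars


# U+1F600 needs four bytes in UTF-8, the worst case for that encoding
worst_case = "\U0001F600" * 32
encoded = worst_case.encode("utf-8")

print(len(encoded), raw_bytes_for_from_ucode(32))  # 128 128
```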
UC_TO_UCODE
Converts a string from the specified encoding to UTF-32.
Arguments:
- target - shared UTF-32 string, transcoded value is written here
- source - RAW string, specifies the string to transcode
- encoding - RAW string, encoding of the source string
- action - RAW string, action to take on conversion error (see below)
- option - UTF-32 string, escape type or substitution string (see below)
- status - shared uint32, set to ICU error code if conversion fails
- length - shared uint32, returns the length of the target field.
- source length - the length of the source text (performance boost)
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a codepage character cannot be transcoded into UTF-32). "action" can be one of the following:
- STOP - conversion stops at the first error
- SKIP - conversion skips the offending character
- SUBS - UC_TO_UCODE() substitutes the string specified by "option" in place of the offending character
- ESCAPE - the offending character is escaped according to the (first character of the) value of "option", as shown below:
  - C - specifies C-style escaping (\xXXXX)
  - STYLE - specifies CSS2 escaping (\XXXXXX )
  - JAVA - specifies Java escaping (\uXXXX)
  - UNICODE - specifies Unicode escaping (U+XXXXX)
  - DECIMAL - specifies XML decimal escaping (&#DDDD;)
  - X - specifies XML hex escaping (&#xXXXX;)
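Python's codec error handlers offer a rough analogy for the four actions when decoding. In the sketch below, `ignore` plays the role of SKIP, a custom handler registered under the made-up name `subs_demo` plays the role of SUBS, `backslashreplace` resembles ESCAPE with the C option, and the default strict mode resembles STOP. This is an analogy only; the engine's behavior is determined by ICU.

```python
import codecs


def subs_with(substitute: str):
    """Build a decode error handler that inserts a fixed substitution string."""
    def handler(exc):
        if isinstance(exc, UnicodeDecodeError):
            # Replace the offending bytes and resume after them
            return substitute, exc.end
        raise exc
    return handler


# "subs_demo" is a hypothetical handler name used only in this sketch
codecs.register_error("subs_demo", subs_with("?"))

raw = b"caf\xe9"  # Latin-1 bytes; 0xE9 is not valid UTF-8 in this position

skipped = raw.decode("utf-8", errors="ignore")            # like SKIP
substituted = raw.decode("utf-8", errors="subs_demo")     # like SUBS
escaped = raw.decode("utf-8", errors="backslashreplace")  # like ESCAPE + C

try:
    raw.decode("utf-8")  # strict mode, like STOP
    stopped = False
except UnicodeDecodeError:
    stopped = True

print(skipped, substituted, escaped, stopped)
```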
Example:
PASS --- TEMP 32K FIELD SHARE? Y
PASS TST WORK RAW 32K FIELD SHARE? Y
PASS UTF-16LE FIELD SHARE? N
PASS ESCAPE FIELD SHARE? N
PASS DECIMAL FIELD SHARE? N
PASS --- SI FIELD SHARE? Y
PASS --- LI FIELD SHARE? Y
PASS --- II FIELD SHARE? N
CALL .UC_TO_UCODE RESIDENT? N END? N FAIL 0
*
* --- TEMP 32K contains UTF-32 version of TST WORK RAW 32K
* (which was encoded in UTF-16LE form)
Note that your target field should be a UTF-32 field with the same number of characters as your source field. You can use a RAW alpha, but it must be 4 times the number of characters in the source field, plus 1, in bytes. For example, to convert TEMP 8K (8192 characters), your target RAW alpha field must be (4*8192)+1 = 32769 bytes or larger.
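The sizing rule for a RAW alpha target reduces to one line of arithmetic. A minimal Python sketch, with `raw_target_bytes` as a hypothetical helper name:

```python
def raw_target_bytes(source_chars: int) -> int:
    """Minimum RAW alpha size in bytes: 4 bytes per source character, plus 1."""
    return 4 * source_chars + 1


print(raw_target_bytes(8192))  # 32769
```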
UC_ENUMERATE_CNV
THIS IS WRONG, see the 0-app routine .ENV GET ENCODINGS for current usage.
Returns the name of an encoding that can be specified when calling UC_TO_UCODE or UC_FROM_UCODE. To obtain the name of each encoding supported by the ICU library, initialize a uint32 (--- AI, for example) to 0, call UC_ENUMERATE_CNV, save the name returned in the target argument, increment the uint32, and repeat until the CALL statement sets the next T/F flag to F.
Arguments:
- target - shared RAW string, the name of an encoding is written here
- iterator - uint32, a number indicating which encoding to enumerate
Example:
LABEL :GET NEXT
PASS --- WORK RAW 132 FIELD SHARE? Y
PASS --- AI FIELD SHARE? N
CALL .UC_ENUMERATE_CNV RESIDENT? N END? N FAIL 0
*
* --- WORK RAW 132 contains the name of an encoding
*
COMPUTE --- AI + 1
T GOTO :GET NEXT
UC_CHAR_NAME
Returns the name of the first character in a UTF-32 string.
Arguments:
- target - shared RAW string, the name of the character is written here
- source - UTF-32 string, specifies the character of interest
- status - shared uint32, set to ICU error code if an error occurs
Example:
SET --- CI = 8364
PASS --- WORK RAW 30 FIELD SHARE? Y
PASS --- CI FIELD SHARE? Y
PASS --- AI FIELD SHARE? Y
CALL .UC_CHAR_NAME RESIDENT? N END? N FAIL 0
*
* --- WORK RAW 30 contains "EURO SIGN"
*
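The expected name can be cross-checked with Python's `unicodedata` module, which draws on the same Unicode character names. `uc_char_name` is a hypothetical stand-in, not the engine function.

```python
import unicodedata


def uc_char_name(source: str) -> str:
    """Hypothetical stand-in for UC_CHAR_NAME: name of the first character only."""
    return unicodedata.name(source[0])


# 8364 is U+20AC, the value placed in --- CI in the ILF example
print(uc_char_name(chr(8364)))  # EURO SIGN
```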
UC_CHAR_BY_NAME
Returns the character whose name is specified.
Arguments:
- target - shared UTF-32 string, the requested character is written into the first character position (the rest of string is unchanged)
- name - name of the character of interest
- status - shared uint32, set to ICU error code if an error occurs.
Example:
PASS TST WORK UCODE 1 FIELD SHARE? Y
PASS COMMERCIAL AT FIELD SHARE? N
PASS --- AI FIELD SHARE? Y
CALL .UC_CHAR_BY_NAME RESIDENT? N END? N FAIL 0
*
* WORK UCODE 1 contains "@"
*
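The reverse lookup, including the detail that only the first character position of the target is overwritten, can be sketched in Python as follows. `uc_char_by_name` is a hypothetical stand-in. Python signals an unknown name with `KeyError`, whereas the engine reports failure through the status argument.

```python
import unicodedata


def uc_char_by_name(target: str, name: str) -> str:
    """Hypothetical stand-in for UC_CHAR_BY_NAME.

    Writes the named character into the first position of target and
    leaves the rest of the string unchanged.
    """
    return unicodedata.lookup(name) + target[1:]


print(uc_char_by_name("?bc", "COMMERCIAL AT"))  # @bc
```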
UC_ERRORCODE
Returns a programmer-friendly interpretation of an ICU error code (note: this does not return a message suitable for an end-user).
Arguments:
- target - shared RAW string, set to the text form of the given error code (U_ZERO_ERROR, U_BUFFER_OVERFLOW_ERROR, ...)
- status - uint32, a numeric error code returned by some other uc_xxx function
Example:
PASS TST WORK UCODE 1 FIELD SHARE? Y
PASS SILLY NAME FIELD SHARE? N
PASS --- AI FIELD SHARE? Y
CALL .UC_CHAR_BY_NAME RESIDENT? N END? N FAIL 0
F PASS --- WORK RAW 32 FIELD SHARE? Y
F PASS --- AI FIELD SHARE? N
F CALL .UC_ERRORCODE RESIDENT? N END? N FAIL 0
*
* --- WORK RAW 32 contains "U_ILLEGAL_CHAR_FOUND"
*
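Conceptually, UC_ERRORCODE is a lookup from a numeric status to its symbolic name. The Python sketch below covers only a handful of codes. The numeric values are taken from ICU's `UErrorCode` enumeration and should be verified against the ICU version in use. The `UNKNOWN(...)` fallback is an invention of this sketch, not the engine's behavior.

```python
# Partial table; values assumed from ICU's UErrorCode enumeration
ICU_ERROR_NAMES = {
    0: "U_ZERO_ERROR",
    1: "U_ILLEGAL_ARGUMENT_ERROR",
    10: "U_INVALID_CHAR_FOUND",
    11: "U_TRUNCATED_CHAR_FOUND",
    12: "U_ILLEGAL_CHAR_FOUND",
    15: "U_BUFFER_OVERFLOW_ERROR",
}


def uc_errorcode(status: int) -> str:
    """Hypothetical stand-in for UC_ERRORCODE: numeric status to text form."""
    return ICU_ERROR_NAMES.get(status, f"UNKNOWN({status})")


print(uc_errorcode(12))  # U_ILLEGAL_CHAR_FOUND
```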
Test Plan
Do the above callable functions produce the results stated?
- Start by using the example code and checking the results
- Then try different values and/or data types to see if the correct result is produced in each case.
Bugs