---+ ILF Callable Unicode Engine Functions

%TOC%

---++ Overview

---+++ *UC_STRCASECMP*
<br />Performs a case-insensitive comparison of two UTF-32 strings. Returns 0 if the strings should be considered equal, a negative number if the first string is "less than" the second, and a positive number if the first string is "greater than" the second.

Arguments:
   * result - shared uint32, set to the comparison result as described above
   * left - UTF-32 string, left string value
   * right - UTF-32 string, right string value
   * status - shared uint32, set to ICU error code if comparison fails

Example:
<pre>
  SET      TST WORK UCODE 64 X 16 001    =  ABC123
  SET      TST WORK UCODE 64 X 16 002    =  Abc1234
  *
  PASS     --- AI                        FIELD  SHARE? Y
  PASS     TST WORK UCODE 64 X 16 001    FIELD  SHARE? N
  PASS     TST WORK UCODE 64 X 16 002    FIELD  SHARE? N
  PASS     --- BI                        FIELD  SHARE? Y
  CALL     .UC_STRCASECMP                RESIDENT? N  END? N  FAIL 0
  *
  *        --- AI contains a negative number
  *
</pre>

---+++ *UC_LEN*
<br />Returns the number of characters (not bytes) in the given string, excluding trailing spaces.

Arguments:
   * string - UTF-32 string, value to measure
   * length - length of string (performance enhancement to reduce the time needed to determine the string length)

NOTE: the length is returned in --- RETURN CODE; the T/F flag returned by this call is meaningless.

Example:
<pre>
  SET      TST WORK UCODE 4096           =  ABC123XYZ
  PASS     TST WORK UCODE 4096           FIELD  SHARE? N
  PASS     --- XI                        FIELD  SHARE? N
  CALL     .UC_LEN                       RESIDENT? N  END? N  FAIL 0
  *
  *        --- RETURN CODE is set to 9
  *
</pre>

---+++ *UC_UCASE*
<br />Converts the given UTF-32 string to upper-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, upper-case version is written here
   * source - UTF-32 string, string to convert to upper-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:
<pre>
  SET      TST WORK UCODE 32             =  abc123XYZmnop
  PASS     TST WORK UCODE 4096           FIELD  SHARE? Y
  PASS     TST WORK UCODE 32             FIELD  SHARE? N
  PASS     en                            FIELD  SHARE? N
  PASS     --- AI                        FIELD  SHARE? Y
  PASS     --- LI                        FIELD  SHARE? N
  CALL     .UC_UCASE                     RESIDENT? N  END? N  FAIL 0
  *
  *        WORK UCODE 4096 contains ABC123XYZMNOP
  *
</pre>

---+++ *UC_LCASE*
<br />Converts the given UTF-32 string to lower-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, lower-case version is written here
   * source - UTF-32 string, string to convert to lower-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:
<pre>
  SET      TST WORK UCODE 32             =  abc123XYZmnop
  PASS     TST WORK UCODE 4096           FIELD  SHARE? Y
  PASS     TST WORK UCODE 32             FIELD  SHARE? N
  PASS     en                            FIELD  SHARE? N
  PASS     --- AI                        FIELD  SHARE? Y
  PASS     --- LI                        FIELD  SHARE? N
  CALL     .UC_LCASE                     RESIDENT? N  END? N  FAIL 0
  *
  *        WORK UCODE 4096 contains abc123xyzmnop
  *
</pre>

---+++ *UC_TCASE*
<br />Converts the given UTF-32 string to title-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, title-case version is written here
   * source - UTF-32 string, string to convert to title-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)

NOTE: you may safely pass the same field for both target and source.

Example:
<pre>
  SET      TST WORK UCODE 32             =  This is a test
  PASS     TST WORK UCODE 4096           FIELD  SHARE? Y
  PASS     TST WORK UCODE 32             FIELD  SHARE? N
  PASS     en                            FIELD  SHARE? N
  PASS     --- AI                        FIELD  SHARE? Y
  PASS     --- LI                        FIELD  SHARE? N
  CALL     .UC_TCASE                     RESIDENT? N  END? N  FAIL 0
  *
  *        WORK UCODE 4096 contains This Is A Test
  *
</pre>

---+++ *UC_FROM_UCODE*
<br />Converts the given string from UTF-32 encoding to the specified encoding.

Arguments:
   * target - shared RAW string, transcoded value is written here
   * source - UTF-32 string, specifies the UTF-32 string to transcode
   * encoding - RAW string, desired encoding
   * action - RAW string, action to take on conversion error (see below)
   * option - UTF-32 string, escape type or substitution string (see below)
   * status - shared uint32, set to ICU error code if conversion fails
   * length - shared uint32, returns the length of the target field

You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a UTF-32 character cannot be transcoded into the specified encoding). "action" can be one of the following:
   * STOP - conversion stops on the first error
   * SKIP - conversion skips the offending character
   * SUBS - UC_FROM_UCODE() substitutes the string specified by "option" in place of the offending character (note: "option" must specify a UTF-32 string which can be converted into the specified encoding)
   * ESCAPE - the offending character is escaped according to the (first character of the) value of "option", as shown below
      * C - specifies C-style escaping (\uXXXX or \UXXXXXXXX)
      * STYLE - specifies CSS2 escaping (\XXXXXX )
      * JAVA - specifies Java escaping (\uXXXX)
      * UNICODE - specifies Unicode escaping {U+XXXXX}
      * DECIMAL - specifies XML decimal escaping (&amp;#DDDD;)
      * X - specifies XML hex escaping (&amp;#xXXXX;)

Example:
<pre>
  SET      TST WORK UCODE 32             =  This is a test
  *
  PASS     --- TEMP 32                   FIELD  SHARE? Y
  PASS     TST WORK UCODE 32             FIELD  SHARE? Y
  PASS     UTF-8                         FIELD  SHARE? N
  PASS     ESCAPE                        FIELD  SHARE? N
  PASS     C                             FIELD  SHARE? N
  PASS     --- SI                        FIELD  SHARE? Y
  PASS     --- LI                        FIELD  SHARE? Y
  CALL     .UC_FROM_UCODE                RESIDENT? N  END? N  FAIL 0
  *
  *        --- TEMP 32 contains "This is a test" in UTF-8 encoding
  *
</pre>

Note that your target field should be a RAW alpha 4 times the size (in characters) of your UTF-32 field.

---+++ *UC_TO_UCODE*
<br />Converts a string from the specified encoding to UTF-32.

Arguments:
   * target - shared UTF-32 string, transcoded value is written here
   * source - RAW string, specifies the string to transcode
   * encoding - RAW string, encoding of the source string
   * action - RAW string, action to take on conversion error (see below)
   * option - UTF-32 string, escape type or substitution string (see below)
   * status - shared uint32, set to ICU error code if conversion fails
   * length - shared uint32, returns the length of the target field
   * source length - the length of the source text (performance boost)

You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a codepage character cannot be transcoded into UTF-32). "action" can be one of the following:
   * STOP - conversion stops on the first error
   * SKIP - conversion skips the offending character
   * SUBS - UC_TO_UCODE() substitutes the string specified by "option" in place of the offending character
   * ESCAPE - the offending character is escaped according to the (first character of the) value of "option", as shown below
      * C - specifies C-style escaping (\xXXXX)
      * STYLE - specifies CSS2 escaping (\XXXXXX )
      * JAVA - specifies Java escaping (\uXXXX)
      * UNICODE - specifies Unicode escaping (U+XXXXX)
      * DECIMAL - specifies XML decimal escaping (&amp;#DDDD;)
      * X - specifies XML hex escaping (&amp;#xXXXX;)

Example:
<pre>
  PASS     --- TEMP 32K                  FIELD  SHARE? Y
  PASS     TST WORK RAW 32K              FIELD  SHARE? Y
  PASS     UTF-16LE                      FIELD  SHARE? N
  PASS     ESCAPE                        FIELD  SHARE? N
  PASS     DECIMAL                       FIELD  SHARE? N
  PASS     --- SI                        FIELD  SHARE? Y
  PASS     --- LI                        FIELD  SHARE? Y
  PASS     --- II                        FIELD  SHARE? N
  CALL     .UC_TO_UCODE                  RESIDENT? N  END? N  FAIL 0
  *
  *        TEMP 32K contains UTF-32 version of TST WORK RAW 32K
  *        (which was encoded in UTF-16LE form)
</pre>

Note that your target field should be a UTF-32 field with the same number of *characters* as your source field. You can use a RAW alpha, but its size in bytes must be 4 times the number of characters in the source field, plus 1. For example, to convert TEMP 8k (8192 characters), your target RAW alpha field must be (4*8192)+1 = 32769 bytes or larger.

---+++ *UC_ENUMERATE_CNV*
*THIS IS WRONG, see the 0-app routine .ENV GET ENCODINGS for current usage.*
<br />Returns the name of an encoding which can be specified when calling UC_TO_UCODE or UC_FROM_UCODE. To obtain the name of each encoding supported by the ICU library, initialize a uint32 (--- AI for example) to 0, call UC_ENUMERATE_CNV, save the name returned in the target argument, increment the uint32, and repeat until the CALL statement sets the next T/F flag to F.

Arguments:
   * target - shared RAW string, the name of an encoding is written here
   * iterator - uint32, a number indicating which encoding to enumerate

Example:
<pre>
  LABEL    :GET NEXT
  PASS     --- WORK RAW 132              FIELD  SHARE? Y
  PASS     --- AI                        FIELD  SHARE? N
  CALL     .UC_ENUMERATE_CNV             RESIDENT? N  END? N  FAIL 0
  *
  *        --- WORK RAW 132 contains the name of an encoding
  *
  COMPUTE  --- AI + 1
T GOTO     :GET NEXT
</pre>

---+++ *UC_CHAR_NAME*
<br />Returns the name of the first character in a UTF-32 string.

Arguments:
   * target - shared RAW string, the name of the character is written here
   * source - UTF-32 string, specifies the character of interest
   * status - shared uint32, set to ICU error code if an error occurs

Example:
<pre>
  SET      --- CI                        =  8364
  PASS     --- WORK RAW 30               FIELD  SHARE? Y
  PASS     --- CI                        FIELD  SHARE? Y
  PASS     --- AI                        FIELD  SHARE? Y
  CALL     .UC_CHAR_NAME                 RESIDENT? N  END? N  FAIL 0
  *
  *        --- WORK RAW 30 contains "EURO SIGN"
  *
</pre>

---+++ *UC_CHAR_BY_NAME*
<br />Returns the character whose name is specified.

Arguments:
   * target - shared UTF-32 string, the requested character is written into the first character position (the rest of the string is unchanged)
   * name - name of the character of interest
   * status - shared uint32, set to ICU error code if an error occurs

Example:
<pre>
  PASS     TST WORK UCODE 1              FIELD  SHARE? Y
  PASS     COMMERCIAL AT                 FIELD  SHARE? N
  PASS     --- AI                        FIELD  SHARE? Y
  CALL     .UC_CHAR_BY_NAME              RESIDENT? N  END? N  FAIL 0
  *
  *        WORK UCODE 1 contains "@"
  *
</pre>

---+++ *UC_ERRORCODE*
<br />Returns a programmer-friendly interpretation of an ICU error code (note: this does not return a message suitable for an end-user).

Arguments:
   * target - shared RAW string, set to the text form of the given error code (U_ZERO_ERROR, U_BUFFER_OVERFLOW, ...)
   * status - uint32, a numeric error code returned by some other UC_xxx function

Example:
<pre>
  PASS     TST WORK UCODE 1              FIELD  SHARE? Y
  PASS     SILLY NAME                    FIELD  SHARE? N
  PASS     --- AI                        FIELD  SHARE? Y
  CALL     .UC_CHAR_BY_NAME              RESIDENT? N  END? N  FAIL 0
F PASS     --- WORK RAW 32               FIELD  SHARE? Y
F PASS     --- AI                        FIELD  SHARE? N
F CALL     .UC_ERRORCODE                 RESIDENT? N  END? N  FAIL 0
  *
  *        --- WORK RAW 32 contains "U_ILLEGAL_CHAR_FOUND"
  *
</pre>

---++ Test Plan

Do the above callable functions produce the results stated?
   * Start by using the example code and checking the results.
   * Then try different values and/or data types to see if the correct result is produced in each case.

---++ Bugs
Topic revision: r10 - 2018-01-03 - JeanNeron
Copyright © 2008-2025 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.