UnicodeTranscodingFn < Main

---+ ILF Callable Unicode Engine Functions

%TOC%
---++ Overview
---+++ *UC_STRCASECMP*

<br />Performs a case-insensitive comparison of two UTF-32 strings, returning 0 if the strings should be considered equal, a negative number if the first string is "less than" the second, and a positive number if the first string is "greater than" the second.

Arguments:
   * result - shared uint32, set to comparison result as described above
   * left - UTF-32 string, left string value
   * right - UTF-32 string, right string value
   * status - shared uint32, set to ICU error code if comparison fails
Example:

<pre>      SET     TST WORK UCODE 64 X 16    001 =     ABC123
      SET     TST WORK UCODE 64 X 16    002 =     Abc1234
      *
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    TST WORK UCODE 64 X 16    001 FIELD           SHARE? N
      PASS    TST WORK UCODE 64 X 16    002 FIELD           SHARE? N
      PASS    --- BI                        FIELD           SHARE? Y
      CALL        .UC_STRCASECMP            RESIDENT? N END? N FAIL 0
      *
      *       --- AI contains a negative number
      *
</pre>
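
The return convention can be illustrated outside ILF with a short Python sketch. This is not how UC_STRCASECMP is implemented; the function name =uc_strcasecmp= is hypothetical and Python's =str.casefold()= is assumed here as a stand-in for ICU's case-insensitive comparison, which may order some strings differently.

```python
def uc_strcasecmp(left: str, right: str) -> int:
    """Hypothetical stand-in: negative, zero, or positive, like UC_STRCASECMP."""
    # Case-fold both sides so that "ABC" and "abc" compare as equal.
    a, b = left.casefold(), right.casefold()
    # (a > b) - (a < b) evaluates to -1, 0, or 1.
    return (a > b) - (a < b)


print(uc_strcasecmp("ABC123", "Abc1234"))  # -1 (left is a shorter prefix of right)
print(uc_strcasecmp("ABC123", "abc123"))   # 0
print(uc_strcasecmp("Abc1234", "ABC123"))  # 1
```

Only the sign of the result is meaningful; callers should test for negative, zero, or positive rather than for a specific value.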

---+++ *UC_LEN*

<br />Returns the number of characters (not bytes) in the given string, not counting trailing spaces.

Arguments:
   * string - UTF-32 string, value to measure
   * length - length of string (performance enhancement to reduce time to determine string length)
NOTE: the measured length is returned in --- RETURN CODE; the T/F flag returned by this call is meaningless.

Example:

<pre>      SET     TST WORK UCODE 4096           =     ABC123XYZ
      PASS    TST WORK UCODE 4096           FIELD           SHARE? N
      PASS    --- XI                        FIELD           SHARE? N
      CALL        .UC_LEN                   RESIDENT? N END? N FAIL 0
      *
      *       --- RETURN CODE is set to 9
      *
</pre>
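
The difference between characters and bytes can be sketched in Python. The function name =uc_len= and the space-padded 4096-character field are assumptions made for illustration; the sketch only mirrors the counting rule described above.

```python
def uc_len(value: str) -> int:
    """Hypothetical stand-in: count characters, ignoring trailing spaces."""
    return len(value.rstrip(" "))


# Assume a fixed-width field that is padded with spaces to 4096 characters.
field = "ABC123XYZ".ljust(4096)

print(uc_len(field))                    # 9 characters of real content
print(len(field.encode("utf-32-le")))   # 16384 bytes of storage (4 per character)
```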

---+++ *UC_UCASE*

<br />Converts given UTF-32 string to upper-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, upper-case version is written here
   * source - UTF-32 string, string to convert to upper-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

<pre>      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_UCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains ABC123XYZMNOP
      *
</pre>

---+++ *UC_LCASE*

<br />Converts given UTF-32 string to lower-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, lower-case version is written here
   * source - UTF-32 string, string to convert to lower-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

<pre>      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_LCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains abc123xyzmnop
      *
</pre>

---+++ *UC_TCASE*

<br />Converts given UTF-32 string to title-case according to the specified locale.

Arguments:
   * target - shared UTF-32 string, title-case version is written here
   * source - UTF-32 string, string to convert to title-case
   * locale - RAW string, locale name
   * status - shared uint32, set to ICU error code if conversion fails
   * length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

<pre>      SET     TST WORK UCODE 32             =     This is a test
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_TCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains This Is A Test
      *
 </pre>
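
The expected results of the three case-conversion examples can be checked with a small Python sketch. Python's =upper()=, =lower()=, and =title()= are assumed here only as approximations: they ignore the locale argument, whereas the UC_ functions apply locale-specific rules.

```python
source = "abc123XYZmnop"

print(source.upper())             # ABC123XYZMNOP
print(source.lower())             # abc123xyzmnop
print("This is a test".title())   # This Is A Test

# Locale matters for some letters: Python always maps "i" to "I",
# while a locale-aware conversion under a Turkish locale would produce U+0130.
print("i".upper())                # I
```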

---+++ *UC_FROM_UCODE*

<br />Converts the given string from UTF-32 encoding to the specified encoding.

Arguments:
   * target - shared RAW string, transcoded value is written here
   * source - UTF-32 string, the string to transcode
   * encoding - RAW string, desired encoding
   * action - RAW string, action to take on conversion error (see below)
   * option - UTF-32 string, escape type or substitution string (see below)
   * status - shared uint32, set to ICU error code if conversion fails
   * length - shared uint32, returns the length of the target field.
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a UTF-32 character cannot be transcoded into the specified encoding). "action" can be one of the following:
   * STOP - conversion stops on first error
   * SKIP - conversion skips the offending character
   * SUBS - UC_FROM_UCODE() substitutes the string specified by "option" in place of the offending character (note: "option" must specify a UTF-32 string which can be converted into the specified encoding)
   * ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
      * C - specifies C-style escaping (\uXXXX or \UXXXXXXXX)
      * STYLE - specifies CSS2 escaping (\XXXXXX )
      * JAVA - specifies Java escaping (\uXXXX)
      * UNICODE - specifies Unicode escaping {U+XXXXX}
      * DECIMAL - specifies XML decimal escaping (&#DDDD;)
      * X - specifies XML hex escaping (&#xXXXX;)
Example:

<pre>      SET     TST WORK UCODE 32             =     This is a test
      *
      PASS    --- TEMP 32                   FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? Y
      PASS        UTF-8                     FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        C                         FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      CALL        .UC_FROM_UCODE            RESIDENT? N END? N FAIL 0
      *
      *       --- TEMP 32 contains "This is a test" in UTF-8 encoding
      *
</pre>

Note that your target field should be a RAW alpha 4 times the size (in characters) of your UTF-32 field.
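
The effect of the "action" values can be approximated with Python's built-in codec error handlers. This is an analogy only: the mapping table and the =from_ucode= helper are hypothetical, and the handlers are not the ICU callbacks behind UC_FROM_UCODE (for example, Python's =replace= handler always substitutes "?", whereas SUBS uses the string given in "option").

```python
# Rough Python analogues of the "action" values (assumed for illustration).
ACTION_TO_HANDLER = {
    "STOP": "strict",                       # fail on the first unencodable character
    "SKIP": "ignore",                       # drop the offending character
    "SUBS": "replace",                      # substitute a fixed "?" character
    "ESCAPE/DECIMAL": "xmlcharrefreplace",  # XML decimal escape
    "ESCAPE/C": "backslashreplace",         # backslash-style escape
}


def from_ucode(text: str, encoding: str, action: str) -> bytes:
    """Hypothetical helper: encode text, handling errors per the chosen action."""
    return text.encode(encoding, errors=ACTION_TO_HANDLER[action])


text = "Price: \u20ac5"  # the euro sign cannot be represented in ASCII

for action in ("SKIP", "SUBS", "ESCAPE/DECIMAL", "ESCAPE/C"):
    print(action, from_ucode(text, "ascii", action))

try:
    from_ucode(text, "ascii", "STOP")
except UnicodeEncodeError as exc:
    print("STOP", "failed at position", exc.start)
```

When the target encoding can represent every character (UTF-8, for instance), no action is ever triggered and the choice has no effect on the output.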
---+++ *UC_TO_UCODE*

<br />Converts a string from the specified encoding to UTF-32.

Arguments:
   * target - shared UTF-32 string, transcoded value is written here
   * source - RAW string, the string to transcode
   * encoding - RAW string, encoding of the source string
   * action - RAW string, action to take on conversion error (see below)
   * option - UTF-32 string, escape type or substitution string (see below)
   * status - shared uint32, set to ICU error code if conversion fails
   * length - shared uint32, returns the length of the target field.
   * source length - the length of the source text (performance boost)
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a codepage character cannot be transcoded into UTF-32). "action" can be one of the following:
   * STOP - conversion stops on first error
   * SKIP - conversion skips the offending character
   * SUBS - UC_TO_UCODE() substitutes the string specified by "option" in place of the offending character
   * ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
      * C - specifies C-style escaping (\xXXXX)
      * STYLE - specifies CSS2 escaping (\XXXXXX )
      * JAVA - specifies Java escaping (\uXXXX)
      * UNICODE - specifies Unicode escaping (U+XXXXX)
      * DECIMAL - specifies XML decimal escaping (&#DDDD;)
      * X - specifies XML hex escaping (&#xXXXX;)
Example:

<pre>      PASS    --- TEMP 32K                      FIELD           SHARE? Y
      PASS    TST WORK RAW 32K                  FIELD           SHARE? Y
      PASS        UTF-16LE                  FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        DECIMAL                   FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      PASS    --- II                        FIELD           SHARE? N
      CALL        .UC_TO_UCODE              RESIDENT? N END? N FAIL 0
      *
      *      --- TEMP 32K contains UTF-32 version of TST WORK RAW 32K 
      *      (which was encoded in UTF-16LE form)
</pre>

Note that your target field should be a UTF-32 field with the same number of *characters* as your source field. You can use a RAW alpha, but it must be 4 times the number of characters in the source field, plus 1, in bytes. For example, to convert TEMP 8k (8192 characters), your target RAW alpha field must be (4*8192)+1 = 32769 bytes or larger.
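
The sizing rule for a RAW target can be written out as a small Python sketch. The helper name =raw_target_bytes= is hypothetical; the formula is taken directly from the note above, and the second part assumes Python's codecs as a stand-in for the actual transcoding.

```python
def raw_target_bytes(source_chars: int) -> int:
    """Bytes needed in a RAW target: 4 per UTF-32 character, plus 1."""
    return 4 * source_chars + 1


print(raw_target_bytes(8192))  # 32769

# The 4-bytes-per-character cost, shown with a UTF-16LE source string.
source = "h\u00e9llo".encode("utf-16-le")                 # 5 characters, 10 bytes
utf32 = source.decode("utf-16-le").encode("utf-32-le")    # 5 characters, 20 bytes
print(len(source), len(utf32))  # 10 20
```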
---+++ *UC_ENUMERATE_CNV*

*THIS IS WRONG, see the 0-app routine .ENV GET ENCODINGS for current usage.*

<br />Returns the name of an encoding that can be specified when calling UC_TO_UCODE or UC_FROM_UCODE. To obtain the name of each encoding supported by the ICU library, initialize a uint32 (--- AI, for example) to 0, call UC_ENUMERATE_CNV, save the name returned in the target argument, increment the uint32, and repeat until the CALL statement sets the next T/F flag to F.

Arguments:
   * target - shared RAW string, the name of an encoding is written here
   * iterator - uint32, a number indicating which encoding to enumerate
Example:

<pre>      LABEL   :GET NEXT
      PASS    --- WORK RAW 132              FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? N
      CALL        .UC_ENUMERATE_CNV         RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 132 contains the name of an encoding
      *
      COMPUTE --- AI                        +     1
T     GOTO    :GET NEXT
</pre>

---+++ *UC_CHAR_NAME*

<br />Returns the name of the first character in a UTF-32 string.

Arguments:
   * target - shared RAW string, the name of the character is written here
   * source - UTF-32 string, specifies the character of interest
   * status - shared uint32, set to ICU error code if an error occurs
Example:

<pre>      SET     --- CI                        =     8364
      PASS    --- WORK RAW 30               FIELD           SHARE? Y
      PASS    --- CI                        FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_NAME             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 30 contains "EURO SIGN"
      *
 </pre>
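
The value 8364 used in the example is the decimal form of U+20AC. As a cross-check outside ILF, Python's standard =unicodedata= module (assumed here as an independent source of Unicode character names, not as part of the UC_ functions) reports the same name:

```python
import unicodedata

code_point = 8364  # same value as the SET statement in the example

print(hex(code_point))                     # 0x20ac
print(unicodedata.name(chr(code_point)))   # EURO SIGN
```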

---+++ *UC_CHAR_BY_NAME*

<br />Returns the character whose name is specified.

Arguments:
   * target - shared UTF-32 string, the requested character is written into the first character position (the rest of string is unchanged)
   * name - name of the character of interest
   * status - shared uint32, set to ICU error code if an error occurs.
Example:

<pre>      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        COMMERCIAL AT             FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 1 contains "@"
      *
</pre>
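
The reverse lookup can be cross-checked the same way with Python's standard =unicodedata= module. This is only an illustration of name-based lookup; an unknown name raises =KeyError= in Python, whereas UC_CHAR_BY_NAME reports failure through its status argument.

```python
import unicodedata

char = unicodedata.lookup("COMMERCIAL AT")
print(char, ord(char))  # @ 64

# An unknown name fails rather than returning a character.
try:
    unicodedata.lookup("SILLY NAME")
except KeyError as exc:
    print("lookup failed:", exc)
```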

---+++ *UC_ERRORCODE*

<br />Returns a programmer-friendly interpretation of an ICU error code (note: this does not return a message suitable for an end-user).

Arguments:
   * target - shared RAW string, set to the text form of the given error code (U_ZERO_ERROR, U_BUFFER_OVERFLOW_ERROR, ...)
   * status - uint32, a numeric error code returned by some other uc_xxx function
Example:

<pre>      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        SILLY NAME                FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
F     PASS    --- WORK RAW 32               FIELD           SHARE? Y
F     PASS    --- AI                        FIELD           SHARE? N
F     CALL        .UC_ERRORCODE             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 32 contains "U_ILLEGAL_CHAR_FOUND"
      *
</pre>

---++ Test Plan

Do the above callable functions produce the results stated?

   * Start by using the example code and checking the results
   * Then try different values and/or data types to see if the correct result is produced in each case.
---++ Bugs
Topic revision: r10 - 2018-01-03 - JeanNeron