ILF Callable Unicode Engine Functions

Overview

UC_STRCASECMP


Performs a case-insensitive comparison of two UTF-32 values, returning 0 if the strings should be considered equal, a negative number if the first value is "less than" the second value, and a positive number if the first value is "greater than" the second value.

Arguments:

  • result - shared uint32, set to comparison result as described above
  • left - UTF-32 string, left string value
  • right - UTF-32 string, right string value
  • status - shared uint32, set to ICU error code if comparison fails
Example:

      SET     TST WORK UCODE 64 X 16    001 =     ABC123
      SET     TST WORK UCODE 64 X 16    002 =     Abc1234
      *
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    TST WORK UCODE 64 X 16    001 FIELD           SHARE? N
      PASS    TST WORK UCODE 64 X 16    002 FIELD           SHARE? N
      PASS    --- BI                        FIELD           SHARE? Y
      CALL        .UC_STRCASECMP            RESIDENT? N END? N FAIL 0
      *
      *       --- AI contains a negative number
      *
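
The return convention above (zero, negative, positive) can be illustrated outside ILF. The following is a minimal Python sketch, not the engine's implementation: the helper name `strcasecmp` is hypothetical, and it assumes that Python's `str.casefold()` is a close enough stand-in for the case folding ICU applies.

```python
def strcasecmp(left: str, right: str) -> int:
    """Three-way, case-insensitive comparison (hypothetical helper).

    Returns 0 if the strings fold to the same text, a negative number if
    left sorts before right, and a positive number otherwise.
    """
    # Assumption: casefold() approximates ICU case folding for this purpose.
    a, b = left.casefold(), right.casefold()
    # (a > b) - (a < b) yields 1, 0 or -1 without branching.
    return (a > b) - (a < b)


print(strcasecmp("ABC123", "Abc1234"))  # -1, matching the ILF example above
print(strcasecmp("abc", "ABC"))         # 0
```

Only the sign of the result is meaningful; the sketch does not try to reproduce the exact magnitude that UC_STRCASECMP stores in its result argument.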

UC_LEN


Returns the number of characters (not bytes) in the given string, excluding trailing spaces.

Arguments:

  • string - UTF-32 string, value to measure
  • length - length of the string (a performance aid that reduces the time needed to determine the string length)
NOTE: the length is returned in --- RETURN CODE; the T/F flag returned by this call is meaningless.

Example:

      SET     TST WORK UCODE 4096           =     ABC123XYZ
      PASS    TST WORK UCODE 4096           FIELD           SHARE? N
      PASS    --- XI                        FIELD           SHARE? N
      CALL        .UC_LEN                   RESIDENT? N END? N FAIL 0
      *
      *       --- RETURN CODE is set to 9
      *
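
To make the "characters, not bytes" and "excluding trailing spaces" rules concrete, here is a small Python sketch. It is an illustration only: the helper name `uc_len` is hypothetical, and padding the value with spaces to the field width is an assumption about how the field is stored.

```python
def uc_len(field: str) -> int:
    """Count characters in a field, ignoring trailing spaces (hypothetical helper)."""
    return len(field.rstrip(" "))


# Assumption: a 4096-character field holding ABC123XYZ is space-padded.
padded = "ABC123XYZ".ljust(4096)
print(uc_len(padded))  # 9

# Characters versus bytes: the euro sign is one character but three UTF-8 bytes.
euro = "\u20ac5"
print(uc_len(euro), len(euro.encode("utf-8")))  # 2 4
```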

UC_UCASE


Converts given UTF-32 string to upper-case according to the specified locale.

Arguments:

  • target - shared UTF-32 string, upper-case version is written here
  • source - UTF-32 string, string to convert to upper-case
  • locale - RAW string, locale name
  • status - shared uint32, set to ICU error code if conversion fails
  • length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_UCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains ABC123XYZMNOP
      *

UC_LCASE


Converts given UTF-32 string to lower-case according to the specified locale.

Arguments:

  • target - shared UTF-32 string, lower-case version is written here
  • source - UTF-32 string, string to convert to lower-case
  • locale - RAW string, locale name
  • status - shared uint32, set to ICU error code if conversion fails
  • length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     abc123XYZmnop
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_LCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains abc123xyzmnop
      *

UC_TCASE


Converts given UTF-32 string to title-case according to the specified locale.

Arguments:

  • target - shared UTF-32 string, title-case version is written here
  • source - UTF-32 string, string to convert to title-case
  • locale - RAW string, locale name
  • status - shared uint32, set to ICU error code if conversion fails
  • length - length of source string (for performance)
NOTE: you may safely pass the same field for both target and source.

Example:

      SET     TST WORK UCODE 32             =     This is a test
      PASS    TST WORK UCODE 4096           FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? N
      PASS        en                        FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? N
      CALL        .UC_TCASE                 RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 4096 contains This Is A Test
      *
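
The expected results of the three case-conversion examples (UC_UCASE, UC_LCASE, UC_TCASE) can be checked with a short Python sketch. This is an approximation under a stated assumption: Python's `upper()`, `lower()` and `title()` are locale-independent, whereas the functions above take a locale argument, so results can differ for locales with special casing rules (Turkish dotted and dotless i, for example).

```python
source = "abc123XYZmnop"

print(source.upper())             # ABC123XYZMNOP
print(source.lower())             # abc123xyzmnop
print("This is a test".title())   # This Is A Test

# Python ignores locale here: "i" always upper-cases to "I",
# while a Turkish locale would expect a dotted capital I instead.
print("i".upper())                # I
```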
 

UC_FROM_UCODE


Converts the given string from UTF-32 encoding to the specified encoding.

Arguments:

  • target - shared RAW string, transcoded value is written here
  • source - UTF-32 string, specifies the UTF-32 string to transcode
  • encoding - RAW string, desired encoding
  • action - RAW string, action to take on conversion error (see below)
  • option - UTF-32 string, escape type or substitution string (see below)
  • status - shared uint32, set to ICU error code if conversion fails
  • length - shared uint32, returns the length of the target field.
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a UTF-32 character cannot be transcoded into the specified encoding). "action" can be one of the following:
  • STOP - conversion stops on first error
  • SKIP - conversion skips the offending character
  • SUBS - UC_FROM_UCODE() substitutes the string specified by "option" in place of the offending character (note: "option" must specify a UTF-32 string which can be converted into the specified encoding)
  • ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
    • C - specifies C-style escaping (\uXXXX or \UXXXXXXXX)
    • STYLE - specifies CSS2 escaping (\XXXXXX )
    • JAVA - specifies Java escaping (\uXXXX)
    • UNICODE - specifies Unicode escaping {U+XXXXX}
    • DECIMAL - specifies XML decimal escaping (&#DDDD;)
    • X - specifies XML hex escaping (&#xXXXX;)
Example:

      SET     TST WORK UCODE 32             =     This is a test
      *
      PASS    --- TEMP 32                   FIELD           SHARE? Y
      PASS    TST WORK UCODE 32             FIELD           SHARE? Y
      PASS        UTF-8                     FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        C                         FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      CALL        .UC_FROM_UCODE            RESIDENT? N END? N FAIL 0
      *
      *       --- TEMP 32 contains "This is a test" in UTF-8 encoding
      *

Note that your target field should be a RAW alpha 4 times the size (in characters) of your UTF-32 field.
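
The STOP, SKIP, SUBS and ESCAPE actions have rough counterparts in Python's codec error handlers, which makes their effect easy to see. The sketch below is an analogy under stated assumptions, not the engine's behavior: it encodes to ASCII so that an error actually occurs (UTF-8, as used in the ILF example, can represent every Unicode character), Python's `replace` handler always substitutes `?` rather than a caller-supplied string, and `strict` raises an exception instead of setting a status code.

```python
text = "price: \u20ac5"  # U+20AC EURO SIGN cannot be encoded in ASCII

# SKIP analogue: the offending character is dropped.
skipped = text.encode("ascii", errors="ignore")

# SUBS analogue: Python substitutes a fixed "?"; UC_FROM_UCODE uses "option".
substituted = text.encode("ascii", errors="replace")

# ESCAPE + DECIMAL analogue: XML decimal character reference.
xml_decimal = text.encode("ascii", errors="xmlcharrefreplace")

# ESCAPE + JAVA analogue: backslash-u escape (Python emits lowercase hex).
backslash = text.encode("ascii", errors="backslashreplace")

# STOP analogue: conversion fails at the first offending character.
try:
    text.encode("ascii", errors="strict")
    stopped = False
except UnicodeEncodeError:
    stopped = True

print(skipped)      # b'price: 5'
print(substituted)  # b'price: ?5'
print(xml_decimal)  # b'price: &#8364;5'
print(backslash)    # b'price: \\u20ac5'
print(stopped)      # True
```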

UC_TO_UCODE


Converts a string from the specified encoding to UTF-32.

Arguments:

  • target - shared UTF-32 string, transcoded value is written here
  • source - RAW string, specifies the string to transcode
  • encoding - RAW string, encoding of the source string
  • action - RAW string, action to take on conversion error (see below)
  • option - UTF-32 string, escape type or substitution string (see below)
  • status - shared uint32, set to ICU error code if conversion fails
  • length - shared uint32, returns the length of the target field.
  • source length - the length of the source text (performance boost)
You can use "action" to control the behavior of this function if a conversion error occurs (in this function, an error occurs when a codepage character cannot be transcoded into UTF-32). "action" can be one of the following:
  • STOP - conversion stops on first error
  • SKIP - conversion skips the offending character
  • SUBS - UC_TO_UCODE() substitutes the string specified by "option" in place of the offending character
  • ESCAPE - offending character is escaped according to the (first character of the) value of "option", as shown below
    • C - specifies C-style escaping (\xXXXX)
    • STYLE - specifies CSS2 escaping (\XXXXXX )
    • JAVA - specifies Java escaping (\uXXXX)
    • UNICODE - specifies Unicode escaping (U+XXXXX)
    • DECIMAL - specifies XML decimal escaping (&#DDDD;)
    • X - specifies XML hex escaping (&#xXXXX;)
Example:

      PASS    --- TEMP 32K                  FIELD           SHARE? Y
      PASS    TST WORK RAW 32K              FIELD           SHARE? Y
      PASS        UTF-16LE                  FIELD           SHARE? N
      PASS        ESCAPE                    FIELD           SHARE? N
      PASS        DECIMAL                   FIELD           SHARE? N
      PASS    --- SI                        FIELD           SHARE? Y
      PASS    --- LI                        FIELD           SHARE? Y
      PASS    --- II                        FIELD           SHARE? N
      CALL        .UC_TO_UCODE              RESIDENT? N END? N FAIL 0
      *
      *       --- TEMP 32K contains UTF-32 version of TST WORK RAW 32K
      *       (which was encoded in UTF-16LE form)

Note that your target field should be a UTF-32 field with the same number of characters as your source field. You can use a RAW alpha instead, but its size in bytes must be at least 4 times the number of characters in the source field, plus 1. For example, to convert TEMP 8K (8192 characters), your target RAW alpha field must be (4*8192)+1 = 32769 bytes or larger.
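
The sizing rule can be written as a one-line calculation. The Python sketch below is illustrative only: the helper name `raw_target_bytes` is hypothetical, and the purpose of the extra byte is not stated in this topic, so the code simply applies the rule as given. The second half shows why the factor is 4: UTF-32 stores every character in exactly four bytes.

```python
def raw_target_bytes(source_chars: int) -> int:
    """Minimum RAW alpha size in bytes: 4 bytes per character, plus 1."""
    return 4 * source_chars + 1


print(raw_target_bytes(8192))  # 32769

# Round trip comparable to the UC_TO_UCODE example: UTF-16LE in, UTF-32 out.
sample = "This is a test"
utf16 = sample.encode("utf-16-le")
utf32 = utf16.decode("utf-16-le").encode("utf-32-le")
print(len(sample), len(utf32))  # 14 56
```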

UC_ENUMERATE_CNV

WARNING: the description below is wrong; see the 0-app routine .ENV GET ENCODINGS for current usage.


Returns the name of an encoding which can be specified when calling UC_TO_UCODE or UC_FROM_UCODE. To obtain the name of each encoding supported by the ICU library, initialize a uint32 (--- AI, for example) to 0, call UC_ENUMERATE_CNV, save the name returned in the target argument, increment the uint32, and repeat until the CALL statement sets the next T/F flag to F.

Arguments:

  • target - shared RAW string, the name of an encoding is written here
  • iterator - uint32, a number indicating which encoding to enumerate
Example:

      LABEL   :GET NEXT
      PASS    --- WORK RAW 132              FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? N
      CALL        .UC_ENUMERATE_CNV         RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 132 contains the name of an encoding
      *
      COMPUTE --- AI                        +     1
T     GOTO    :GET NEXT

UC_CHAR_NAME


Returns the name of the first character in a UTF-32 string.

Arguments:

  • target - shared RAW string, the name of the character is written here
  • source - UTF-32 string, specifies the character of interest
  • status - shared uint32, set to ICU error code if an error occurs
Example:

      SET     --- CI                        =     8364
      PASS    --- WORK RAW 30               FIELD           SHARE? Y
      PASS    --- CI                        FIELD           SHARE? Y
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_NAME             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 30 contains "EURO SIGN"
      *
 

UC_CHAR_BY_NAME


Returns the character whose name is specified.

Arguments:

  • target - shared UTF-32 string, the requested character is written into the first character position (the rest of string is unchanged)
  • name - name of the character of interest
  • status - shared uint32, set to ICU error code if an error occurs.
Example:

      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        COMMERCIAL AT             FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
      *
      *       WORK UCODE 1 contains "@"
      *
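
The name lookups performed by UC_CHAR_NAME and UC_CHAR_BY_NAME can be reproduced with Python's standard `unicodedata` module, which is a convenient way to check the expected values before testing the ILF calls. This sketch assumes that both Python and the engine report standard Unicode character names; Python signals an unknown name with `KeyError`, whereas the functions above set a status code.

```python
import unicodedata

# Code point 8364 (U+20AC), as in the UC_CHAR_NAME example.
print(unicodedata.name(chr(8364)))          # EURO SIGN

# Name to character, as in the UC_CHAR_BY_NAME example.
print(unicodedata.lookup("COMMERCIAL AT"))  # @

# An unknown name fails the lookup.
try:
    unicodedata.lookup("SILLY NAME")
    found = True
except KeyError:
    found = False
print(found)                                # False
```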

UC_ERRORCODE


Returns a programmer-friendly interpretation of an ICU error code (note: this does not return a message suitable for an end-user).

Arguments:

  • target - shared RAW string, set to the text form of the given error code (U_ZERO_ERROR, U_BUFFER_OVERFLOW, ...)
  • status - uint32, a numeric error code returned by some other uc_xxx function
Example:

      PASS    TST WORK UCODE 1              FIELD           SHARE? Y
      PASS        SILLY NAME                FIELD           SHARE? N
      PASS    --- AI                        FIELD           SHARE? Y
      CALL        .UC_CHAR_BY_NAME          RESIDENT? N END? N FAIL 0
F     PASS    --- WORK RAW 32               FIELD           SHARE? Y
F     PASS    --- AI                        FIELD           SHARE? N
F     CALL        .UC_ERRORCODE             RESIDENT? N END? N FAIL 0
      *
      *       --- WORK RAW 32 contains "U_ILLEGAL_CHAR_FOUND"
      *

Test Plan

Do the above callable functions produce the results stated?

  • Start by using the example code and checking the results
  • Then try different values and/or data types to see if the correct result is produced in each case.

Bugs

Topic revision: r10 - 2018-01-03 - JeanNeron