Difference: UnicodeSETStmt (3 vs. 4)

Line: 1 to 1

META TOPICPARENT	name="UnicodeTestPlan"

SET Statement

Line: 14 to 13

Changed:

<
<

- Each byte of data in the RAW alpha source field is transcoded from an 8859-15 character to the equivalent UTF-32 character in the range U+0000 thru U+00FF and is then set into the corresponding character position in the UTF-32 encoded destination field.

>
>

- Each byte of data in the RAW alpha source field is transcoded from an 8859-15 character (for RAW) or from the defined NATIONAL encoding to the equivalent UTF-32 character in the range U+0000 through U+00FF and is then set into the corresponding character position in the UTF-32 encoded destination field.
- The destination is truncated if the number of source bytes is greater than the number of destination characters.
- UTF-32 padding takes place if the number of source bytes is less than the number of destination characters.
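The three rules above can be sketched in Python. This is only an illustration of the expected behavior, not the runtime's implementation: the function name `set_raw_to_unicode`, the use of Python's `iso8859_15` codec, big-endian byte order, and U+0020 as the UTF-32 pad character are all assumptions.

```python
def set_raw_to_unicode(src: bytes, dest_chars: int,
                       encoding: str = "iso8859_15") -> bytes:
    """Model of SET from a RAW/NATIONAL alpha to a UNICODE alpha (hypothetical)."""
    # One source byte becomes one destination character.
    text = src.decode(encoding)
    # Truncate when the source is longer than the destination,
    # pad with spaces (assumed pad character) when it is shorter.
    text = text[:dest_chars].ljust(dest_chars, " ")
    # Byte order is an assumption; the test plan does not state it.
    return text.encode("utf-32-be")


if __name__ == "__main__":
    padded = set_raw_to_unicode(b"AB", 4)
    print(len(padded), repr(padded.decode("utf-32-be")))
    truncated = set_raw_to_unicode(b"ABCDE", 3)
    print(len(truncated), repr(truncated.decode("utf-32-be")))
```

Note that Python's `iso8859_15` codec maps a few bytes (for example 0xA4, the euro sign) to code points above U+00FF; whether the SET statement does the same is not stated here and is worth a test case.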

Set from Group:

To RAW Alpha or Group
- The current behavior has not been changed.
To UNICODE or NATIONAL Alpha
- This combination will not be allowed.
- The process compiler should report this as an error.
- Since many PDF fields and TEMP fields are redefined as UNICODE, this will likely break some applications.

Set from UNICODE or NATIONAL Alpha:

To RAW or NATIONAL Alpha
- Each UTF-32 character in the source field is transcoded into an 8859-15 character (for RAW) or into the defined NATIONAL encoding, in the range 0x00 to 0xFF, and is then set into the corresponding byte position in the destination field.
- The destination is truncated if the number of source characters is greater than the number of destination bytes.
- 8-bit space padding takes place if the number of source characters is less than the number of destination bytes.
To Group:
- This combination will not be allowed.
- The process compiler should report this as an error.
- Since many PDF fields and TEMP fields are redefined as UNICODE, this will likely break some applications.
To UNICODE Alpha:
- Straight copy with no transcoding.
- Expected truncation and UTF-32 padding will take place on length mismatch.
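The two allowed destinations in this section can be sketched as follows. This is a model for writing test expectations, not the actual runtime code. The function names, big-endian byte order, U+0020 as the UTF-32 pad character, and the `?` substitution for characters that have no 8859-15 equivalent are assumptions; the test plan does not say what happens to untranslatable characters.

```python
UTF32_SPACE = " ".encode("utf-32-be")  # assumed pad character: U+0020


def set_unicode_to_raw(src: str, dest_bytes: int,
                       encoding: str = "iso8859_15") -> bytes:
    """Model of SET from UNICODE alpha to RAW/NATIONAL alpha (hypothetical)."""
    # One source character becomes one destination byte.
    # errors="replace" substitutes b"?" for untranslatable characters (assumption).
    data = src.encode(encoding, errors="replace")
    # Truncate on overflow, pad with 8-bit spaces on underflow.
    return data[:dest_bytes].ljust(dest_bytes, b" ")


def set_unicode_to_unicode(src: bytes, dest_chars: int) -> bytes:
    """Model of SET from UNICODE alpha to UNICODE alpha: copy, no transcoding."""
    if len(src) % 4 != 0:
        raise ValueError("UTF-32 data must be a multiple of 4 bytes")
    # Copy whole 4-byte characters only, so truncation never splits a character.
    out = src[: 4 * dest_chars]
    missing = dest_chars - len(out) // 4
    return out + UTF32_SPACE * missing


if __name__ == "__main__":
    print(set_unicode_to_raw("ABCDE", 3))
    source = "hello".encode("utf-32-be")
    print(set_unicode_to_unicode(source, 7).decode("utf-32-be"))
```

Cases worth covering with this model: a source longer than the destination, a source shorter than the destination, and a source character outside the 8859-15 repertoire.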

Set from Unicode Literal:

To RAW Alpha or Group
- Normal literals are unchanged.
- Unicode escape sequences in literals are not interpreted.
To UNICODE or NATIONAL Alpha
- Character literals are limited to the defined 8859-15 characters.
- Each character is transcoded from 8859-15 to UTF-32.
- Unicode characters can be embedded in literals using Unicode escape sequences: \u#### or \U########.
- Truncation and UTF-32 or Space padding will take place.
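The escape-sequence handling for UNICODE destinations can be sketched as below. This is a minimal illustration that assumes exactly four hex digits after `\u` and exactly eight after `\U`; the function name `expand_unicode_literal` is hypothetical, and how the process compiler treats malformed sequences is not covered by this test plan.

```python
import re

# \u followed by 4 hex digits, or \U followed by 8 hex digits.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def expand_unicode_literal(literal: str) -> str:
    """Replace \\u#### and \\U######## escapes with the characters they name."""
    def _replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        # chr() raises ValueError for values above U+10FFFF.
        return chr(int(hex_digits, 16))

    return _ESCAPE.sub(_replace, literal)


if __name__ == "__main__":
    print(expand_unicode_literal(r"caf\u00e9"))
    print(expand_unicode_literal("no escapes here"))
```

For a RAW Alpha or Group destination the literal would be passed through without calling this function, which matches the rule that escape sequences are not interpreted there.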

Bug: Setting a Unicode alpha field equal to a Token field does not work. It appears to do a byte-by-byte set rather than a character-by-character set of the Token string. (Note: CNV TEXT does work.)