UTF-EBCDIC

Revision as of 20:59, 5 May 2024 by imported>Spitzak (→‎top)


UTF-EBCDIC is a character encoding capable of encoding all 1,112,064 valid character code points in Unicode using 1 to 5 bytes (in contrast to a maximum of 4 for UTF-8). It is meant to be EBCDIC-friendly, so that legacy EBCDIC applications on mainframes can process the characters without much difficulty. Its advantages for existing EBCDIC-based systems are similar to UTF-8's advantages for existing ASCII-based systems. UTF-EBCDIC is defined in Unicode Technical Report #16.

To produce the UTF-EBCDIC encoded version of a series of Unicode code points, an encoding based on UTF-8 (known in the specification as UTF-8-Mod) is applied first, creating what the specification calls an I8 sequence. The main difference between this encoding and UTF-8 is that it allows Unicode code points U+0080 through U+009F (the C1 control codes) to be represented as a single byte and therefore later mapped to the corresponding EBCDIC control codes. To achieve this, UTF-8-Mod uses 101xxxxx instead of 10xxxxxx as the format for trailing bytes in a multi-byte sequence. As a trailing byte can then hold only 5 bits rather than 6, the UTF-8-Mod encoding of a code point above U+03FF is longer than its UTF-8 encoding.
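The first step can be sketched in Python as follows. This is a minimal illustration, not the reference implementation: the function name `i8_encode` is hypothetical, and the byte layouts (single bytes for U+0000 to U+009F, lead bytes 110yyyyy, 1110zzzz, 11110www and 1111100w, trailing bytes 101xxxxx) are assumed from Unicode Technical Report #16. The result is the intermediate I8 sequence, before the byte lookup table is applied.

```python
def i8_encode(cp: int) -> bytes:
    """Encode one Unicode scalar value as an I8 (UTF-8-Mod) byte sequence."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")

    # U+0000..U+009F (ASCII plus the C1 controls) take a single byte.
    if cp < 0xA0:
        return bytes([cp])

    # (highest code point, total length, lead-byte marker) per sequence length.
    # These ranges are an assumption taken from UTR #16.
    forms = (
        (0x3FF, 2, 0xC0),
        (0x3FFF, 3, 0xE0),
        (0x3FFFF, 4, 0xF0),
        (0x10FFFF, 5, 0xF8),
    )
    for limit, length, lead in forms:
        if cp <= limit:
            break

    # Each trailing byte is 101xxxxx: marker 0xA0 plus 5 payload bits.
    trail = []
    for _ in range(length - 1):
        trail.append(0xA0 | (cp & 0x1F))
        cp >>= 5

    # The remaining high bits go into the lead byte.
    return bytes([lead | cp] + trail[::-1])


if __name__ == "__main__":
    for cp in (0x41, 0x9F, 0xA0, 0x400, 0x4000, 0x10FFFF):
        print(f"U+{cp:04X} -> {i8_encode(cp).hex(' ')}")
```

Under these assumptions, U+0400 needs three I8 bytes where UTF-8 needs two, which is the size penalty of carrying only 5 payload bits per trailing byte.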

The UTF-8-Mod transformation leaves the data in an ASCII-based format (for example, U+0041 "A" is still encoded as 0x41), so each byte is fed through a reversible (one-to-one) lookup table to produce the final UTF-EBCDIC encoding. For example, 0x41 in this table maps to 0xC1; thus the UTF-EBCDIC encoding of U+0041 (Unicode's "A") is 0xC1 (EBCDIC's "A").
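The mechanics of this second step can be illustrated with a 256-entry byte table and its inverse. The sketch below is not the table from the specification: `PLACEHOLDER_TABLE` is a hypothetical stand-in that is the identity everywhere except for swapping 0x41 and 0xC1, chosen only so that the one mapping mentioned above holds and the table stays one-to-one. A real implementation would load all 256 entries from Unicode Technical Report #16.

```python
def invert(table: bytes) -> bytes:
    """Build the inverse of a one-to-one 256-entry byte table."""
    if len(table) != 256 or len(set(table)) != 256:
        raise ValueError("table must be a permutation of 0..255")
    inverse = bytearray(256)
    for source, target in enumerate(table):
        inverse[target] = source
    return bytes(inverse)


# Placeholder only: identity table with 0x41 and 0xC1 swapped.
# The real I8-to-EBCDIC table is defined in UTR #16 and is NOT reproduced here.
_table = bytearray(range(256))
_table[0x41], _table[0xC1] = 0xC1, 0x41
PLACEHOLDER_TABLE = bytes(_table)
PLACEHOLDER_INVERSE = invert(PLACEHOLDER_TABLE)


def i8_to_ebcdic(i8: bytes, table: bytes = PLACEHOLDER_TABLE) -> bytes:
    """Map every byte of an I8 sequence through the lookup table."""
    return i8.translate(table)


def ebcdic_to_i8(data: bytes, inverse: bytes = PLACEHOLDER_INVERSE) -> bytes:
    """Undo the mapping; possible because the table is one-to-one."""
    return data.translate(inverse)


if __name__ == "__main__":
    print(i8_to_ebcdic(b"\x41").hex())                # c1
    print(ebcdic_to_i8(i8_to_ebcdic(b"\x41")).hex())  # 41
```

Because the mapping is applied byte by byte and is invertible, decoding is simply the inverse table followed by I8 decoding.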

UTF-EBCDIC is rarely used, even on the EBCDIC-based mainframes for which it was designed. IBM EBCDIC-based mainframe operating systems, such as z/OS, usually use UTF-16 for complete Unicode support. For example, IBM Db2, COBOL, PL/I, Java and the IBM XML toolkit support UTF-16 on IBM mainframes.

Codepage layout

There are 160 characters with single-byte encodings in UTF-EBCDIC (compared to 128 in UTF-8). The single-byte portion resembles IBM-1047 rather than IBM-37, as the location of the square brackets shows: CCSID 37 has [ and ] at hex BA and BB, whereas they are at hex AD and BD respectively here.

[Code page table: a 16 × 16 grid covering byte values 0x00 to 0xFF, followed by a four-entry legend.]

Oracle UTFE

Oracle UTFE is an Oracle database variation of Unicode 3.0 UTF-8, similar to the CESU-8 variant of UTF-8, in which each supplementary character is encoded as two 4-byte characters rather than as a single 4- or 5-byte character. It is used only on EBCDIC platforms.<ref name="Oracle_2002_DGSG">Template:Cite book</ref>
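The byte counts above can be checked with a short sketch. It assumes, as CESU-8 does, that a supplementary character is first split into a UTF-16 surrogate pair, and it assumes the I8 length ranges from Unicode Technical Report #16. The helper names `surrogate_pair` and `i8_length` are hypothetical, and nothing here is taken from Oracle's implementation.

```python
def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a supplementary code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def i8_length(value: int) -> int:
    """Number of I8 bytes for a value, using the ranges assumed from UTR #16."""
    ranges = ((0x9F, 1), (0x3FF, 2), (0x3FFF, 3), (0x3FFFF, 4), (0x10FFFF, 5))
    for limit, length in ranges:
        if 0 <= value <= limit:
            return length
    raise ValueError("value out of range")


if __name__ == "__main__":
    for cp in (0x1F600, 0x10FFFF):
        high, low = surrogate_pair(cp)
        paired = i8_length(high) + i8_length(low)
        print(f"U+{cp:04X}: direct {i8_length(cp)} bytes, "
              f"as surrogates {paired} bytes")
```

Every surrogate lies between U+D800 and U+DFFF, which falls in the assumed 4-byte range, so a pair always takes 8 bytes, while the direct form takes 4 or 5.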


References

Template:Reflist
