Extended Binary Coded Decimal Interchange Code (EBCDIC) is an 8-bit character encoding (code page) used on IBM mainframe operating systems such as z/OS, OS/390, VM and VSE, as well as IBM midrange computer operating systems such as OS/400 and i5/OS (see also Binary Coded Decimal). It is also employed on various non-IBM platforms such as Fujitsu-Siemens' BS2000/OSD, HP MPE/iX, and Unisys MCP. EBCDIC descended from the code used with punched cards and the corresponding six-bit binary-coded decimal code used with most of IBM's computer peripherals of the late 1950s and early 1960s.
History
EBCDIC (pronounced /ˈɛbsəˌdɪk/) was devised in 1963 and 1964 by IBM and was announced with the release of the IBM System/360 line of mainframe computers. It was created to extend the Binary-Coded Decimal encoding that existed at the time. It is an 8-bit character encoding, in contrast to, and developed separately from, the 7-bit ASCII encoding scheme.
While IBM was a chief proponent of the ASCII standardization committee, the company did not have time to prepare ASCII peripherals (such as card punch machines) to ship with its System/360 computers, so it settled on EBCDIC at the time. The System/360 became wildly successful, and thus so did EBCDIC.
All IBM mainframe peripherals and operating systems (except Linux on zSeries or iSeries) use EBCDIC as their native encoding, but software can translate to and from other encodings. Many hardware peripherals provide translation as well, and modern mainframes (such as IBM zSeries) include processor instructions, at the hardware level, to accelerate translation between character sets.
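As an illustration of software translation, the following minimal sketch converts between EBCDIC and ASCII by way of Unicode, using Python's built-in `cp500` codec (EBCDIC code page 500). The helper names `ebcdic_to_ascii` and `ascii_to_ebcdic` are hypothetical and chosen only for this example; the sketch says nothing about how mainframe software or hardware performs the translation.

```python
# Sketch: translating between EBCDIC (code page 500) and ASCII via Unicode.
# The function names below are illustrative, not part of any IBM API.

def ebcdic_to_ascii(data: bytes) -> bytes:
    """Decode EBCDIC (cp500) bytes to text, then re-encode as ASCII."""
    return data.decode("cp500").encode("ascii")


def ascii_to_ebcdic(data: bytes) -> bytes:
    """Decode ASCII bytes to text, then re-encode as EBCDIC (cp500)."""
    return data.decode("ascii").encode("cp500")


text = "HELLO, WORLD 123"
ebcdic_bytes = text.encode("cp500")  # same characters, EBCDIC byte values
ascii_bytes = text.encode("ascii")   # same characters, ASCII byte values

if __name__ == "__main__":
    print(ebcdic_bytes.hex(" "))
    print(ascii_bytes.hex(" "))
    print(ebcdic_to_ascii(ebcdic_bytes) == ascii_bytes)
```

The two byte strings represent the same text but share no byte values, which is why a translation step is needed whenever data crosses between EBCDIC and ASCII systems.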
At the time it was devised, EBCDIC made it relatively easy to enter data into a computer with punch cards. Since punch cards are no longer used on mainframes, EBCDIC is used in modern mainframes primarily for backwards compatibility. The punched-card code from which it derives does have the advantage of limiting the number of hole punches per column to two for uppercase letters and digits, which increases the durability of the cards as they are handled by a card reader. This punched-card encoding is also known as Hollerith code.
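The relationship between the card code and EBCDIC can be sketched in a few lines. The example below assumes the conventional Hollerith layout, in which a letter is punched as one zone row (12, 11 or 0) plus one digit row, and a digit as a single punch in its own row. It derives the punches from the two halves of the EBCDIC byte as encoded by Python's `cp500` codec. The function name `punches` and the `ZONE_ROW` table are hypothetical, and only uppercase letters and digits are handled.

```python
# Sketch: derive punched-card rows from an EBCDIC byte (uppercase and digits).
# Assumes the conventional Hollerith zone rows 12, 11 and 0.

# High nibble of the EBCDIC code -> zone row punched on the card.
ZONE_ROW = {0xC: 12, 0xD: 11, 0xE: 0}


def punches(ch: str) -> tuple:
    """Return the card rows punched for an uppercase letter or a digit."""
    code = ch.encode("cp500")[0]
    zone, digit = code >> 4, code & 0x0F
    if zone == 0xF and digit <= 9:
        return (digit,)                 # digits: a single punch
    if zone in ZONE_ROW and ch.isalpha():
        return (ZONE_ROW[zone], digit)  # letters: zone punch + digit punch
    raise ValueError(f"not an uppercase letter or digit: {ch!r}")


if __name__ == "__main__":
    for ch in "AIJRSZ07":
        print(ch, hex(ch.encode("cp500")[0]), punches(ch))
```

Under these assumptions no uppercase letter or digit needs more than two punches in a column.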
EBCDIC has no modern technical advantage over ASCII-based code pages such as the ISO-8859 series or Unicode. There are some technical niceties in each; for example, ASCII and EBCDIC both have a single bit that distinguishes upper case from lower case. But some aspects of EBCDIC make it much less pleasant to work with than ASCII, such as its non-contiguous alphabet. As with single-byte extended ASCII code pages, most EBCDIC code pages allow only up to two languages (English and one other language) to be used in a database or text file.
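Both properties mentioned above, the single case bit and the non-contiguous alphabet, can be checked with a short sketch. It uses Python's built-in `ascii` and `cp500` codecs; the helper names `case_bit`, `same_bit_for_all_letters` and `is_contiguous` are hypothetical and exist only for this illustration.

```python
# Sketch: compare the case bit and alphabet layout of ASCII and EBCDIC (cp500).
import string


def case_bit(encoding: str) -> int:
    """Return the bit mask that differs between 'a' and 'A' in an encoding."""
    return "a".encode(encoding)[0] ^ "A".encode(encoding)[0]


def same_bit_for_all_letters(encoding: str, bit: int) -> bool:
    """Check that every letter differs from its other case by exactly `bit`."""
    return all(
        (c.encode(encoding)[0] ^ c.upper().encode(encoding)[0]) == bit
        for c in string.ascii_lowercase
    )


def is_contiguous(encoding: str) -> bool:
    """Check whether 'A'..'Z' occupy 26 consecutive code values."""
    codes = [c.encode(encoding)[0] for c in string.ascii_uppercase]
    return codes == list(range(codes[0], codes[0] + 26))


if __name__ == "__main__":
    for enc in ("ascii", "cp500"):
        bit = case_bit(enc)
        print(enc, hex(bit), same_bit_for_all_letters(enc, bit), is_contiguous(enc))
```

In ASCII the case bit is 0x20 and is set for lowercase letters; in code page 500 it is 0x40 and is set for uppercase letters.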
Where true support for multilingual text is desired, a system supporting far more characters is needed. Generally this is done with some form of Unicode support. There is an EBCDIC Unicode Transformation Format called UTF-EBCDIC, proposed by the Unicode Consortium, but it is not intended to be used in open interchange environments and, even on EBCDIC-based systems, it is almost never used. IBM mainframes support UTF-16, but they do not support UTF-EBCDIC natively.
Arabic EBCDIC versions are typically in presentation order, that is, in the left-to-right order in which the text is displayed by an older mainframe or line printer, rather than in the right-to-left logical order used by modern encodings such as Unicode.
Codepage layout
The table below is derived from CCSID 500, one of the code page variants of EBCDIC, showing only the basic (English) EBCDIC characters. Characters 00–3F and FF are controls, 40 is space, 41 is no-break space (RSP: "Required Space"), E1 is numeric space (NSP: "Numeric Space"), and CA is soft hyphen. Characters are shown with their equivalent Unicode codes. Invariant alphanumeric, punctuation, and control characters common to all EBCDIC code pages are shown in color. Unassigned codes are typically filled with international or region-specific characters in the various EBCDIC code page variants.
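Some of these assignments can be inspected programmatically. The sketch below uses Python's built-in `cp500` codec as a stand-in for CCSID 500 and checks only the space, no-break space, soft hyphen and control ranges; it makes no claim about other positions, and the codec's table may differ from the chart described here in places. The helper name `decode_byte` is hypothetical.

```python
# Sketch: look up a few code points in Python's cp500 codec.
import unicodedata


def decode_byte(code: int) -> str:
    """Decode a single EBCDIC (cp500) byte value to its Unicode character."""
    return bytes([code]).decode("cp500")


space = decode_byte(0x40)
no_break_space = decode_byte(0x41)
soft_hyphen = decode_byte(0xCA)

# Bytes 00-3F and FF should all decode to Unicode control characters ("Cc").
controls = [decode_byte(c) for c in list(range(0x00, 0x40)) + [0xFF]]
all_controls = all(unicodedata.category(ch) == "Cc" for ch in controls)

if __name__ == "__main__":
    for name, ch in [("40", space), ("41", no_break_space), ("CA", soft_hyphen)]:
        print(name, f"U+{ord(ch):04X}", unicodedata.name(ch))
    print("00-3F and FF are controls:", all_controls)
```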
Criticism and humor
Open-source-software advocate and hacker Eric S. Raymond writes in his Jargon File that EBCDIC was almost universally loathed by early hackers and programmers because of its multitude of different versions, none of which resembled the other versions, and that IBM produced it in direct competition with the already-established ASCII.
The Jargon File 4.4.7 gives the following definition:
Another popular complaint is that the EBCDIC alphabetic characters follow an archaic punch card encoding rather than a linear ordering like ASCII. One consequence of this is that incrementing the character code for "I" does not produce the code for "J", and likewise there is a gap between the codes for "R" and "S". Thus programming a simple control loop to cycle through only the alphabetic characters is problematic.
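The gaps are easy to demonstrate. The following sketch walks every code value from "A" to "Z" in code page 500 (via Python's `cp500` codec) and shows that a naive loop over the range also visits non-letters, so a filter is needed. The variable names are illustrative only.

```python
# Sketch: a naive loop over the EBCDIC (cp500) range 'A'..'Z' hits non-letters.
import string

first = "A".encode("cp500")[0]  # 0xC1
last = "Z".encode("cp500")[0]   # 0xE9

# Every character whose code lies between 'A' and 'Z', gaps included.
naive = [bytes([code]).decode("cp500") for code in range(first, last + 1)]

# Keeping only genuine letters requires an explicit membership test.
letters = [ch for ch in naive if ch in string.ascii_uppercase]

# Code values on either side of the two gaps.
code_i, code_j = "I".encode("cp500")[0], "J".encode("cp500")[0]
code_r, code_s = "R".encode("cp500")[0], "S".encode("cp500")[0]

if __name__ == "__main__":
    print(len(naive), "codes in range,", len(letters), "of them letters")
    print(f"I={code_i:#x} J={code_j:#x} R={code_r:#x} S={code_s:#x}")
```

In ASCII the same range contains exactly 26 code values, so the loop needs no filter there.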
These incompatibilities were also the source of many jokes. A popular one went:
Professor: "So the American government went to IBM to come up with an encryption standard, and they came up with—"
Student: "EBCDIC!"
A reference to the EBCDIC character set is made in the classic Infocom adventure game Zork II. In the "Machine Room", there is a collection of ancient computers and other machines of uncertain purpose. The following is the description of the room, with EBCDIC used to imply an incomprehensible language: