The computer code page architectures of today are the direct descendants of earlier text processing technologies. Specifically, binary and numeric codes for individual letters have been used since the earliest days of the telegraph and teletype. Punched card technology, the immediate predecessor of modern electronic computers, also used binary and numeric codes.
In the 1950s, the American Standards Association (ASA) recognized the need for standardized text encoding for computing. Between 1963 and 1968, ASA introduced the American Standard Code for Information Interchange (ASCII), a 7-bit encoding scheme (128 values) that covered 33 control codes and 95 printable characters, including the space, for the English language. This encoding scheme was widely implemented in data communications and computer operations as 8-bit (single-byte) codes, with the eighth bit often used as a parity bit for error detection.
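The parity arrangement described above can be illustrated with a short sketch. It assumes even parity (the text does not say whether even or odd parity was used), and the function names `with_even_parity` and `parity_ok` are hypothetical, chosen only for this example.

```python
def with_even_parity(code: int) -> int:
    """Place a 7-bit ASCII code in a byte, using bit 7 as an even-parity bit."""
    if not 0 <= code < 128:
        raise ValueError("not a 7-bit ASCII code")
    # Assumption: even parity, so the total number of 1 bits must be even.
    parity = bin(code).count("1") % 2
    return code | (parity << 7)


def parity_ok(byte: int) -> bool:
    """Return True if the byte contains an even number of 1 bits."""
    return bin(byte).count("1") % 2 == 0


sent = with_even_parity(ord("C"))   # 'C' is 0x43 and has three 1 bits
print(hex(sent))                    # 0xc3: the parity bit is set
print(parity_ok(sent))              # True: the byte arrived intact
print(parity_ok(sent ^ 0x01))       # False: a single flipped bit is detected
print(chr(sent & 0x7F))             # C: masking bit 7 recovers the character
```

A single parity bit detects any odd number of flipped bits but cannot correct them, and it leaves no room for additional characters in the byte.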
Seven-bit ASCII, however, could not represent the additional national characters used in European languages, such as French accented letters or German umlauts. A number of standards organizations, including ISO (International Organization for Standardization), proposed ways to include these additional letters.
These proposals led to the creation of a set of 8-bit extended ASCII layouts, in which the first 128 values are the same as the original English ASCII, and the higher 128 values are used either for European national characters, or for letters in an unrelated script such as Greek, Cyrillic (Russian), Hebrew, Arabic, or Thai. One important member of the 8-bit set is Latin-1 (ISO 8859-1), commonly called ANSI after the American National Standards Institute. Its higher 128 values include most of the non-English national characters used in Western Europe, North America, and South America.
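The shared lower half and divergent upper half can be seen by decoding the same two bytes under several 8-bit layouts. This is a minimal sketch using Python's built-in codecs; the three ISO 8859 code pages are picked here purely for illustration and are not named in the text above, apart from Latin-1.

```python
# Two bytes: 0x41 lies in the shared ASCII range, 0xE9 in the upper 128 values.
raw = bytes([0x41, 0xE9])

# Codec names are Python's; the choice of these three pages is an assumption.
layouts = {
    "latin_1": "Western European (Latin-1)",
    "iso8859_7": "Greek",
    "iso8859_5": "Cyrillic",
}

decoded = {}
for codec, label in layouts.items():
    text = raw.decode(codec)
    decoded[codec] = text
    # The first character is always 'A'; the second depends on the layout.
    print(f"{label:28} {text}  (U+{ord(text[0]):04X}, U+{ord(text[1]):04X})")
```

The byte 0x41 decodes to the letter A every time, while 0xE9 becomes an accented Latin letter, a Greek letter, or a Cyrillic letter depending on which layout is assumed. This is why text in these encodings is only readable when the code page is known.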
During the decades when these encoding schemes were developed, they were simultaneously implemented as code pages to handle the related problem of text processing in electronic computers. At the time, computing was divided between IBM and "everybody else."
Beginning in 1964, IBM implemented a set of 57 proprietary Extended Binary Coded Decimal Interchange Code (EBCDIC) code pages based on its own earlier text processing technologies. These single-byte layouts were completely different from ASCII, but often included most of the same symbols. "Everybody else" (the other computer vendors) used ASCII, although implementations differed from vendor to vendor.
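To make the difference between the two layouts concrete, the sketch below encodes the same text in ASCII and in one EBCDIC code page. It assumes Python's built-in `cp037` codec (an EBCDIC page for US/Canada English) as a representative; the text above does not single out any particular EBCDIC page.

```python
text = "ABC 123"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")  # assumption: cp037 as a sample EBCDIC page

print(ascii_bytes.hex(" "))   # 41 42 43 20 31 32 33
print(ebcdic_bytes.hex(" "))  # c1 c2 c3 40 f1 f2 f3

# Same symbols, different byte values: decoding with the wrong layout fails
# or produces unrelated characters.
try:
    ebcdic_bytes.decode("ascii")
    readable_as_ascii = True
except UnicodeDecodeError:
    readable_as_ascii = False
print(readable_as_ascii)      # False

# Round-tripping through the correct code page restores the original text.
print(ebcdic_bytes.decode("cp037") == text)  # True
```

Both encodings use one byte per character and cover the same letters and digits here, but no byte value is shared, so data moved between the two families has to be converted explicitly.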
Today, we think of code pages as being ASCII, ANSI, or EBCDIC, with ASCII and ANSI closely related because they have the same values for the English letters. However, each of these code page sets is actually a family of related code pages that handle languages and scripts other than English.