ASCII

FIELDS OF STUDY

Computer Science; Computer Engineering

ABSTRACT

The American Standard Code for Information Interchange (ASCII) is a character encoding system. It enables computers and other electronic communication devices to store, process, transmit, print, and display text and other graphic characters. Initially published in 1963, ASCII formed the basis for several other character encoding systems developed for use with PCs and the Internet.

PRINCIPAL TERMS

UNDERSTANDING CHARACTER ENCODING

Written language, or text, is composed of a variety of graphic symbols called characters. In many languages, these characters include letters, numbers, punctuation marks, and blank spaces. Such characters are also called printable characters because they can be printed or otherwise displayed in a form that can be read by humans. Another type of character is a control character. Control characters affect the processing of other characters. For example, a control character might instruct a printer to print the next character on a new line. Character encoding is the process of converting characters into a format that can be used by an electronic device such as a computer or telegraph.
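
For illustration, the following minimal Python sketch (an assumed example, not part of any standard) separates the printable characters in a short string from a control character, the newline:

    # The string mixes printable characters with "\n", a control
    # character that moves output to a new line.
    text = "Hi!\nBye"

    for ch in text:
        if ch.isprintable():
            print(repr(ch), "is printable")
        else:
            print(repr(ch), "is a control character")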

Originally designed for use with Samuel Morse's telegraph system, Morse code was one of the first character encoding schemes adopted for widespread use. Telegraphs transmit information by sending electronic pulses over telegraph wires. Morse code assigns each character a unique combination of short and long pulses. For example, the letter A was assigned one short pulse followed by one long pulse, while the letter T was assigned a single long pulse. Using Morse code, a telegraph operator can send messages by transmitting a sequence of pulses. The sequence, or string, of pulses represents the characters that make up the message text.
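
The scheme can be sketched in a few lines of Python. The mapping below is a small, illustrative subset of the full Morse table, with "." standing for a short pulse and "-" for a long one; the helper name to_morse is hypothetical:

    # Illustrative subset of Morse code: "." = short pulse, "-" = long pulse.
    MORSE = {"A": ".-", "E": ".", "N": "-.", "T": "-"}

    def to_morse(message):
        # Encode each character as its pulse pattern, separated by spaces.
        return " ".join(MORSE[ch] for ch in message.upper())

    print(to_morse("ant"))  # .- -. -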

Other character encoding systems were created to meet the needs of new types of electronic devices including teleprinters and computers. By the early 1960s, the use of character encoding systems had become widespread. However, no standard character encoding system existed to ensure that systems from different manufacturers could communicate with each other. In fact, by 1963, over sixty different encoding systems were in use. Nine different systems were used by IBM alone. To address this issue, the American Standards Association (ASA) X3.4 Committee developed a standardized character encoding scheme called ASCII.

UNDERSTANDING THE ASCII STANDARD

The ASCII standard is based on English. It encodes 128 characters as integer values from 0 to 127. Thirty-three of the characters are control characters, and ninety-five are printable characters, including the upper- and lowercase letters A through Z, the digits zero through nine, punctuation marks, and the blank space. For example, the letter A is encoded as 65 and a comma as 44.
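
In Python, for example, the built-in ord and chr functions expose this mapping directly (they operate on Unicode code points, which coincide with ASCII for values 0 through 127):

    # ord maps a character to its integer code; chr is the inverse.
    print(ord("A"))  # 65
    print(ord(","))  # 44
    print(chr(65))   # A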

The encoded integers are then converted to bits, the smallest unit of data that can be stored by a computer system. A single bit can have a value of either zero or one. In order to store integers larger than one, additional bits must be used. The number of bits used to store a value is called the bit width. ASCII specifies a bit width of seven. For example, in ASCII, the integer value 65 is stored using seven bits, which can be represented as the bit string 1000001.
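
A brief Python sketch of this conversion, zero-padding each code to the seven-bit width that ASCII specifies:

    # Convert the code for "A" to a seven-bit binary string.
    code = ord("A")              # 65
    bits = format(code, "07b")   # zero-pad to a width of seven
    print(bits)                  # 1000001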




This chart presents the decimal and hexadecimal ASCII codes for common characters on a keyboard. (Public domain, via Wikimedia Commons)

SAMPLE PROBLEM

ASCII defines the integer values for the first eleven lowercase letters of the alphabet as follows:

    Character:    a    b    c    d    e    f    g    h    i    j    k
    ASCII value:  97   98   99   100  101  102  103  104  105  106  107

Using this information, translate the word hijack to the correct ASCII integer values.

Answer:

The ASCII representation of the word hijack can be determined by matching each character in the word to its defined decimal value as follows:

    h -> 104
    i -> 105
    j -> 106
    a -> 97
    c -> 99
    k -> 107

The correct ASCII encoding for the word hijack is 104 105 106 97 99 107.
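
The result can be checked in Python, again using ord, whose values agree with ASCII in this range:

    # Look up the integer code for each character of the word.
    print([ord(ch) for ch in "hijack"])  # [104, 105, 106, 97, 99, 107]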

The ASCII seven-bit integer values for specific characters were not randomly assigned. Rather, they were selected to maximize the Hamming distance between values. Hamming distance is the number of bit positions at which two bit strings of equal length differ. For example, the bit strings 0000001 (decimal value 1) and 0000011 (decimal value 3) have a Hamming distance of one, as only the second-to-last bit differs between the two strings. The bit strings 0000111 (decimal value 7) and 0000001 (decimal value 1) have a Hamming distance of two, as both the second-to-last and third-to-last bits differ. ASCII was designed with large Hamming distances in mind because they enable more efficient data processing as well as improved error detection and handling.
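
Hamming distance can be computed by counting the positions at which two equal-length bit strings differ, as in this short Python sketch (the function name hamming_distance is a hypothetical helper):

    def hamming_distance(a, b):
        # Count the positions where two equal-length bit strings differ.
        if len(a) != len(b):
            raise ValueError("bit strings must have equal length")
        return sum(x != y for x, y in zip(a, b))

    print(hamming_distance("0000001", "0000011"))  # 1
    print(hamming_distance("0000111", "0000001"))  # 2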

—Maura Valentino, MSLIS
