Coding and Encryption

ABSTRACT

Mathematical algorithms are used in modern encryption and decryption.

OVERVIEW

Human beings have a propensity to preserve and share secret information. Cryptography, from the Greek kryptos (hidden) and graphein (to write), is the art and science of coding and decoding messages containing secret information. Encryption is the algorithmic process that converts plain-text into cipher-text (looks like a collection of unintelligible symbols), while decryption is the reverse process that converts the cipher-text back to the original plain-text. A cipher algorithm and its associated key control both directions of the sequence, with the code's security level directly related to the algorithm's complexity. The two fundamental types of cryptography are symmetric (or secret keys) or asymmetric (or public-key), with multiple variations. Claude Shannon, an American mathematician and electronic engineer, is known as the father of information theory and cryptography. Some claim that his master's thesis, which demonstrates that electrical applications of Boolean algebra can construct and resolve any logical numerical relationship, is the most important master's thesis of all time.

Around 2000 bce, Egyptian scribes included nonstandard hieroglyphs in carved inscriptions. During war campaigns, Julius Caesar sent coded information to Roman generals. Paul Revere's signal from a Boston bell tower in 1775 is even a simple example of a coded message. Success of the Allies in both World Wars depended on their breaking of the German's Enigma code. With the world-wide need for more sophisticated coding algorithms to transmit secure messages for military forces, businesses, and governments, people began capitalizing on the combined powers of mathematics, computer technology, and engineering.

The simplest examples of ciphers involve either transpositions or substitutions. In 450 bce, the Spartans used transposition ciphers when they wound a narrow belt spirally around a thick staff and wrote a plain-text (or message) along the length of the rod. Once unwound, the belt appeared to be a meaningless sequence of symbols. To decipher the cipher-text, the receiver wound the belt around a similar staff. Variations of transposition ciphers are the route cipher and the Cardan grill.

Julius Caesar used substitution ciphers, where each letter of the plain-text is replaced by some other letter or symbol, using a substitution dictionary. For example, suppose:

Original Alphabet:




Coding and Encryption

Key Dictionary:




Coding and Encryption




A man writing codes on his laptop





A man writing codes on his laptop

where the key dictionary is made by starting with “code” letter K and then writing the alphabet as if on a loop. To encode the plain-text, “The World Is Round,” each letter is substituted by its companion letter, producing the cipher-text “CRO FXAUN SB AXDWN.” To disguise word lengths and to add complexity, the cipher-text was sometimes blocked into fixed-length groups of letters such as “CROF XAUN SBAX DWN.” To decipher the cipher-text, one needed to know only the “code” letter. Though simple and initially confusing, substitution ciphers now are easily broken using frequency patterns of letters and words. Variations of the substitution cipher involve the suppression of letter frequencies, syllabic substitutions, or polyalphabetic substitutions such as the Vigenère or Beaufort ciphers.

The Playfair Square cipher used by Great Britain in World War I is a substitution cipher, but its encryption of letter pairs in place of single letters is more powerful yet easy to use. The cipher-key is a 5 × 5 table initiated by a key word, such as “mathematics.”

Exchange Table Model




Coding and Encryption

The table is built by moving left to right and from top to bottom (or other visual pattern as in a spiral) by first filling in the table's cells with the keyword's letters—avoiding duplicate letters. Then, the subsequent cells are filled with the remaining letters of the alphabet, using the “I” to represent the “J” to reduce the alphabet to 25 letters (instead of 26). Both the coder and the decoder need to know the both the keyword and the conventions used to construct the common cipher-key.

The coder first breaks the plain-text into two-letter pairs and uses the cipher-key via a system of rules:

If double letters occur in the plain-text, insert an X between them.

Rewrite the plain-text as a sequence of two-letter pairs, using an X as a final filler for last letter-pair.

If the two letters lie in the same row, replace each letter by the letter to its right (for example, CS becomes SB).

If the two letters lie in the same column, replace each letter by the letter below it (TS becomes SK and PW becomes WA).

If the two letters lie at corners of a rectangle embedded in the square, replace them by their counterpart in the same rectangle (TB becomes SH and CR becomes PB).

Using this cipher-key, the plain-text “The World Is Round” becomes first




Coding and Encryption

which when encoded, becomes




Coding and Encryption

The same cipher-key is used to decode this message, but the rules are interpreted in reverse. It is quite difficult to decode this cipher-text without access to both the keyword and the conventions to construct the common cipher-key, though very possible.

The problem with all substitution and transposition encryption systems is their dependence on shared secrecy between the coders and the intended decoders. To transmit plain-text via cipher-text and then decode it back to public-text successfully, both parties would have to know and use common systems, common key-words, and common visual arrangements. In turn, privacy is required, since these systems are of no value if the user learns the key-word or is able to use frequency techniques of word/letter patterns to break the code. A more complicated and secure encryption process was needed, but it was not invented until the 1970s.

The revolutionary idea in encryption was the idea of a public key system, where the encryption key is known by everyone (that is, the public). However, the twist was that this knowledge was not useful in figuring out the decryption key, which was not made public. The RSA public-key cipher, invented in 1977 by Ronald Rivest, Adi Shamir, and Leonard Adleman (“RSA” stands for the names of the inventors), all of whom have bachelor's degrees in mathematics and advanced degrees in computer science, is still used today thanks to powerful mathematics and powerful computer systems.

In a RSA system, the “receiver” of the intended message is the driver of the process. In lieu of the “sender,” the receiver chooses both the encryption key and the matching decryption key. In fact, the “receiver” can make the encryption key public in a directory so any “sender” can use it to send secure messages, which only the “receiver” knows how to decrypt. Again, the latter decryption process is not even known by the “sender.”

Because the problem is quite complex and uses both congruence relationships and modular arithmetic, only a sense of the process can be described as follows:

As the “receiver,” start with the product n equal to two very large prime numbers p and q.

Choose a number e relatively prime to (p-1)(q-1).

The published encryption key is the pair (n,e).

Change plain-text letters to equivalent number forms using a conversion such as A=2, B=3, C=4,… , Z=27.

Using the published encryption key, the “sender” encrypts each number z using the formula mz

e mod(n), with the new number sequence being the cipher-text.

To decode the text, the “receiver” not only knows both e and the factors of n but also the large primes p and q as prime factors of n.

Then, the decryption key d is private but can be computed by the “receiver” using an inverse relationship ed ≡ 1 mod(p-1)(q-1),which allows the decoding of the encrypted number into a set of numbers that can be converted back into the plain-text.

The RSA public key system works well, but the required primes p and q have to be very large and often involve more than 300 digits. If they are not large, powerful computers can determine the decryption key d from the given encryption key (n,e) by factoring the number n. This decryption is possible because of the fact that, while computers can easily multiply large numbers, it is much more difficult to factor large numbers on a computer.

Regardless of its type, a cryptographic system must meet multiple characteristics. First, it must reflect the user's abilities and physical context, avoiding extreme complexity and extraneous physical apparatus. Second, it must include some form of error checking, so that small errors in composition or transmission do not render the message into meaningless gibberish. Third, it must ensure that the decoder of the cipher-text will produce a single, meaningful plain-text. There are many mathematicians working for government agencies like the National Security Agency (NSA), as well as for private companies that are developing improved security for storage and transmission of digital information using these principles. In fact, the NSA is the largest employer of mathematicians in the United States.

—Jerry Johnson

Churchhouse, Robert. Codes and Ciphers. Cambridge, England: Cambridge University Press, 2002.

Kahn, David. The Codebreakers: The Story of Secret Writing. New York: Macmillan, 1967.

Lewand, Robert. Cryptological Mathematics. Washington, DC: Mathematical Association of America, 2000.

Smith, Laurence. Cryptography: The Science of Secret Writing. New York: Dover Publications, 1971.