
Deoxyribonucleic Acid (DNA)

The Genetic Code

Genetic information is stored as nucleotide sequences in DNA (or RNA) molecules. The sequence specifies the identity and position of each amino acid in a particular protein. Amino acids are the building blocks of proteins in the same way that nucleotides are the building blocks of DNA. However, although there are only four possible bases in DNA (or RNA), there are 20 possible amino acids in proteins. The genetic code is a sort of "bilingual dictionary" that translates the language of DNA into the language of proteins. The letters of the genetic code are the four bases A, C, G, and T (or U instead of T in RNA). Single bases cannot code for 20 amino acids, because there are only four of them. A sequence of two bases is also insufficient, because it yields only 4 × 4 = 16 combinations, enough for 16 of the 20 amino acids in proteins. A sequence of three bases is therefore the minimum needed to provide enough combinations, or "words," to code for all 20 amino acids. These three-letter words are called codons, and because every codon consists of three letters, the genetic code is often referred to as the triplet code.
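The counting argument can be checked with a few lines of Python. This is only an illustrative sketch; the names BASES, AMINO_ACIDS, and count_words are assumptions introduced here, not part of the original text.

```python
from itertools import product

BASES = "ACGT"      # the four DNA bases (U replaces T in RNA)
AMINO_ACIDS = 20    # number of amino acids found in proteins


def count_words(length: int) -> int:
    """Count the distinct base sequences ("words") of the given length."""
    return sum(1 for _ in product(BASES, repeat=length))


for n in (1, 2, 3):
    words = count_words(n)
    verdict = "enough" if words >= AMINO_ACIDS else "not enough"
    print(f"{n}-base words: {words} ({verdict} for {AMINO_ACIDS} amino acids)")
```

Enumerating the words rather than computing 4 ** n directly makes it explicit that each "word" is one concrete ordering of bases.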

With the exception of three stop codons, each codon specifies a particular amino acid. Because there are 64 possible codons (4³ = 64 different three-letter "words" can be formed from a four-letter "alphabet") and only 20 amino acids, most amino acids are specified by more than one codon, so the genetic code is said to be degenerate. However, the code is unambiguous because each codon specifies only one amino acid. The sequence of codons is not interrupted by "commas" and is always read in the same frame of reference, starting with the same base every time, so the "words" never overlap.
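Degeneracy and unambiguity can both be seen by building the codon table as a dictionary. The sketch below assumes the standard genetic code written in the common compact form (bases in the order T, C, A, G, one-letter amino acid symbols, and * for a stop codon); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code for codons in the order TTT, TTC, TTA, TTG, TCT, ...
# One-letter amino acid symbols; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Unambiguous: a dictionary maps each codon to exactly one meaning.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Degenerate: several codons can share the same amino acid.
codons_per_amino_acid = Counter(CODON_TABLE.values())

for symbol, count in sorted(codons_per_amino_acid.items()):
    label = "stop" if symbol == "*" else symbol
    print(f"{label}: {count} codon(s)")
```

In this table leucine (L), serine (S), and arginine (R) each have six codons, while methionine (M) and tryptophan (W) have only one.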

In cells that have a nucleus, DNA does not leave the nucleus, so the information it stores cannot be used directly at the sites where proteins are made. Instead, a DNA sequence must first be copied into a messenger RNA molecule, which carries the genetic information from the nucleus to protein assembly sites in the cytoplasm. There it serves as the template for protein construction. The nucleotide triplets in messenger RNA are also referred to as codons.
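As a rough illustration of how a DNA message becomes a messenger RNA message, the sketch below rewrites a DNA sequence in the RNA alphabet and splits it into codons. It makes a simplifying assumption: the input is the coding strand, so the messenger RNA has the same sequence with U in place of T. In the cell the messenger RNA is built against the complementary template strand. The function names and the example sequence are hypothetical.

```python
from typing import List


def transcribe(coding_dna: str) -> str:
    """Return the mRNA sequence for a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def split_codons(mrna: str) -> List[str]:
    """Split an mRNA sequence into complete three-base codons."""
    usable = len(mrna) - len(mrna) % 3   # ignore any incomplete trailing triplet
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


mrna = transcribe("ATGTGGTAA")
print(mrna)                # AUGUGGUAA
print(split_codons(mrna))  # ['AUG', 'UGG', 'UAA']
```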

Four codons serve special functions. Three are stop codons that signal the end of protein synthesis. The fourth is a start codon that establishes the "reading frame" in which the message is to be read. For example, suppose the message is PAT SAW THE FAT RAT. If we shift the reading frame by one "nucleotide," we obtain ATS AWT HEF ATR AT, which is meaningless.
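The reading-frame example can be reproduced directly. In this small sketch the helper name read_frame and its parameters are illustrative; it simply removes the spaces, skips a given number of letters, and reads the rest three letters at a time.

```python
def read_frame(message, offset=0, size=3):
    """Read a message in fixed-size words, starting `offset` letters in."""
    letters = message.replace(" ", "")[offset:]
    return [letters[i:i + size] for i in range(0, len(letters), size)]


message = "PAT SAW THE FAT RAT"
print(" ".join(read_frame(message, offset=0)))  # PAT SAW THE FAT RAT
print(" ".join(read_frame(message, offset=1)))  # ATS AWT HEF ATR AT
```

The letters themselves never change between the two readings; only the starting point does, which is why a fixed start signal is needed.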

The genetic code is essentially universal. This means that a codon which specifies the amino acid tryptophan in bacteria also codes for it in humans. Exceptions occur in mitochondria and chloroplasts and in some protozoa. (Mitochondria and chloroplasts are subcellular compartments that contain some DNA of their own; mitochondria are the sites of respiration, and chloroplasts are the sites of photosynthesis in plants.) The structure of the genetic code has evolved to minimize the effect of mutations. Changes in the third base of a codon often do not change the amino acid specified during protein synthesis. Furthermore, changes in the first base of a codon generally result in the same or at least a similar amino acid. Studies of mutations have shown that the resulting amino acid changes are consistent with the base changes expected in the corresponding codons. These studies have confirmed that the genetic code has been deduced correctly by demonstrating its relevance in actual living organisms.
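The claim about the third base can be explored by trying every single-base change in every codon and counting how often the meaning stays the same. The sketch below assumes the standard genetic code in compact form (* marks a stop codon) and uses an illustrative helper, synonymous_fraction. It counts only changes that leave the amino acid identical, so it does not measure how similar a substituted amino acid is.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code for codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_fraction(position: int) -> float:
    """Fraction of single-base changes at `position` (0, 1, or 2) that
    leave the encoded amino acid (or stop signal) unchanged."""
    same = total = 0
    for codon, meaning in CODON_TABLE.items():
        for base in BASES:
            if base == codon[position]:
                continue  # not a mutation
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == meaning
    return same / total


for position, name in enumerate(("first", "second", "third")):
    print(f"{name} base: {synonymous_fraction(position):.1%} of changes are silent")
```

Under these assumptions, changes at the third base are silent far more often than changes at the first or second base.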
