What Proteins Do, Protein Structure, Designer Proteins

Proteins are linear chains of amino acids connected by chemical bonds between the carboxyl group of each amino acid and the amine group of the one following. These bonds are called peptide bonds, and chains of only a few amino acids are referred to as polypeptides rather than proteins. Different authorities set the protein/polypeptide dividing line at anywhere from 10 to 100 amino acids.

Many proteins have components other than amino acids. For example, some may have sugar molecules chemically attached. Exactly which types of sugars are involved and where on the protein chain attachment occurs will vary with the specific protein. In a few cases, it may also vary between different people. The A, B, and O blood types, for example, differ in precisely which types of sugar are or are not added to a specific protein on the surface of red blood cells.

Other proteins may have fat-like (lipid) molecules chemically bonded to them. These sugar and lipid molecules are not part of the amino acid chain itself; they are generally attached during or after its synthesis. As a result, discussions of protein structure and synthesis, including this one, may virtually ignore them. Nevertheless, such molecules can significantly affect the protein's properties.

Many other types of molecules may also be associated with proteins. Some proteins, for example, have specific metal ions associated with them. Others carry small molecules that are essential to their activity. Still others associate with nucleic acids in chromosomal or ribosomal structures.

Primary structure: peptide-chain synthesis

Proteins are made (synthesized) in living things according to "directions" given by DNA and carried out by RNA and proteins. The synthesized protein's linear sequence of amino acids is ultimately determined by the linear sequence of DNA bases—or of base triplets known as codons—in the gene that codes for it. Each cell possesses elaborate machinery for producing proteins from these blueprints.

The first step is copying the DNA blueprint, essentially fixed within the cell nucleus, into a more mobile form. This form is messenger ribonucleic acid (mRNA), a single-stranded nucleic acid carrying essentially the same sequence of bases as the DNA gene. The mRNA is free to move into the main part of the cell, the cytoplasm, where protein synthesis takes place.

Besides mRNA, protein synthesis requires ribosomes and transfer ribonucleic acid (tRNA). Ribosomes are the actual "factories" where synthesis takes place, while tRNA molecules are the "trucks" that bring amino acids to the ribosome and ensure that they are incorporated at the right spot in the growing chain.

Ribosomes are extremely complex assemblages. They comprise almost 70 different proteins and at least three different types of RNA, all organized into two different-sized subunits. As protein synthesis begins, the previously separate subunits come together at the beginning of the mRNA chain; all three components (the two subunits and the mRNA) are essential for the synthetic process.

Transfer RNA molecules are rather small, only about 80 nucleotides long. (Nucleotides are the fundamental building blocks of nucleic acids, as amino acids are of proteins.) Each type of amino acid has at least one corresponding type of tRNA (sometimes more). This correspondence is enforced by the enzymes that attach amino acids to tRNA molecules, which "recognize" both the amino acid and the tRNA type and do not act unless both are correct.

Transfer RNA molecules are not only trucks but translators. As the synthetic process adds one amino acid after another, they "read" the mRNA to determine which amino acid belongs next. They then bring the proper amino acid to the spot where synthesis is taking place, and the ribosome couples it to the growing chain. The tRNA is released, and the ribosome moves along the mRNA to the next codon, that is, the next base triplet specifying an amino acid. The process repeats until the "stop" signal on the mRNA is reached. At that point the ribosome releases both the mRNA and the completed protein chain, and its subunits separate to seek out other mRNAs.
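The codon-by-codon reading described above can be sketched in a few lines of Python. This is only an illustration, not a model of the real machinery: the dictionary stands in for the tRNA "translators," it contains just a handful of entries from the standard genetic code, and the function name `translate` and the example sequence are made up for the sketch. It also assumes that reading starts at the first AUG codon, a detail the text above does not cover.

```python
# Partial codon table from the standard genetic code (only the codons used here).
# The table plays the role of the tRNAs: it maps each mRNA triplet to an amino acid.
CODON_TABLE = {
    "AUG": "Met",  # also serves as the usual start signal (assumption added for this sketch)
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna):
    """Read mRNA three bases at a time, from the first AUG to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start signal, so nothing is synthesized
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        residue = CODON_TABLE[codon]  # raises KeyError for codons missing from this partial table
        if residue == "Stop":
            break  # the "stop" signal ends the chain
        chain.append(residue)
    return chain


# Hypothetical message: two leading bases, a start codon, five codons, a stop codon.
example_mrna = "GGAUGUUUGGCGCAAAAUGGUAAGC"
print("-".join(translate(example_mrna)))
```

Note that several codons map to the same amino acid, which is one reason an amino acid can have more than one corresponding type of tRNA.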

Secondary structure

The two major types of secondary structure are the alpha helix and the beta sheet, both discovered by Linus Pauling and R. B. Corey in 1951. (Pauling received the first of his two Nobel Prizes in part for this work.) Many scientists consider a structure known as the beta turn part of secondary structure, even though the older techniques used to identify alpha helices and beta sheets cannot detect it. For completeness, some authorities also list random coil (the absence of any regular, periodic structure) as a type of secondary structure.

ALPHA HELIX. In an alpha helix, the backbone atoms of the peptide chain (the carboxyl carbon atom, the amino nitrogen atom, and the α-carbon atom to which the side chain is attached) take the form of a three-dimensional spiral. The helix is held together by hydrogen bonds between each nitrogen atom and the oxygen atom of the carboxyl group belonging to the fourth amino acid up the chain. This arrangement requires each turn of the helix to encompass 3.6 amino acids and forces the side chains to stick out from the central helical core like bristles on a brush.
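The geometry in this paragraph reduces to simple arithmetic: 3.6 amino acids per turn means each amino acid is rotated 100 degrees around the helix axis from the one before it, and each hydrogen bond links positions four apart. The following Python sketch works that out; the function names and the 1-based numbering are illustrative choices, not standard notation.

```python
RESIDUES_PER_TURN = 3.6  # figure given in the text
DEGREES_PER_RESIDUE = 360 / RESIDUES_PER_TURN  # rotation from one amino acid to the next


def side_chain_angles(n_residues):
    """Angular position (degrees) of each side chain around the helix axis."""
    return [round((i * DEGREES_PER_RESIDUE) % 360, 1) for i in range(n_residues)]


def helix_hydrogen_bonds(n_residues):
    """Pairs (i, i + 4): carboxyl oxygen of amino acid i bonded to the nitrogen of i + 4.

    Positions are numbered from 1 for readability.
    """
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(side_chain_angles(5))      # successive side chains point in different directions
print(helix_hydrogen_bonds(8))   # an 8-residue helix has 4 such bonds
```

Because 100 degrees does not divide evenly into a full circle, successive side chains never line up directly above one another within a few turns, which is what produces the bristle-brush appearance.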

Since amino acids at the end of an alpha helix cannot form these regular hydrogen bonds, the helix tends to become more stable as it becomes longer, that is, as the proportion of unbonded "end" amino acids becomes smaller. However, recent research suggests that most alpha helices end with specific "capping" sequences of amino acids. These sequences provide alternative hydrogen-bonding opportunities to replace those unavailable within the helix itself.
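The length argument can be made concrete with a rough count. The sketch below assumes an idealized, isolated helix in which every amino acid offers one backbone nitrogen and one carboxyl oxygen, and in which the only hydrogen bonds are the regular ones between positions four apart; it ignores capping sequences and any amino acid that cannot donate a hydrogen bond. The function name is hypothetical.

```python
def unbonded_fraction(n_residues):
    """Fraction of backbone nitrogen and oxygen groups left without a helical hydrogen bond."""
    if n_residues < 1:
        raise ValueError("a helix needs at least one amino acid")
    total_groups = 2 * n_residues       # one nitrogen and one carboxyl oxygen per amino acid
    bonds = max(n_residues - 4, 0)      # each bond joins positions i and i + 4
    unbonded = total_groups - 2 * bonds  # every bond satisfies two groups
    return unbonded / total_groups


for length in (4, 8, 20, 40):
    print(length, unbonded_fraction(length))
```

Under these assumptions a helix longer than four amino acids always has eight unsatisfied groups (four at each end), so their share of the total falls steadily as the helix grows.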

BETA SHEET. Beta sheets feature several peptide chains lying next to each other in the same plane. The stabilizing hydrogen bonds are between nitrogen atoms on one chain and carboxyl-group oxygen atoms on the adjacent chain. Since each amino acid has its amino group hydrogen-bonded to the chain on one side and its carboxyl group to the chain on the other side, sheets can grow indefinitely. Indeed, as with alpha helices, the sheet becomes more stable as it grows larger.

The backbone chains in a beta sheet can all run in the same direction (parallel beta sheet) or alternate chains can run in opposite directions (antiparallel beta sheet). There is no significant difference in stability between the types, and some real-world beta sheets mix the two. In each case, side chains of alternate amino acids stick out from alternate sides of the sheet. The side chains of adjacent backbone chains are aligned, however, creating something of an accordion-fold effect.

BETA TURN. Many antiparallel beta sheets are formed by a single peptide chain continually looping back on itself. The loop between the two hydrogen-bonded segments, known as a beta turn, consistently contains one to three (usually two) amino acids. The amino acids in a beta turn do not form hydrogen bonds, but other interactions may stabilize their positions. A further consistency is that, from a perspective where the side chain of the final hydrogen-bonded amino acid projects outward toward the viewer, the turn is always to the right.

Tertiary structure and protein folding

Within seconds to minutes of their synthesis on ribosomes, proteins fold up into an essentially compact three-dimensional shape: their tertiary structure. Ordinary chemical forces fully determine both the steps in the folding pathway and the stability of the final shape. Some of these forces are hydrogen bonds between side chains of specific amino acids. Others involve electrical attraction between positively and negatively charged side chains. Perhaps most important, however, are what are called hydrophobic interactions, a scientific restatement of the observation that oil and water do not mix.

Some amino acid side chains are essentially oil-like (hydrophobic; literally, "water-fearing"). They accordingly stabilize tertiary structures that place them in the interior, largely surrounded by other oil-like side chains. Conversely, some side chains are charged or can form hydrogen bonds. These are hydrophilic, or "water-loving," side chains. Unless they form hydrogen or electrostatic bonds with other specific side chains, they will stabilize structures where they are on the exterior, interacting with water.
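One common way to put numbers on "oil-like" versus "water-loving" is a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which this article does not itself mention, and applies a deliberately crude rule: positive scores are labeled "interior" and the rest "exterior." Real proteins are far less tidy, and the function names and the short example sequence are invented for illustration.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Higher values mean a more oil-like (hydrophobic) side chain.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def likely_location(sequence):
    """Label each amino acid 'interior' or 'exterior' by the sign of its score (crude rule)."""
    return [
        (aa, "interior" if KYTE_DOOLITTLE[aa] > 0 else "exterior")
        for aa in sequence.upper()
    ]


def mean_hydropathy(sequence):
    """Average hydropathy of a sequence; positive values indicate an oil-like stretch."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return sum(scores) / len(scores)


# Hypothetical four-residue stretch: leucine, lysine, valine, aspartate.
print(likely_location("LKVD"))
print(mean_hydropathy("LKVD"))
```

A scale like this captures only the tendency described above; it says nothing about the specific hydrogen or electrostatic bonds that can hold a hydrophilic side chain in the interior.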

The forces that govern a protein's tertiary structure are simple. With thousands or even tens of thousands of atoms involved, however, the interactions can be extremely complex. Today's scientists are only beginning to discover ways to predict the shape a protein will assume and the folding process it will go through to reach that shape.

Recent studies show that folding proceeds through a series of intermediate steps. Some of these steps may involve substructures not preserved in the final shape. Furthermore, the folding pathway is not necessarily the same for all molecules of a given protein. Individual molecules may pass through any of several alternative intermediates, all of which ultimately collapse to the same final structure.

The stability of a three-dimensional structure is not closely related to the speed with which it forms. Indeed, speed rather than stability is the main reason that egg white can never be "uncooked." At room temperature or below, the most stable form of the major egg white protein is compact and soluble. At boiling-water temperatures, the most stable form is an extended chain. When the cooked egg is cooled, however, the proteins do not have time to return to their normal compact structures. Instead, they collapse into an aggregated, tangled mass. And although this tangled mass is inherently less stable than the protein structures in the uncooked egg white, it would take millions of years—effectively forever—for the chains to untangle themselves and return to their soluble states. In scientific terminology, the cooked egg white is said to be metastable.

Something very similar could happen in the living cell. That it rarely does so reflects eons of evolution: selection has eliminated protein sequences likely to get trapped in a metastable state. Mutations can upset this balance, however. In the laboratory, scientists have produced many mutations that disrupt a protein's tertiary structure, either rendering it unstable or allowing it to become trapped in a metastable state. In the body, some scientists suspect that cystic fibrosis and an inherited bone disease called osteogenesis imperfecta may be due to mutations interfering with protein folding. And some believe that Alzheimer's disease may also be due to improper protein folding, although not because of a mutation.

Scientists were recently surprised to discover that some proteins require an additional mechanism to ensure that they fold properly: association with other proteins. Since a protein's primary sequence completely determines its tertiary structure—as Christian Anfinsen and his National Institutes of Health colleagues had shown in a classic 1960 study—external mechanisms were not anticipated.

Sometimes the associated proteins become part of the final protein complex; in effect, quaternary structure forms before the final tertiary structure. In other instances, folding is assisted by a class of proteins known as chaperonins that dissociate when the process is complete. No one knows the precise role chaperonins play; it may not be the same in all cases. Scientists suspect, however, that one major chaperonin role may be to steer target proteins away from aggregation or other metastable states in which they might become trapped.

Quaternary structure, cooperativity, and hemoglobin

Some proteins have no quaternary structure. They exist in the cell as single, isolated molecules. Others exist in complexes encompassing anywhere from two to dozens of protein molecules belonging to any number of types.

Proteins may exhibit quaternary structure for a variety of reasons. Sometimes several proteins must come together to carry out a single function, or to perform it efficiently, without the substances on which they all act having to diffuse halfway across the cell. At other times the reasons are at least partially structural; for example, several proteins may come together to form an ion channel long enough to reach across the cell membrane. The most interesting reason, however, is that association allows changes to one molecule to affect the shape and activity of the others. Hemoglobin provides an intriguing example of this.

Hemoglobin, which makes up about a third of red blood cells' weight, is the protein that transports oxygen from the lungs to the tissues where it is used. It would be a major oversimplification, but not entirely false, to say that the protein (globin) part of hemoglobin is simply a carrier for the associated heme group.

Heme is a large "ring of rings" comprising 34 carbon, 4 nitrogen, 4 oxygen, and 32 hydrogen atoms. In the center, bonded to the four nitrogen atoms, is an iron atom; attraction between this iron atom and a histidine side chain on the globin is one of several forces holding the heme in place. Another histidine side chain is located slightly further from the iron atom, allowing an oxygen molecule to insert itself reversibly into the gap. In similar proteins lacking this histidine, oxygen alters the iron's oxidation state rather than attaching to it.

Hemoglobin consists of two copies of each of two slightly different protein molecules. All four molecules are in intimate contact with each other; thus, it is easy to see how a change in the shape of one could encourage the others to change shape as well. In fact, that is exactly what happens. When oxygen binds to one of the four molecules, it forces a slight change in that molecule's shape. This change, in turn, alters the other molecules' shape so that oxygen binding is more likely. The end result is that any given hemoglobin tetramer (four-molecule complex) almost always carries either four oxygen molecules or none.

This "cooperativity," discovered by Coryell and Pauling in 1939, is extremely important for hemoglobin's function in the body. In the lungs, where there is a great deal of oxygen, binding of an oxygen molecule is quite likely. This leads to almost immediate binding of three more oxygen molecules, so hemoglobin is nearly saturated with oxygen as it leaves the lungs. In the tissues, where there is less oxygen, the chance that an oxygen molecule will leave the hemoglobin tetramer becomes quite high. As a result, the other three oxygen molecules will be bound less tightly and will probably leave also. The final consequence is that most of the oxygen carried to the tissues will be released there.

Without cooperativity, hemoglobin would pick up less oxygen in the lungs and release less in the tissues. Overall oxygen transport would therefore be less efficient.
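The comparison can be illustrated with the Hill equation, an empirical formula often used to describe oxygen binding; it is not part of this article. In the sketch below every number is an assumed, approximate textbook figure: a Hill coefficient of about 2.8 for hemoglobin, half-saturation at an oxygen pressure of about 26 mmHg, about 100 mmHg of oxygen in the lungs, and about 40 mmHg in resting tissue. The non-cooperative carrier is a hypothetical one with the same half-saturation pressure and a Hill coefficient of 1.

```python
def saturation(p_o2, p50=26.0, n=2.8):
    """Hill equation: fraction of binding sites occupied at oxygen pressure p_o2 (mmHg).

    p50 is the pressure giving half saturation; n is the Hill coefficient
    (n = 1 means no cooperativity). Default values are approximate assumptions.
    """
    return p_o2**n / (p50**n + p_o2**n)


LUNG_PO2 = 100.0    # assumed oxygen pressure in the lungs, mmHg
TISSUE_PO2 = 40.0   # assumed oxygen pressure in resting tissue, mmHg


def delivered(n):
    """Fraction of binding sites that unload oxygen between lungs and tissue."""
    return saturation(LUNG_PO2, n=n) - saturation(TISSUE_PO2, n=n)


for label, n in (("cooperative", 2.8), ("non-cooperative", 1.0)):
    print(label,
          "loaded in lungs:", round(saturation(LUNG_PO2, n=n), 3),
          "delivered:", round(delivered(n), 3))
```

With these assumed values the cooperative carrier both loads more fully in the lungs and unloads a larger share in the tissues, and the gap widens further at the lower oxygen pressures found in hard-working muscle.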
