
Human Genome Project

The Draft Sequence



By the time the draft sequence was published in February 2001, scientists had sequenced each chromosome four to five times to improve confidence in their nucleotide base calls. It was called a draft because the chromosomal locations of the sequences were only roughly approximated. To determine the order of the sequence, DNA is cut into fragments using restriction enzymes. These fragments, on the order of approximately 10 kb (10,000 base pairs), are cloned into vectors such as bacterial artificial chromosomes (BACs): the circular bacterial DNA vector is cut open with the same enzymes that cut the DNA, the fragment is ligated into the vector, the BACs are grown in culture, and the inserts are sequenced. Overlapping fragments whose DNA sequences matched were then carefully pieced together. The sequenced ends of the BAC inserts, called sequence tag connectors (STCs), served as markers spaced roughly every 3,000 to 4,000 bases throughout the genome. In this large-scale sequencing effort, STCs acted as a compass, indicating which specific BAC clones had to be sequenced to fill the gaps between STCs. The next step was to produce a higher-quality sequence covering approximately 95.8% of the human genome, projected for completion in 2003. This version has an error rate of only one base per 10,000 bases, which requires additional sequencing to fill any remaining gaps, with each base covered up to nine times. Final sequences are publicly available in databases such as GenBank.
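The idea of piecing together overlapping fragments can be illustrated with a small sketch. The Python below is not the Human Genome Project's actual assembly software; it is a toy greedy merge, with the hypothetical helper names `overlap_length` and `greedy_assemble` and a made-up `min_overlap` threshold, that joins short error-free strings wherever the end of one matches the start of another. Real assemblers must also handle sequencing errors, repeats, and reverse-complement strands, which this sketch ignores.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(fragments: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the longest overlap.

    Illustrative only: assumes error-free reads from a single strand.
    """
    contigs = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_size, best_i, best_j = 0, -1, -1
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i == j:
                    continue
                size = overlap_length(left, right, min_overlap)
                if size > best_size:
                    best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no remaining pair overlaps; leave the contigs separate
        # Append only the non-overlapping tail of the right-hand fragment.
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


if __name__ == "__main__":
    # Three made-up fragments that tile a 14-base sequence.
    reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
    print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

Fragments that share no sufficiently long overlap are left as separate pieces, which loosely mirrors the gaps that later rounds of sequencing had to fill.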



