# Molecular Computers

## FIELDS OF STUDY

Computer Science; Computer Engineering; Information Technology

SUMMARY

The DNA structure is highly amenable to information storage. A single strand of normally helical DNA, an oligomer, containing 20 base pairs, can exist in 420 or roughly 1.2 trillion, distinctly different arrangements. Synthesis of a given oligomer is not difficult and has been automated. While other molecules had been used as computational substrates, molecular biologist Leonard Adelman gained considerable notoriety by demonstrating how a classic mathematical problem could be formulated as a bonding problem for a small set of DNA oligomers.

## PRINICIPAL TERMS

• Hamiltonian path problem: one of the classic NP-complete problems for which the time of solution grows faster than any power of the number if points involved.
• ligase: an enzyme that facilitates bond formation between two molecular fragments. Also called a polymerase.
• NP-complete problem: The class of problems for which the solution time grows faster than a polynomial of the number of points involved.
• Oligomer: amodest length polymer of DNA, about 20 base pairs long.
• polymerase chain reaction: the Nobel-prize winning discovery by Cary Mullis that thermal cycling can produce numerous copies of a single DNA oligomer; used in DNA fingerprinting as well as DNA computing.
• SAT problem: determining whether a single assignment of true and false values to the k Boolean values in a one or more statements yields a unique solution.
• thermal cycling: bringing a sample repeatedly above the dissociation temperature so that the strands dissociate, and then below that temperature so that replicates form
• Watson-Crick base-pairing: the pairing of adenine and thymine and that of cytosine with guanine in H-bonded complexes in the DNA double helix

In 1994, L. M. Adelman used standard molecular biology techniques to find a Hamiltonian path through a map of 7 interconnected vertices. A Hamiltonian path is one, which passes through each vertex only once and only once. While the actual problem solved might have required a few seconds thought by an insightful high school student, the Hamiltonian path problem is one that mathematicians consider NP complete, one for which the solution time grows faster than any polynomial in the number of vertices. Given a map with a hundreds of cities connected by a network of roads, some one way, it could take years to determine if a Hamiltonian path actually exists.

Adelman's solution involved treating the road map of one-way and two-way streets as a directed graph G of n vertices with a designated beginning and end. Adelman represented each vertex of his graph by a 20-mer chosen at random. These could be synthesized by well-established techniques and amplified using the polymerase chain reaction. Possible paths representing an allowed connection between vertex i and vertex j would be represented by the complement to the last 10 sites in i and the first 10 in j.

1. Generate a large number of paths through G at random. This involved synthesizing oligomers to represent all allowed vertices and allowing them to pair with each other
2. Remove all paths that do not begin at the designated beginning or end at the designated end. This was easily done using standard biochemical techniques.
3. Remove all paths that do not involve exactly n vertices. This is easily accomplished using gel electrophoresis for which the mobility is inversely proportional to the molecular mass.
4. For each vertex v, remove all paths that do not include it

If any DNA is to be found remaining it represents a Hamiltonian path.

THE SAT PROBLEM

The SAT problem involves finding a set of truth values that satisfies a formula in the propositional calculus, in which Boolean variables appear connected by negation, disjunction or conjunction. For instance.

a = (x1.OR..NOT.x2ORx3).AND.(x2.OR.x3).AND. (x1.OR.x3).AND..NOT.x3

is only satisfied if x1 =x2=true, and x3 is false.

For a propositional formula with k variables we construct a graph with 3k +1 points. The vk (k=0,…,k) are assigned to points on the x axis, points to the right and below the axis are assigned to ak1 and the points above the axis are assigned to the ak0. Then a path from v0 to vk going above or below the like indicates a truth assignment for all the k variables.

The first step in Lipton's method consists in generating all paths from v1 to vk.

In analogy to the previous section, assign each on axis vertex to a single stranded oligomer and to each ak0 and ak1. Then a path vi−1aijvi encodes the truth value of xi. Lipton's method begins with a test tube containing all possible paths, then selectively removing all the molecules for which the first clause is not net, the selectively removing all molecules for which the second clause is not met, and so on. When the number of molecules remaining becomes two small the polymerase chain reaction can be run. If at the end of the sequence of reactions the tube becomes empty, there is no solution to the SAT problem.

We see thus that a great many graph theoretic questions can be answered, in principle at least by DNA computation. There will of course be practical difficulties as the size of problems increases, associated with the statistics of binding, but the true potential of DNA computing has yet to be reached.

—Donald Franceschetti, PhD

Adelman, L. M. Molecular computation of solutions to combinatorial problems, Science 226, 1021-1024 (1994).

Calude, C. S., and Gheorge Paun, Computing with Cells and Atoms. (New York: Taylor & Francis, 2001. Print.

Lipton, R. J., and E. B. Baum, eds., DNA Based Computers, DIMACS Series in Discrete Mathematics, and Theoretical Computer Science, 27, American Mathematical Society (1995).