Computing Comes to Life

By Brian Hayes

How to build a computer out of E. coli

Biology Computer Logic Microbiology

This Article From Issue

May-June 2001

Volume 89, Number 3

Page 204

DOI: 10.1511/2001.22.204

People have been recruiting other species to serve human needs for at least 10,000 years. We have turned plants into crops and animals into beasts of burden; even microorganisms have been pressed into service as fermenters. Yet until now no nonhuman species has ever been harnessed to do intellectual work on our behalf. That could change. Biologists and computer scientists have designed digital logic gates based on the metabolism of living cells, with the aim of eventually building a computer out of colonies of Escherichia coli or some other single-celled organism. But perhaps build is the wrong verb here; the plan is to grow or breed or culture a computer.

Ad Right

The idea of a bacterial computer is not in itself quite so outlandish as it may seem on first acquaintance. In principle, computing machines can be made out of almost anything, from billiard balls to Tinker Toys, and there is no reason that lipid sacs of proteins and nucleic acids should not also qualify as computer building blocks. From the lofty and austere perspective of computer science, an agar plate coated with microscopic bacteria is not much different from a silicon wafer etched with microscopic transistors. If the components can store and manipulate information in a few basic ways, they can compute.

So much for the lofty and austere view of computer science; but there is also computer engineering to be considered, and the questions asked in that discipline are more down to earth. Can living logic gates be strung together in networks large enough to perform an interesting computation? Can they run fast enough to complete a task within a human lifetime? Can they be made reliable enough to produce consistent and correct answers? Can biocomputer engineers cope with all the distinctive failure modes of living organisms—disease, predation, parasitism, senescence, death? (In this context the threat of a computer virus is more than a metaphor!) It's fair to say that practical applications of biological computers are a long way off. And yet skeptics might keep in mind that the historical record of domestications is a vast catalogue of unlikely-seeming successes.

A digital technology usually starts with Boolean logic gates—devices that operate on signals with two possible values, such as true and false, 1 and 0. An and gate has two or more inputs and one output; the output is true only if all the inputs are true. An or gate is similar except that the output is true if any of the inputs are true. The simplest of all gates is the NOT gate, which takes a single input signal and produces the opposite value as output: true becomes false, and false becomes true.

In electronic circuits, a NOT gate can be made from a single transistor, wired so that a high voltage at the input produces a low voltage at the output, and vice versa. When the gate switches between its two states, it does so abruptly, like a snap-action light switch. It is this sudden, nonlinear response that gives digital devices their resistance to noise and error. Because a gate is either fully on or totally off, a signal can pass through a long chain of gates without degradation.

Are there any biochemical equivalents to transistor gates? As a matter of fact, yes: There are hundreds of candidates. Perhaps the most interesting among them are the mechanisms of genetic control, which switch genes on and off.

The archetypal example of genetic regulation in bacteria is the lac operon of E. coli, first studied in the 1950s by Jacques Monod and François Jacob. The operon is a set of genes and regulatory sequences involved in the metabolism of certain complex sugars, including lactose. The bacterium's preferred nutrient is the simpler sugar glucose, but when glucose is scarce, the cell can make do by living on lactose. The enzymes for digesting lactose are manufactured in quantity only when they are needed—specifically when lactose is present and glucose is absent.

As in the expression of any genes, synthesis of the lac enzymes is a two-stage process. First the DNA is transcribed into messenger RNA by the enzyme RNA polymerase; then the messenger RNA is translated into protein by ribosomes. The process is controlled at the transcription stage. Before the genes can be transcribed, RNA polymerase must bind to the DNA at a special site called a promoter, which is just "upstream" of the genes; then the polymerase must travel along one strand of the double helix, reading off the sequence of nucleotide bases and assembling a complementary strand of messenger RNA. One mechanism of control prevents transcription by physically blocking the progress of the RNA polymerase molecule. The blocking is done by the lac repressor protein, which binds to the DNA downstream of the promoter region and stands in the way of the polymerase.

When lactose enters the bacterial cell, the lac operon is released from this restraint. A metabolite of lactose binds to the lac repressor, changing the protein's shape and thereby causing it to loosen its grip on the DNA. As the repressor protein drifts away, the polymerase is free to march along the strand and transcribe the operon.

The repressor system is only half of the lac control strategy. Even in the presence of lactose, the lac enzymes are synthesized only in trace amounts if glucose is also available in the cell. The reason, it turns out, is that the lac promoter site is a feeble one, which does a poor job of attracting and holding RNA polymerase. To work effectively, the promoter requires an auxiliary molecule called an activator protein, which clamps onto the DNA and makes it more receptive. Glucose causes the activator to fall away from the DNA just as lactose causes the repressor to let go—but the ultimate effect is the opposite. Without the activator, the lac operon lies dormant.

Brian Hayes

All these tangled interactions of activators and repressors can be simplified by viewing the control elements of the operon as a logic gate. The inputs to the gate are the concentrations of lactose and glucose in the cell's environment. The output of the gate is the production rate of the three lac enzymes. The gate computes the logical function: (lactose AND (NOT glucose)).

Brian Hayes

A question remains: Do these biochemical control mechanisms exhibit the on-off, all-or-nothing character of digital circuits? Although the transition between states is never perfectly sharp, the digital approximation is often a good one. A factor that tends to steepen the response curve is the cooperative action of multiple subunits in the regulatory proteins. The lac repressor consists of four subunits, and the lac activator has two. Although the first subunit may be slow in binding to the DNA, subsequent units stick to one another as well as to the DNA, and so the binding goes faster. The net effect is to make the threshold for repression or activation sharper.

The analogy between metabolic regulators and digital logic was already noticed 40 years ago. In 1961 Monod and Jacob wrote about genetic circuits and switching networks, and they described how activator and repressor proteins could be organized into systems that would function as memory elements and oscillators. Other authors soon explored the connection between molecular biology and digital computing in greater depth and detail; indeed, for several years the theme was a frequent one in the Journal of Theoretical Biology and the Bulletin of Mathematical Biophysics.

The main focus of these early studies was on using digital models as a way of understanding events in the living cell. The Boolean approximation was a way of avoiding an unwieldy analysis of a complex chemical web. To follow all those molecular interactions in complete detail would have required tracking the concentrations of innumerable molecular species, measuring the rates of chemical reactions, and solving hundreds of coupled differential equations. Pretending that every gene is either on or off reduced the problem to a simpler digital abstraction.

The biological computer turns this idea upside down. Instead of constructing a computational model of biochemistry, you exploit quasi-Boolean biochemistry to do computing. This notion also has a history. In the 1970s Otto Rössler analyzed various coupled systems of chemical reactions that could implement the abstract computers called finite automata. More recently, other groups have looked at schemes of computing based on the catalytic activities of enzymes.

The most novel plan for biologically inspired computing was conceived by Leonard M. Adleman of the University of Southern California. His basic idea is to use the complementary base-pairing of DNA as a pattern-matching engine. Adleman himself and others have demonstrated the feasibility of this idea in experiments where vials of DNA carry out computational tasks in number theory and combinatorics.

All of the molecular computing methods mentioned above envision that the computation will be done in vitro. Although the molecules are of biological origin, they are extracted from the cell, and the reaction takes place in laboratory glassware. But why not turn the living cell itself into a computer, powered by its own metabolism? Several research collaborations have done work pointing toward this possibility. Here I shall focus mainly on the ideas of a group at MIT, who have examined the computational aspects of the problem in great detail. The MIT group consists of Thomas F. Knight, Jr., Harold Abelson and Gerald Jay Sussman, and several of their present and former students, including Don Allen, Daniel Coore, Chris Hanson, George E. Homsy, Radhika Nagpal, Erik Rauch and Ron Weiss.

The first major goal of the MIT group is to develop design rules and a parts catalogue for biological computers, like the comparable tools that facilitate design of electronic integrated circuits. An engineer planning the layout of a silicon chip does not have to define the geometry of each transistor individually; those details are specified in a library of functional units, so that the designer can think in terms of higher-level abstractions such as logic gates and registers. A similar design discipline will be needed before biocomputing can become practical.

The elements of the MIT biocomputing design library will be repressor proteins. The logic "family" might be named RRL, for repressor-repressor logic, in analogy with the long-established TTL, which stands for transistor-transistor logic. The basic NOT gate in RRL will be a gene encoding some repressor protein (call it Y), with transcription of the Y gene regulated in turn by a different repressor (call it X). Thus whenever X is present in the cell, it binds near the promoter site for Y and blocks the progress of RNA polymerase. When X is absent, transcription of Y proceeds normally. Because the Y protein is itself a repressor, it can serve as the input to some other logic gate, controlling the production of yet another repressor protein, say Z. In this way gates can be linked together in a chain or cascade.

Going beyond the NOT gate to other logical operations calls for just a little more complexity. Inserting binding sites for two repressor proteins (A and B) upstream of a gene for protein C creates a NOR gate, which computes the negation of the logical OR function. With the dual repressor sites in place, the C gene is transcribed only if both A and B are absent from the cell; if either one of them should rise above a threshold level, production of C stops. In other words, C is transcribed only if neither A nor B is present. The NOR gate is said to be a universal logical element, because any Boolean function can be generated by linking together a series of NOR gates. The NAND gate (NOT AND) is also universal. Thus all that’s really needed to build the information-processing circuitry of a computer is the ability to make and connect NOR gates or NAND gates.

Brian Hayes

Pairs of NAND gates can also be coupled together to form the computer memory element known as a flip-flop, or latch. Implementing this concept in RRL calls for two copies of the genes coding for two repressor proteins, M and N. One copy of the M gene is controlled by a different repressor, R, and likewise one copy of the N gene is regulated by repressor S. The tricky part comes in the control arrangements for the second pair of genes: Here the repressor of M is protein N, and symmetrically the repressor of N is M. In other words, each of these proteins inhibits the other's synthesis. Here's how the flip-flop works. Suppose initially that both R and S are present in the cell, shutting down both of the genes in the first pair; but protein M is being made at high levels by the M gene in the second pair. Through the cross-coupling of the second pair, M suppresses the output of N, with the collateral result that M's own repressor site remains vacant, so that production of M can continue. But now imagine that the S protein momentarily falls below threshold. This event briefly lifts the repression of the N gene in the first pair. The resulting pulse of N protein represses the M gene in the second pair, lowering the concentration of protein M, which allows a little more N to be manufactured by the second N gene, which further inhibits the second M gene, and so on. Thus a momentary change in S switches the system from steady production of M to steady production of N. Likewise a brief blip in R would switch it back again. (S and R stand for "set" and "reset.")

One conclusion to be drawn from this synopsis of a few RRL devices is that a computer based on genetic circuits will need a sizable repertory of different repressor proteins. (I've used up a third of the alphabet already.) Each logic gate inside a cell must have a distinct repressor assigned to it, or else the gates would interfere with one another. In this respect a biomolecular computer is very different from an electronic one, where all signals are carried by the same medium—an electric current. The reason for the difference is that electronic signals are steered by the pattern of conductors on the surface of the chip, so that they reach only their intended target. The biological computer is a wireless device, where signals are broadcast throughout the cell. The need to find a separate repressor for every signal complicates the designer's task, but there is also a compensating benefit. On electronic chips, communication pathways claim a major share of the real estate. In a biochemical computer, communication comes for free.

Are there enough repressor proteins available to create useful computational machinery? Note that interference between logic gates is not the only potential problem; the repressor molecules taking part in the computation must also be distinct from those involved in the normal metabolism of the cell. Otherwise, a physiological upset could lead to a wrong answer; or, conversely, a computation might well poison the cell in which it is running. A toxic instruction might actually be useful—any multitasking computer must occasionally "kill" a process—but unintended events of this kind would be a debugging nightmare. You can't just reboot a dead bacterium.

Nature faces the same problem: A multitude of metabolic pathways have to be kept under control without unwanted crosstalk. As a result, cells have evolved thousands of distinct regulatory proteins. Moreover, the biocomputing engineer will be able to mix and match among molecules and binding sites that may never occur together in the natural world. The aim of the RRL design rules is to identify a set of genes and proteins that can be encapsulated as black-box components, to be plugged in as needed without any thought about conflicts.

Another important design tool is a simulator, which allows a device to be tested without the substantial effort of building a prototype. The world of electronics has long relied on a simulator called Spice, which models the physics of transistors and other electronic components. The MIT group is building a BioSpice simulator, which will model the dynamics of genetic circuits in a similar way.

So far, the MIT group has based their design work primarily on such simulations, but other groups have begun a few "wet" experiments. Michael B. Elowitz and Stanislas Leibler of Princeton University have created a free-running genetic oscillator in E. coli. Arranging three repressor genes so that they act on one another in turn, they observed periodic fluctuations in gene expression, with a frequency independent of the cell's reproductive cycle. In another E. coli experiment, James J. Collins, Timothy S. Gardner and Charles R. Cantor of Boston University built a genetic toggle switch much like the flip-flop described above, with two cross-coupled promoters and repressors. They report "robust bistability." Their eventual aim is the construction of "genetic applets"—self-contained program modules that could be "downloaded" into organisms.

Persuading a living cell to perform useful computations is quite a trick, and yet it's not all that's needed. The real goal is to get a billion cells working in concert on the same task.

Image reproduced with permission of the MIT Artificial Intelligence Laboratory

From one point of view, mass producing bacteria is extraordinarily easy. You don't have to build a billion-dollar "fab line" to manufacture them; just supply warmth and nutrients, and the bacteria will take care of proliferation on their own. The hard part is organizing a population of cells so that they work toward some specified goal. Here again electronic and biological technologies diverge. On a silicon chip, every circuit element has an assigned place and function, but living cells are squishy and motile and not easily confined to a rigid grid. The MIT group therefore takes another cue from biology, and lets large-scale structures emerge from processes akin to natural development and differentiation.

In an embryo, cells of identical genetic endowment differentiate into distinct tissues and organs, and also generate patterns such as the stripes and spots of animal pelts. What is most intriguing about biological development is that all the cells begin with the same "program," and they organize themselves without any externally designated leader. It all seems to be done by means of short-range communication between neighboring cells and the diffusion of chemical signals over longer distances. These same mechanisms would also be available to a multicellular biocomputer.

The study of large arrays of simple processing elements is a classical topic in computer science, but for the most part the arrays have been geometrically regular, and the processors have operated in strict synchrony. The MIT group offers a new paradigm of "amorphous computing" by spatially irregular and unsynchronized arrays. If all the processors run the same program, and they have only local communication, what patterns can emerge in such an amorphous blob of computers? Some of the examples generated so far have a distinctly botanical look to them, and yet they also resemble the design drawings for a silicon integrated circuit.

Many further hurdles remain before biocomputing could become a practical technology. Input and output are problematic. Maybe the input device will be a pipette of pheromone, and fluorescent proteins could produce output signals, but the expressive possibilities of these facilities seem rather limited. Large-capacity long-term storage—a biological disk drive—is also lacking. And speed is a concern, even with the extraordinary level of parallelism implicit in the exponential growth of a bacterial colony. Silicon processors are running at a gigahertz, but the speed of genetic circuits is in the millihertz range. Even a billion bacteria are no match for a Pentium.

But surely it would be a mistake to think of the E. coli computer as a beige box that will sit on your desk running a prokaryotic version of Microsoft Windows. A more likely prospect is a crop of programmable biological sensors, actuators and messengers. One contemplated application of such organisms is the assembly of nanoscale structures; instead of replacing semiconductor circuits, the cells would fabricate them. Another possibility is the old fantasy of a microscopic robot that could enter the human body to repair diseased tissues or combat infections. If this daydream of an intravenous computer is ever to happen, success seems more likely with the tools of genetic engineering than with a soldering iron.

Or maybe not. Maybe in the end it's just foolishness to imagine that anything so intellectually demanding as computation could be imposed on a biological substrate. No living organism can be expected to engage in abstract reasoning and symbol manipulation while carrying on with the daily routine of ingestion, growth, excretion, sleep, procreation. Get a life!

American Scientist Comments and Discussion

To discuss our articles or comment on them, please share them and tag American Scientist on social media platforms. Here are links to our profiles on Twitter, Facebook, and LinkedIn.

If we re-share your post, we will moderate comments/discussion following our comments policy.

American Scientist