

The Beginnings of Life on Earth

Christian de Duve

The RNA World

Whatever the earliest events on the road to the first living cell, it is clear that at some point some of the large biological molecules found in modern cells must have emerged. Considerable debate in origin-of-life studies has revolved around which of the fundamental macromolecules came first—the original chicken-or-egg question.

The modern cell employs four major classes of biological molecules: nucleic acids, proteins, carbohydrates and fats. The debate over the earliest biological molecules, however, has centered mainly on the nucleic acids, DNA and RNA, and the proteins. At one time or another, each of these molecular classes has seemed a likely starting point, but which came first? To answer that, we must look at the functions performed by each of them in existing organisms.

The proteins are the main structural and functional agents in the cell. Structural proteins serve to build all sorts of components inside the cell and around it. Catalytic proteins, or enzymes, carry out the thousands of chemical reactions that take place in any given cell, among them the synthesis of all other biological constituents (including DNA and RNA), the breakdown of foodstuffs and the retrieval and consumption of energy. Regulatory proteins command the numerous interactions that govern the expression and replication of genes, the performance of enzymes, the interplay between cells and their environment, and many other manifestations. Through the action of proteins, cells and the organisms they form arise, develop, function and evolve in a manner prescribed by their genes, as modulated by their surroundings.

The one thing proteins cannot do is replicate themselves. To be sure, they can, and do, facilitate the formation of bonds between their constituent amino acids. But they cannot do this without the information contained within the nucleic acids, DNA and RNA. In all modern organisms, DNA serves as the storage site of genetic information. The DNA contains, in encoded form, the instructions for the manufacture of proteins. More specifically, encoded within DNA is the exact order in which amino acids, selected at each step from 20 distinct varieties, should be strung together to form all of the organism's proteins. In general, each gene contains the instructions for one protein.

DNA itself is formed by the linear assembly of a large number of units called nucleotides. There are four different kinds of nucleotides, designated by the initials of their constituent bases: A (adenine), G (guanine), C (cytosine) and T (thymine). The sequence of nucleotides determines the information content of the molecules, as does the sequence of letters in words.

Within all cells, DNA molecules are formed from two strands of DNA that spiral around each other in a formation called a double helix. The two strands are held together by bonds between the bases of each strand. Bonding is quite specific, so that A always bonds with T, and G is always partnered with C on the opposite DNA strand. This complementarity is crucial for faithful replication of the DNA strands prior to cell division.

During DNA replication, the DNA strands are separated, and each strand serves as a template for the replication of its complementary strand. Wherever A appears on the template, a T is added to the nascent strand. Or, if T is on the template, then A is added to the growing strand. The same is true for G and C pairs. In the characteristic double-helical structure of DNA, the two strands carry the same information in complementary versions, as do the positive and negative of the same photograph. Upon replication, the positive strand serves as template for the assembly of a new negative and the negative strand for that of a new positive, yielding two identical duplexes.
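The template rule just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the sequence is made up, the function names are hypothetical, and the sketch ignores strand orientation and the enzymes that do the actual work in a cell.

```python
# Pairing rules from the text: A pairs with T, and G pairs with C.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(template):
    """Build the strand that would be assembled, base by base, on `template`."""
    return "".join(DNA_PAIRS[base] for base in template)


def replicate(duplex):
    """Separate the two strands and use each one as a template for a new partner."""
    positive, negative = duplex
    daughter_1 = (positive, complement_strand(positive))  # new negative strand
    daughter_2 = (complement_strand(negative), negative)  # new positive strand
    return daughter_1, daughter_2


positive = "ATGCCGTA"  # hypothetical sequence, chosen only for illustration
duplex = (positive, complement_strand(positive))
daughter_1, daughter_2 = replicate(duplex)

print(duplex)                                        # ('ATGCCGTA', 'TACGGCAT')
print(daughter_1 == duplex and daughter_2 == duplex)  # True
```

Each daughter duplex keeps one old strand and gains one newly assembled strand, and both end up identical to the original, which is the positive-and-negative photograph analogy in miniature.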

In order for DNA to fulfill its primary role of directing the construction of proteins, an intermediate molecule must be made. DNA does not directly participate in protein synthesis. That is the function of its very close chemical relative RNA.

Expression of DNA begins when an RNA molecule is constructed bearing the information for a gene contained on the DNA molecule. RNA, like DNA, is made up of nucleotides, but U (uracil) takes the place of T. Construction of the RNA molecule follows the same rules as DNA replication. The RNA copy, called a transcript, is a complementary copy of the DNA, with U (instead of T) inserted wherever A appears on the DNA template.
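As a rough sketch of this copying rule, the Python fragment below builds a transcript from a DNA template. The template sequence and the function name are assumptions made for illustration, and real transcription involves enzymes and strand orientation that the sketch leaves out.

```python
# Same pairing rules as DNA replication, except that U is inserted
# wherever A appears on the DNA template.
TRANSCRIPTION_PAIRS = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(dna_template):
    """Return the RNA transcript complementary to a DNA template strand."""
    return "".join(TRANSCRIPTION_PAIRS[base] for base in dna_template)


print(transcribe("TACGGCAT"))  # AUGCCGUA
```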

Most RNA transcripts, often after some modification, provide the information for the assembly of proteins. The sequence of nucleotides along the coding RNA, aptly called messenger RNA, specifies the sequence of amino acids in the corresponding protein molecule—three successive nucleotides (called a codon) in the RNA specify one amino acid to be used in the protein. The process is known as translation, and the correspondences between codons and amino acids define the genetic code.
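Reading a messenger RNA three nucleotides at a time can be sketched as follows. The table below holds only a handful of entries from the standard genetic code, enough for the made-up example sequence; the function name is hypothetical, and a fuller version would need all 64 codons.

```python
# A small excerpt of the standard genetic code. None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly",
    "GCU": "Ala", "AAA": "Lys", "UGG": "Trp",
    "UAA": None, "UAG": None, "UGA": None,
}


def translate(mrna):
    """Read `mrna` codon by codon and return the amino acids it specifies."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3]
        amino_acid = CODON_TABLE[codon]  # raises KeyError for codons not in the excerpt
        if amino_acid is None:           # a stop codon ends the protein
            break
        protein.append(amino_acid)
    return protein


print(translate("AUGUUUGGCAAAUAA"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

The dictionary lookup plays the part of the genetic code itself: a fixed correspondence between codons and amino acids.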

Not all RNA molecules are messengers, however. Some of the RNAs participate in protein synthesis in other ways. Some actually make up the cellular machinery that constructs proteins. These are called ribosomal RNAs, and they may include the actual catalyst that joins amino acids by peptide bonds, according to the work of Harry Noller at the University of California at Santa Cruz. Other RNAs, called transfer RNAs, ferry the appropriate amino acids to the ribosome. As cell biology has progressed, even more functions for RNA have been discovered. For example, some RNA molecules participate in DNA replication, while others help process messenger RNAs.

Scientists considering the origins of biological molecules confronted a profound difficulty. In the modern cell, each of these molecules is dependent on the other two for either its manufacture or its function. DNA, for example, is merely a blueprint, and cannot perform a single catalytic function, nor can it replicate on its own. Proteins, on the other hand, perform most of the catalytic functions, but cannot be manufactured without the specifications encoded in DNA. One possible scenario for life's origins is that two kinds of molecules evolved together, one informational and one catalytic. But this scenario is extremely complicated and highly unlikely.

The other possibility is that one of these molecules could itself perform multiple functions. Theorists considering this possibility started to look seriously at RNA. For one thing, the molecule's ubiquity in modern cells suggests that it is a very ancient molecule. It also appears to be highly adaptable, participating in all of the processes relating to information processing within the cell. For a while, the only thing RNA did not seem capable of doing was catalyzing chemical reactions.

That view changed in the early 1980s, when Sidney Altman at Yale University and Thomas Cech at the University of Colorado at Boulder independently discovered RNA molecules that could in fact catalytically excise portions of themselves or of other RNA molecules. The chicken-or-egg conundrum of the origin of life seemed to fall away. It now appeared theoretically possible that an RNA molecule could have existed that naturally contained the sequence information for its own reproduction through reciprocal base pairing and could also catalyze the synthesis of more RNA strands like itself.

In 1986, Harvard chemist Walter Gilbert coined the term "RNA world" to designate a hypothetical stage in the development of life in which "RNA molecules and cofactors [were] a sufficient set of enzymes to carry out all the chemical reactions necessary for the first cellular structures." Today it is almost a matter of dogma that the evolution of life did include a phase where RNA was the predominant biological macromolecule.



Subscribe to American Scientist