The Invention of the Genetic Code

Brian Hayes

The 64-Codon Question

I want to conclude with a question. At the origin of life, the primitive genetic code was surely smaller and simpler than the modern one. It probably included only a few amino acids, or perhaps a few classes of similar amino acids. At some point in its history the code may have functioned as a pure doublet code, ignoring the third base in each codon and specifying no more than 16 amino acids. Then the translation mechanism grew more discriminating, and a few more amino acids were added to the repertory. My question is: Why did this process of differentiation stop at 20 amino acids? There are plenty of spare codons left, and there are other amino acids that could usefully be incorporated into proteins. So why not expand the code further?
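The arithmetic behind the doublet idea can be checked with a small sketch. It assumes nothing beyond the four RNA bases; the names `BASES`, `codons` and `families` are illustrative only. Ignoring the third base collapses the 64 triplets into 16 families of four codons each, which is why a pure doublet code could specify at most 16 amino acids.

```python
from itertools import product

BASES = "UCAG"  # the four RNA bases


def codons(length):
    """Return every possible codon of the given length."""
    return ["".join(p) for p in product(BASES, repeat=length)]


triplets = codons(3)  # 4**3 possible triplet codons
doublets = codons(2)  # 4**2 possible doublet codons

# A doublet code that ignores the third base groups the triplets
# into families that share their first two bases.
families = {}
for codon in triplets:
    families.setdefault(codon[:2], []).append(codon)

print(len(triplets))   # 64
print(len(doublets))   # 16
print(len(families))   # 16
print(families["UG"])  # ['UGU', 'UGC', 'UGA', 'UGG']
```

The "UG" family is shown because it contains UGA, the stop codon that is sometimes reassigned.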

One possible answer is that the code is such a vital engine of life that it has been immutable since the earliest stages of evolution. Another answer is that the code is evolving steadily toward greater complexity, and we just happen to have discovered it at the 20-amino-acid stage. Maybe our descendants will have 60 kinds of amino acids in their proteins. It's worth noting that 20 does not seem to be a hard-and-fast limit. The codon UGA, which is usually a stop signal, sometimes codes for a 21st amino acid, selenocysteine.

A third possibility is that there really is something special about the numbers 64 and 20. The relation can't be the kind of numerological magic invoked by the comma-free codes, but perhaps there is some property of genetic codes that is optimized when the ratio of amino acids to codons approaches 1:3.

© Brian Hayes
