All through the 1930s, members of the famous
Drosophila group at Caltech roamed the American
West collecting fruit flies for genetic analysis. One
discovery to come from these expeditions was an abundance of
genetic inversions: blocks of genes that had flipped
end-over-end. Flies from one geographic region might have a
certain set of genes in the order a b c d e f
g, but a population elsewhere could harbor the
sequence d c b a e f g, with the first
four genes inverted. A further reversal, affecting a
different block of genes, could produce an ordering such as
d c f e a b g. Theodosius Dobzhansky and
Alfred H. Sturtevant, two of the leading
Drosophilists, pointed out that such genetic
rearrangements could help in reconstructing the family tree
of the flies. More reversals would indicate greater
evolutionary distance.

The variations discovered by
Dobzhansky and Sturtevant could be explained by reversing
just one or two blocks of genes. Later, when gene order was
studied in a broader range of organisms, more complex
patterns emerged. In the 1980s Jeffrey D. Palmer and Laura A.
Herbon of the University of Michigan were measuring the pace
of evolutionary change in plants of the cabbage family.
Looking at the DNA in mitochondria (the energy-producing
organelles), they found that the genes had been jumbled by
multiple random reversals. Transforming cabbage into turnip
took at least three reversals. More distant relatives such
as cabbage and mustard appeared to be separated by a dozen
or more reversal events—they could only estimate how
many.

If these genetic flip-flops are to serve as an
evolutionary clock, we need a reliable way to count them.
Given two arrangements of a set of genes—say
a b c d e f g and
f e b a g c d—how
do you determine what sequence of reversals produced the
transformation? This example has a three-step solution, which
you can probably find in a few minutes with pencil and
paper. For larger genomes and longer chains of reversals,
however, trial-and-error methods soon falter. Is there an
efficient algorithm for identifying a sequence of reversals
that converts one permutation into another?
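
For a genome as small as the seven-gene example, the question can be settled by brute force. The Python sketch below is only an illustration, not the efficient algorithm the question asks for. It tries every possible reversal of every arrangement in breadth-first order, so the first time it reaches the target it has found a shortest chain of reversals. The function names reverse_block and shortest_reversals are invented for this sketch, and genes are treated as plain letters with no orientation, as in the examples above.

```python
from collections import deque


def reverse_block(genes, i, j):
    """Return a copy of `genes` with the block genes[i:j] flipped end over end."""
    return genes[:i] + genes[i:j][::-1] + genes[j:]


def shortest_reversals(start, target):
    """Breadth-first search for a shortest chain of reversals.

    Returns the arrangements visited, from `start` to `target` inclusive.
    The search is exhaustive, so it is practical only for a handful of genes.
    """
    start, target = tuple(start), tuple(target)
    if sorted(start) != sorted(target):
        raise ValueError("both arrangements must contain the same genes")
    parent = {start: None}  # maps each arrangement to the one it came from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == target:
            path = []
            while current is not None:  # walk back to the starting arrangement
                path.append(current)
                current = parent[current]
            return path[::-1]
        n = len(current)
        for i in range(n - 1):
            for j in range(i + 2, n + 1):  # blocks of at least two genes
                candidate = reverse_block(current, i, j)
                if candidate not in parent:
                    parent[candidate] = current
                    queue.append(candidate)


if __name__ == "__main__":
    path = shortest_reversals("abcdefg", "febagcd")
    for arrangement in path:
        print(" ".join(arrangement))
    print(len(path) - 1, "reversals")
```

Run on the example above, the search reports three reversals. With seven genes there are only 5,040 arrangements to visit, but that number grows factorially, so this approach falters at about the same point that pencil and paper do.
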
The genetic
reversal problem lies at the intersection of biology,
mathematics and computer science. For some time, the prospects
for finding a simple and efficient solution seemed dim, even
with the most powerful tools of all three disciplines. But
the story has a happy ending. A little more than a decade
ago, computing gene reversals was still a subtle research
problem; now it can be done with such ease that it's a
matter of routine technology. If you need to know the
"reversal distance" between two genomes, you can
go to a Web site and get the answer in seconds.