THE DNA EXPLOSION
A QUESTION OF TIMING
One of the stereotypically tedious parts of school history classes is committing to memory parts of the historical chronicle, that is, lists of events and their associated dates, all placed in chronological order. The Battle of Hastings in 1066, Columbus’s discovery of the New World in 1492, the signing of the Declaration of Independence on July 4, 1776—these are all familiar entries in the chronicle. Although memorizing such dates may be tiresome, knowledge of the chronicle is clearly vital to making sense of history, to understanding how and why things have happened. Just consider the sorts of historical connections we might contemplate if we had no idea about the proper chronological order of events. We might imagine that Magellan used Captain Cook’s maps to chart a route across the Pacific, or wonder if the financial meltdown of 2008 helped promote the rise of Hitler and the Nazi Party.
For human history, especially recent human history, such examples sound silly; the chronicle is typically so well established that we don’t waste time considering connections that violate the sequence. However, if history is considered in the most general sense—the history that includes cosmology, geology, evolutionary biology, linguistics, and other areas—there are many cases in which the absolute and relative timing of events is not an obvious collection of facts, but instead is something quite difficult to establish. For instance, starting with the first strong evidence for the Big Bang Theory, in the 1920s, estimates of the age of the universe have ballooned in a series of steps, increasing from about 2 billion years to 13.8 billion years.22 Similarly, although we can be sure that human language arose after our evolutionary separation from the chimp lineage, support for a more precise age for this critical event has been elusive. Obviously, there’s no written record of the Big Bang, or the origin of human language, and the evidence that does exist for dating these events can be hard to interpret. As a result, the possible chains of cause and effect are also often unclear. It has been suggested, for example, that the origin of language gave rise to selection for increased brain size, but that connection remains speculative, in part because of the uncertainty about exactly when language (or, more precisely, certain steps in the evolution of language) arose. Continuing our Hitler analogy, it’s as if we are often operating without clear knowledge of whether the 2008 financial crisis came before or after the Third Reich.23
Historical biogeography, like all historical disciplines, would benefit greatly from having an established chronicle of relevant events. More specifically, the problem of explaining piecemeal distributions fairly screams out for such timing information on the age of evolutionary branching points. In many of these cases, the competing explanations involve processes taking place at different periods; typically, one is an ancient vicariance event, such as the opening of the Atlantic Ocean, and another involves the more recent dispersal of organisms across an ocean or some other barrier. So, if you knew when two lineages split—say, a group of rodents living in South America and another in Africa—there’s a good chance you would be able to either reject or support the ancient vicariance explanation; the split could be too recent to be explained by vicariance, or, alternatively, it could be old enough to fit that hypothesis. Biogeographers of all persuasions agree that having accurate ages for branching points in the tree of life would be enormously useful; all agree that knowing when would go a long way toward figuring out how. What they have conspicuously failed to agree upon is the practical role of such timing information in real studies of biogeography.
In the early 1990s, when Mike Pole was forming his ideas about the origins of the New Zealand flora, historical biogeography was deeply divided over this issue. It was like a nation polarized into two warring political parties, along with a large number of people of undecided allegiance. On one side were the hard-core vicariance scientists, including Gary Nelson and other cladists, along with panbiogeographers like Michael Heads, whose focus was on cladograms (without connected age information) or tracks as the fundamental kinds of evidence. They were notably uninterested in using fossils to place ages on evolutionary groups, a lack of interest stemming from their belief that the fossil record is too incomplete to provide useful information for that purpose. In the extreme, these were the scientists who were entertaining ancient vicariance explanations even for such things as the origins of the Hawaiian biota or the distribution of Homo sapiens. They typically believed that the only good way to infer the age of an evolutionary branching point was to connect it to some fragmentation event, tectonic or otherwise. One could “know,” for instance, that Australian and New Zealand southern beeches had split from each other roughly 80 million years ago if, by looking at cladograms, it was established (and I use that term loosely) that their separation had been caused by Gondwanan breakup. In this view, time—the age of a branching point—was never used to discriminate between dispersal and vicariance, but instead was an outcome of already “knowing” that vicariance was the explanation.
On the other side were those who thought they had a rough handle on when many groups first appeared on Earth (and in particular places) and were willing to use that information to interpret biogeographic history. Think of those classroom posters that show the geologic time periods with the history of life superimposed on them—the first insects (crawling next to the word “Silurian”), the first mammals (Triassic), the first birds (Jurassic), and so on. This second group of scientists basically was made up of believers in such timelines, at least as approximations. Not surprisingly, most of them either were paleontologists or had a strong interest in the fossil record. They were people like Mike Pole and Dallas Mildenhall, immersed in the paleobotanical record of Zealandia; Anthony Hallam, a paleontologist who studied molluscs and other shelled invertebrates; and John Briggs, who did research on both living and fossil fishes. They weren’t ignorant about the effects of continental drift; in fact, both Hallam and Briggs wrote books about plate tectonics and its revolutionary impact on biogeography. However, all of them were arguing, as New York School dispersalists like William Diller Matthew, George Gaylord Simpson, and Ernst Mayr had decades before, that the fossil record tells us that some groups—a lot of groups, actually—are probably too young to have been split up by ancient fragmentation events. For example, Briggs, in his 1987 book Biogeography and Plate Tectonics, suggested that a group of freshwater killifishes found in Africa and South America might have come into existence long after the opening of the Atlantic. If that was the case, at least one of these fishes must have crossed the ocean, somehow tolerating the salty waters. Like others in this school of thought, Briggs had no particular preference for vicariance or dispersal as explanations; he would follow the evidence, and for him, the evidence included information about the ages of groups.
At the same time, many biologists who had or might have had some interest in biogeography were not in either camp. Some of these people weren’t ready to accept the extreme views of the vicariance side, yet were also leery of putting too much faith in the fossil record. Michael Donoghue, the botanist who had been attracted to Gary Nelson’s intellectual boldness, was one of those. Despite being “raised” as a cladist, he was turned off by the endless, inconclusive cladograms in Nelson and Platnick’s vicariance tome, Systematics and Biogeography—“chicken scratchings,” Donoghue called them—yet he wasn’t ready to believe, à la Pole and Mildenhall, that long-distance dispersal was commonplace. Up to the early 1990s, Donoghue had mostly set aside an early interest in biogeography and was pursuing other things. Others had never really been drawn in to begin with, perhaps because the field, after the burst of enthusiasm following the revelation of continental drift, seemed a bit stagnant. However, I suspect that most of these “undecideds,” when they were thinking about biogeography at all, had leanings toward vicariance rather than dispersal, because vicariance seemed like the more global and cutting-edge idea. As I mentioned in the Introduction, I knew little about biogeography through the 1990s, but when I had to lecture about the subject in an evolution course, I chose to talk mostly about Gondwanan breakup, not ocean crossings. Global fragmentation just seemed like the “cooler” thing to focus on. In short, the undecided vote seemed, if anything, poised to tip toward the vicariance side.
What shifted this balance, particularly for the undecideds, was the use of molecular data, especially DNA sequences, to put ages on evolutionary branching points. Molecular dating had actually begun long before this, in the early 1960s,24 but such studies became much more widespread in the 1990s, and that upward trend has continued to the present. For many people, this approach suddenly made establishing the evolutionary chronicle a reality. It was like finally being able to show, after years of ignorance, that Hitler really had come to power decades before the financial crisis of 2008.
Given the importance of molecular dating for biogeography, a huge question that we have to deal with is whether the ages estimated in this way can really be trusted. Some evolutionary biologists, including, not surprisingly, some of the hard-core vicariance crowd, continue to think that molecular dating is basically worthless, and therefore, that any conclusions that depend on it are equally worthless. In a 2005 paper, Michael Heads wrote that “degree of divergence is a guide neither to the time involved in evolution, nor the age of that evolutionary [splitting] event,” and that the molecular clock approach “does not solve biogeographical problems but simply leads into a morass of mysteries and paradoxes.” Similarly, Gary Nelson, now retired but still quick with a witty phrase, has ridiculed the approach and its use in biogeography as a futile “molecular dating game.”
In Chapter Six, I will get into the thorny but critical issue of whether we should trust molecular dating studies. First, however, I want to address, somewhat idiosyncratically, the question of why this approach took off when it did. In a sense, there can never be a complete answer to a question like that; one can always delve deeper into the long sequence of historical cause and effect, or flesh out in greater detail what happened at key points. In the case of the molecular dating explosion, one could argue for the importance of things like the discovery of the structure of DNA, and, later, of the enzymes that make strands of DNA replicate themselves. The idea of the molecular clock itself, first proposed by an Austrian biochemist named Emile Zuckerkandl and the Nobel Prize–winning chemist Linus Pauling in 1962, and the invention in the 1970s of methods for obtaining long sequences of DNA, were also critical. However, I take all that as background and instead focus on a critical insight that a particular scientist had in the early 1980s. In doing so, I’m not subscribing to a “great man/woman” view of history. Rather, I’m simply emphasizing an event that clearly had a rapid and far-reaching effect. This event may also qualify as a potential “point of no return,” that is, an occurrence that set off an unavoidable cascade of effects. Maybe I’m also biased by the fact that I experienced part of the effect of the event in question firsthand, within a few years of when it happened.
Such scientific turning points are not always memorably discrete, even to the people making them. For instance, although there may have been a particular moment when Darwin became a confirmed believer in evolution, his thoughts on the subject had been percolating for years; when he finally converted, it was like fitting a few pieces into a puzzle whose basic form he had already seen. However, if we can take the word of its architect, the turning point I’m about to describe did indeed come in a lightning-like epiphany. That epiphany, that flash point, can be located very precisely in time and space, to a night in May 1983, on Highway 128 in the Coast Range north of San Francisco, at mile marker 46.58.
THE CHAIN REACTION BEGINS
It was an unseasonably warm night, and the air was thick with the sweet scent of blooming California buckeyes. Kary Mullis was at the wheel of his silver Honda Civic, driving north from Berkeley toward his cabin in Anderson Valley, his girlfriend asleep in the passenger seat. He was thinking about DNA replication.
Mullis and his girlfriend were both chemists working for Cetus, a Bay Area biotech company that, among other things, developed cancer therapies and ways to diagnose genetic diseases such as sickle-cell anemia. Mullis was hardly your stereotyped dull corporate scientist in a white lab coat, however. In fact, he was (and is) a risk-taker and close to a certifiable nut. He had experimented with LSD and other hallucinogenic drugs, even brewing up new compounds and trying them out on himself. (A Berkeley professor in whose laboratory Mullis worked while getting his doctorate had once suggested, with surprising restraint, that Mullis might want to clear the psychoactive substances out of the lab freezer, in case the cops came around.) In Aspen, Colorado, he once skied down the middle of an icy road with cars whizzing by on both sides, apparently unconcerned because he had it in his head that he would eventually be killed by crashing into a redwood tree, and there weren’t any redwoods in Aspen. Weirder still, he had once passed out while getting high on laughing gas, with the tube from the tank stuck in his mouth, and claimed that he was saved by a stranger, a woman who noticed his prostrate form as she floated by on the astral plane. Somehow, bodiless, she managed to get the freezing tube out of Mullis’s mouth. Parts of his lips and tongue were frostbitten, but he survived the incident and, years later, met the corporeal version of his savior in a bakery, as if by destiny.
Whatever else he was, though, Mullis was a thinker and a problem-solver. The drive from Berkeley to Anderson Valley took two and a half hours, and it was a time for him to use that gift, to focus on difficult research problems, free from distractions. On this night, he was trying to come up with a quicker genetic test to tell if someone has a disease, such as sickle-cell, tied to a single base pair in their DNA. Somehow, he got to thinking about short sequences of bases—things that Cetus scientists had gotten very good at synthesizing in the lab—and how you could use them to latch onto a part of someone’s DNA that had the complementary string of bases, a GTTCCC in the synthetic sequence, for instance, matching up with a CAAGGG in the person’s genome. From the place of attachment, you could then get the DNA to start replicating itself, creating a new piece of DNA, in the same way that DNA replicates itself when a cell divides. These were known facts, things that could be done.
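The pairing rule in that example (A with T, G with C) is simple enough to sketch in a few lines of Python. This is only an illustration: the helper name `complement` is an assumption, and it pairs bases position by position, as the example in the text does, without worrying about the direction in which each strand is read.

```python
# Watson-Crick pairing: A pairs with T, and G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement of a DNA string (hypothetical helper)."""
    return "".join(PAIRS[base] for base in seq.upper())


# The synthetic sequence from the text and the genomic stretch it would latch onto.
print(complement("GTTCCC"))
```

Applying the function twice returns the original string, which is another way of saying that each strand fully specifies the other.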
Like anyone who worked with DNA, Mullis knew that getting a large enough amount of any particular stretch of the genetic material to enable one to sequence it—to read the order of As, Gs, Cs, and Ts—wasn’t easy. The standard procedure was to insert the target DNA into bacteria and then get the bacteria to multiply on petri dishes, replicating the alien DNA along with their own. It was a messy process and far more time-consuming than the sequencing itself. In other words, generating a sufficient quantity of the targeted DNA was the rate-limiting step. Mullis knew that if you could solve this problem by coming up with a simple, fast way to create lots of DNA, it would be a huge methodological breakthrough. It could make DNA sequencing easy.
Suddenly, driving along Highway 128, with the long, pale, flowering heads of the buckeyes drooping down into the beams of his headlights, his mind full of synthetic bits of DNA, he realized how it could be done. It was a classic Eureka! moment, like Wallace, in a malarial fever, flashing on the mechanism of natural selection. Maybe it was Mullis’s tendency to think outside of the box at work. He has said that his use of LSD some years before probably helped, opening up new pathways in his mind. In any case, he could hardly believe his thoughts. He pulled off the highway, grabbed a pencil and an envelope out of the glove compartment, and scribbled down a few notes. A large buckeye loomed over the little Honda. His girlfriend stirred in her sleep. There was a small white highway marker where he had pulled out, mile marker 46.58.
What Mullis had realized was that, by using two different synthetic DNA sequences that would attach to areas that weren’t too far apart on a DNA strand, you could start in motion a process that would duplicate the target DNA—the stretch between the two attachment sites—over and over again. The key was that a new strand that had begun replicating from one of the bits of synthetic DNA would, in the next round of duplication, provide an attachment site for the other bit of synthetic DNA, which would then initiate replication in the other direction. Run the process long enough, round after round of duplication, and you’d end up with astronomical numbers of the target sequence. The first thing Mullis scribbled down was just the progression of increasing amounts of target DNA with each round of duplication. Starting from one molecule, you’d have two molecules after one round, four after another round, then eight, sixteen, and thirty-two. Ten rounds would give you 1,024 molecules, and up and up exponentially. It would work, he thought. By the time he had driven another mile down the highway and pulled over again, he was already thinking about his Nobel Prize.
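The arithmetic on that envelope is easy to reproduce. The short Python sketch below assumes an idealized reaction in which every round exactly doubles the number of target molecules, something real reactions only approximate; the function name `copies_after` is made up for the illustration.

```python
def copies_after(rounds: int, start: int = 1) -> int:
    """Idealized PCR: each round doubles the number of target molecules."""
    return start * 2 ** rounds


# The progression from the text, plus a longer run to show how quickly it grows.
for rounds in (1, 2, 3, 4, 5, 10, 20, 30):
    print(f"{rounds:>2} rounds: {copies_after(rounds):,} molecules")
```

Under this perfect-doubling assumption, twenty rounds already pass a million copies and thirty rounds pass a billion, all from a single starting molecule.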
Mullis called his new method the polymerase chain reaction—PCR for short—after the enzyme, DNA polymerase, that catalyzes replication. When he described the new technique to his coworkers at Cetus, including his girlfriend, almost none of them saw that he was onto something groundbreaking. Maybe it was partly because he was known for being a bit flaky. (Did they know about his astral guardian angel?) Mullis himself has suggested that it was also the usual resistance to new ideas, even relatively straightforward ones, that kept them from recognizing the importance of his proposed method. In any case, it was some seven months later, and only after he had mucked around in the lab and generated some promising preliminary results, that many other scientists at Cetus got excited. With others on board, some of them much more careful experimental scientists than Mullis, it was soon apparent that PCR was a viable technique.
Mullis had another big breakthrough when he realized that the process could be streamlined by using a version of the DNA polymerase, the replication enzyme, from a bacterium called Thermus aquaticus that lives in the hot springs of Yellowstone National Park. In PCR, after a short period of DNA replication, the newly formed double strands have to be heated up to break them apart again to allow the next round of replication to take place; the polymerase from T. aquaticus (or Taq, for short), unlike the kind found in most organisms, stays intact when heated, and therefore does not have to be replenished in the test tube after each heating step. Using Taq and a heating block called a thermal cycler, programmed to heat up to break the DNA strands apart and then cool down to begin each period of replication, you could just toss in your ingredients and let the thing run, and after a few hours, a handful of copies of the targeted piece of DNA would turn into millions.
Mullis hadn’t been suffering from delusions of grandeur when he dreamed of a Nobel Prize that night under the flowering buckeyes. He shared the Nobel in Chemistry in 1993. While he was in Stockholm for the awards ceremony, the Swedish police came to his hotel room, responding to reports of a red laser beam, like those used on rifle sights, coming from his window. Sweden has a low crime rate, but someone had been murdered in Stockholm a year or so earlier by a sniper using one of those laser sights, so that red beam was making people nervous. When the police questioned him, Mullis had to confess: he had been playing with a new laser pointer, shining it near passersby on the street below to see how they’d react. It was just Mullis being Mullis.
The transformation that PCR brought about took place very quickly and happened to coincide with the few years when I was actually getting my hands dirty (and irradiated) in a molecular lab. In 1987, as part of my dissertation research, I wanted to construct a phylogenetic tree of garter snakes, so I started working in the lab of an evolutionary biologist named Rick Harrison to get the necessary genetic data. For my first year or so in Rick’s lab, I was screwing around with something called restriction fragment analysis, a standard method at the time. The technique required a long procedure for isolating the DNA from snake mitochondria—my main memory of this step is being afraid I would destroy an extremely fast and expensive ultracentrifuge by improperly balancing the samples in it—and then using enzymes that would cut the DNA where particular sequences of bases (say, GAATTC or AAGCTT) appeared. What you ended up with, after several more steps, wasn’t anything like the full DNA sequence, but instead just bands on a gel that very roughly showed the length, in base pairs, of the cut-up pieces of DNA (called restriction fragments). The process was time-consuming and complex, and, worst of all, it often wasn’t clear how to interpret the results. I guess it was fun in a way; there was some satisfaction in working through an elaborate procedure and seeing those bands of actual DNA appear. It was a little like following a difficult recipe to make a nice soufflé. Like a soufflé, though, it was mostly empty; the technique just wasn’t going to provide enough information about the differences between species to construct a reasonable garter-snake tree.
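For readers who want to see why restriction fragments carry so little information, here is a rough Python sketch. It makes several simplifying assumptions that are not from the account above: the DNA is treated as a linear string (mitochondrial DNA is actually circular), the sequences are invented toy examples, and the enzyme is taken to cut one base into its recognition site. The function name `fragment_lengths` is hypothetical.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int = 1) -> list[int]:
    """Cut a linear DNA string at every occurrence of `site`; return fragment lengths.

    `cut_offset` is how far into the recognition site the cut falls
    (an assumed value, used here only for illustration).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


# Two invented sequences that differ at a single base inside the second GAATTC site.
species_a = "CCGAATTCAAAAGGGGAATTCTT"
species_b = "CCGAATTCAAAAGGGGAGTTCTT"

print(fragment_lengths(species_a, "GAATTC"))  # three fragments
print(fragment_lengths(species_b, "GAATTC"))  # one site lost, so only two fragments
```

All the gel reveals is the list of lengths. A single base change shows up only if it happens to create or destroy a cutting site, and every other difference between the two sequences goes unseen.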
Luckily for me, in 1985 the description of PCR had been published. Once Mullis and other scientists at Cetus had worked out all the major kinks, it was obvious that the method would be enormously valuable. Among other things, it would be used to diagnose infections and genetic diseases and to analyze tiny DNA samples from crime scenes, from bits of frozen mammoth and Neanderthal tissues, and from feathers of extinct birds lying in museum drawers. Eventually, it would make possible the sequencing of the entire human genome. I don’t remember thinking about any of that at the time. What I knew was that a group of scientists—in the first PCR paper, Mullis’s name was buried in the middle of a list of seven authors—had invented a method that would let someone like me, who had no desire to become immersed in molecular biology, obtain actual DNA sequences.
And it did. A bright fellow grad student named Ben Normark was the first one in the lab to learn PCR, and he taught me all I needed to know over a few days. Even the first step showed the advantage of the technique. Since PCR would selectively find the target sequence in a mitochondrial gene and multiply it, there was no need to separate the mitochondrial from the nuclear DNA.25 No more balancing samples in that damned ultracentrifuge. Now the first step was just isolating all the DNA, the mitochondrial and nuclear components mixed together, from each garter-snake tissue sample, which was a fairly trivial task. After that I’d load the DNA samples into little test tubes along with the Taq polymerase, a bunch of free-floating bases, and some other ingredients, put the tubes into the Perkin-Elmer Cetus thermal cycler, and a few hours later I’d have, relatively speaking, a gargantuan amount of the target stretch of DNA, enough to get the sequences. Even though I understood what was happening in the test tubes, it felt like magic.
Unlike grad students today, who just send their PCR samples to an automated sequencing facility, I had to do the DNA sequencing myself, and that was a pain. I especially remember some unhappy moments with the thin sequencing gels, which tended to fold up on themselves when you were transferring them onto cardboard-like filters; I tossed at least one recalcitrant gel into the trash in disgust, several days’ work down the drain. However, at the end of this messy process, if the gel was cooperative, I had a picture with bands in four rows representing the four bases, from which I could read the actual sequences of As, Gs, Cs, and Ts. This result was far more useful to me than the old restriction fragments. The differences between garter-snake species were completely unambiguous—one species would have an A at a particular site while another species would have a T—and, compared to the restriction fragment data, there were lots of differences, which meant more information that I could use to sort out the evolutionary relationships.
With each new set of sequences, I would run a “tree-finding” program on the lab computer (I didn’t have my own computer—this was really the dark ages), and out would come a network of dots and lines, the program’s best guess for the garter-snake tree. For instance, the program told me that the two slender ribbon snake species (a subset of the garter snakes) were each other’s closest relatives (an expected result), and that the highly aquatic Mexican black-bellied garter snake was part of a group of terrestrial species (somewhat surprising). I have to admit that, examining that tree now, it doesn’t look too good. Compared to the tree from a far more extensive sequencing study that some colleagues and I did more recently, also using PCR, my initial tree was quite flawed: it was wrong in a few places, and just not all that informative overall. But it was a beginning.
I ran my first PCR in 1989, four years after the method was first described in print. In hundreds of labs, the same thing was happening right about then—that’s how fast the technique took off. Suddenly, everyone seemed to be using PCR to “amplify” tiny amounts of DNA into quantities that could be sequenced. In the Harrison lab alone, there were people sequencing crickets, swallowtail butterflies, bighorn sheep, crabs, beetles, and various fishes, all because of PCR. Elsewhere, the work extended to countless other animals along with plants, fungi, protists, and bacteria.
This wasn’t the first big pulse of DNA sequencing work: molecular biologists had started obtaining such sequences routinely in the 1970s, when new sequencing methods were invented. However, those studies usually focused on species that were medically important or were for some reason especially suitable for investigating basic issues in molecular biology, such as how genes work. The study species tended to be things like Drosophila melanogaster (the common laboratory fruit fly), yeast, Escherichia coli, the house mouse, and Homo sapiens. Mullis’s invention obviously helped such research. For our purposes, though, the key point is that PCR made it possible for evolutionary biologists—people who were interested in the diversity of living things, but, typically, had limited inclinations to learn difficult molecular techniques—to get DNA sequences from their favorite groups, whether they happened to be crickets, garter snakes, or maple trees. Thus, what PCR produced was an explosion of this most fundamental and unambiguous kind of genetic data, taken from many branches throughout the tree of life.
At the time I wrote my dissertation, including a chapter on the garter-snake tree, molecular dating analyses definitely were not in vogue. This was almost thirty years after Zuckerkandl and Pauling had first described the molecular clock, and by then it was clear that the pace of genetic change could be quite different in different lineages; the clock didn’t tick at a constant rate. For instance, mammals had a fast clock, sharks an exceptionally slow one. Within mammals, the rodent clock ran especially fast. Because of this inconstancy, the clock idea was considered something of a dinosaur, an idea that had to be jettisoned once the data were in. It was a nice thought, but it didn’t pan out. For my dissertation, I never even considered using the garter-snake DNA sequences to place ages on the branching points in the tree. This attitude was about to change, though. The clock was about to make a comeback.
A BIG PILE OF BOTTLES
In the Nevada ghost town of Rhyolite, not far from a cartoonish sculpture of a miner and his penguin and a more detailed one of the robes but not the bodies of Jesus and the apostles in their Last Supper poses, sits a house made almost entirely of old bottles embedded in mortar. The house is no artistic masterpiece, but it’s striking for the sheer number of bottles involved, reportedly about 30,000 of them. It turns out that in the early 1900s, when Rhyolite was in a gold-mining boom, lumber was scarce in the town, but the place had more than fifty saloons, and thus no shortage of beer and whiskey vessels. I picture some of the townsfolk sitting around at that point, saying, “We’ve got no wood, but look at all these damned bottles. We could make a house out of these things.” I envision these would-be architects a bit less than sober, drinking beers and (somewhat) carefully piling up the empties. In any case, however it came about, the idea of a bottle house seemed reasonable enough that the folk of Rhyolite ended up building three of them, although only the one survives.
I imagine a similar beginning to the explosion of molecular dating studies. Not boomtown rats in the desert this time, but a group of evolutionary biologists sitting around in a brewpub, pint glasses in hand, saying, “Hey, look at all these damned DNA sequences. They’ve gotta be good for something. Maybe we should use them to revive that old molecular clock idea.” This isn’t really how it happened. As far as I know, there was no particular get-together that rekindled interest in the molecular clock. However, it does seem that what led to the explosion was this overwhelming resource, all of these DNA sequences from crickets and crawdads, guinea pigs and alligators, mushrooms and magnolias. Typically, the point of this sequencing had been to figure out how groups are related to each other: Are animals more closely related to plants, or to fungi? (Fungi, unequivocally.) Where do whales fit into the tree of mammals? (Right next to hippos.) But now that these sequences were available, sitting in databases like GenBank that anyone could access with the click of a mouse, the old question about time started insinuating itself into people’s minds. It was a pile of bottles too big to ignore.
For someone interested in molecular dating, the pile of sequence data, in addition to being big, was enticing in ways that other kinds of molecular data were not. For one thing, as mentioned above, sequence differences between species are very clear and countable. You can know that the cytochrome b gene in a black-necked garter snake differs at exactly 92 nucleotide positions from the same gene in a checkered garter snake. In contrast, some other measures of genetic distance between species are far muddier, relying on indirect and sometimes very imprecise estimates of DNA differences. For instance, consider DNA-DNA hybridization, which I remember as a cutting-edge technique in the early 1980s. This method involved mixing the genetic material from two species, say a finch and a crow, so that double-stranded DNA formed in which one strand was from one species and the matching strand was from the other species. You then heated up this “hybrid” DNA and recorded the temperature at which, say, 50 percent of it had broken apart into the original single strands from finch and crow. That temperature was a measure of genetic similarity between the two species, because the closer the base-pair match between the strands of DNA, the more strongly they stuck to each other and, therefore, the higher the temperature that was needed to jostle them apart. It was a measure, but it wasn’t a very precise measure. The breaking-apart temperature depended, for instance, on properties of the strands other than just the percentage of matching bases. Moreover, even with samples from the exact same specimens, the results could vary from lab to lab, or even from one trial to the next in the same lab, which was troubling. Some other measures of genetic distance have been similarly muddy. For evolutionary biologists thinking about how they might refine studies using the molecular clock, the perfect precision of DNA sequence differences was a refreshing change. It was a bit like going from a historical record based on hieroglyphics to one that uses a true written language with a large lexicon.
The discreteness of DNA sequence differences—an A in one species, a T in another—coupled with other properties of DNA, also made this kind of data a dream for biologists who wanted to use mathematical models to describe evolution, and this suitability for modeling would turn out to be critical for molecular dating. The modelers liked the discreteness of DNA because it let them treat the evolutionary process as straightforward probabilities—for instance, the probability that an A would switch to a T or a G or a C over some period of time. They were also drawn to DNA because the nucleotide changes could be fairly neatly categorized in ways that reflected differences in how evolution works. For instance, some mutations occur more frequently than others (A to G happens more than A to C or T, for example); some changes in the sequence switch the corresponding amino acids in the resulting protein and others do not; some parts of a gene code for especially critical sections of a protein for which almost any change in the amino acids would be harmful, while other parts code for less critical sections where some amino-acid changes are tolerated. The fact that you can divide up DNA changes into these, and many other, logical categories meant that the modelers had a lot to play with. In essence, DNA sequences gave them something to do that was tractable (a favorite word of modelers), but also satisfyingly complex.
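To give a flavor of what "treating evolution as probabilities" looks like, here is a minimal sketch in Python of one of the simplest models that separates the more frequent kind of change (A to G, which biologists call a transition) from the less frequent kind (A to C or T, called transversions). It uses the standard formulas of Kimura's two-parameter model. The function name and the rate values are made up for illustration and are not taken from any particular study.

```python
import math

def k2p_probabilities(alpha, beta, t):
    """Kimura two-parameter model for a single site.

    alpha: rate of the transition (e.g., A -> G), per unit time
    beta:  rate of EACH of the two transversions (e.g., A -> C, A -> T)
    t:     elapsed time, in the same units as the rates
    All parameter names and values here are illustrative assumptions.
    """
    e1 = math.exp(-4.0 * beta * t)
    e2 = math.exp(-2.0 * (alpha + beta) * t)
    return {
        # probability the site shows the same base after time t
        "same": 0.25 + 0.25 * e1 + 0.5 * e2,
        # probability it shows the transition partner (A <-> G or C <-> T)
        "transition": 0.25 + 0.25 * e1 - 0.5 * e2,
        # probability it shows one particular transversion base
        "each_transversion": 0.25 - 0.25 * e1,
    }

# Hypothetical rates: the transition is four times as fast as each transversion.
probs = k2p_probabilities(alpha=2.0, beta=0.5, t=0.1)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.4f}")
```

More elaborate models add further categories, such as different rates for different parts of a gene, but the basic idea of assigning a probability to each kind of change is the same.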
This modeling of DNA changes had actually started before the invention of PCR. What PCR did was provide an enormous number of sequences to which the models, growing more and more complex, could be applied. For molecular dating, the importance of the models was that they provided improved estimates of the actual amount of genetic change separating two species, or, more generally, the amount of change along the branches of the evolutionary tree. This may sound a little counterintuitive; you might think that the amount of evolutionary change separating two species is just the observed number of nucleotide differences between them, like those 92 differences in the cytochrome b gene between black-necked and checkered garter snakes. However, this is not necessarily the case. In particular, if nucleotide substitutions (the preferred term among biologists) are happening frequently enough, there will be some sites in the sequence that have switched more than once along the evolutionary path connecting two lineages. Changes will have piled on top of other changes, so to speak. For any one site, you can never know for sure just how many substitutions have occurred—if one species has an A and the other a T, that could mean that their ancestor had an A that changed to a T in one lineage (one substitution), or that the ancestor had an A that went to G then back to A in one lineage and to C and then to T in the other (four substitutions), or, theoretically, an infinite number of other possibilities. You would know there was at least one change, but you couldn’t rule out the possibility of so-called multiple hits at the site. It’s like running into a friend you haven’t seen for many years, who previously had brown hair but now has green hair, and wondering whether he went straight from brown to green or took some more circuitous hair-coloring route, such as brown to blonde to orange to pink to green.
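The piling up of changes is easy to see in a toy simulation. The Python sketch below starts from a random ancestral sequence, applies a fixed number of random substitutions along each of two lineages, and then compares the number of substitutions that actually happened with the number of differences you would observe between the two descendants. Everything in it (the sequence length, the number of substitutions, the assumption that every site and every kind of change is equally likely) is a simplification chosen for illustration, not a description of how real genes evolve.

```python
import random

BASES = "ACGT"

def evolve(seq, n_subs, rng):
    """Return a copy of seq after n_subs random substitutions.

    Sites are chosen at random with replacement, so the same site
    can be hit more than once (the 'multiple hits' problem).
    """
    seq = list(seq)
    for _ in range(n_subs):
        i = rng.randrange(len(seq))
        # switch to one of the three other bases
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)

def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def simulate(length=1000, subs_per_lineage=300, seed=1):
    """Return (actual substitutions, observed differences) for two lineages."""
    rng = random.Random(seed)
    ancestor = "".join(rng.choice(BASES) for _ in range(length))
    species1 = evolve(ancestor, subs_per_lineage, rng)
    species2 = evolve(ancestor, subs_per_lineage, rng)
    return 2 * subs_per_lineage, count_differences(species1, species2)

actual, observed = simulate()
print("substitutions that actually occurred:", actual)
print("differences observed between species:", observed)
```

The observed count can never exceed the actual count, because every differing site must have been hit at least once, and it falls further and further behind as the number of substitutions grows.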
A critical thing that the models do is make an educated guess—actually a calculation based on probabilities—at how many extra “hits” have occurred, changes that are hidden from direct observation, like your friend’s possible multiple hair-color switches. In many cases, especially when dealing with distantly related species, that number can be substantial; in fact, the inferred number of hidden substitutions for a given sequence can be larger than the observed number of differences between species. The models are almost certainly right about the existence of many hidden hits, which, in turn, means that they are giving much better estimates of the actual amount of change along branches in the evolutionary tree than a simple tally of the differences between species. And, almost certainly, the estimates are getting better and better as the models become more realistic, taking into account more of the actual complexity of evolution. All of this is desirable for molecular dating studies—it’s giving us a much better handle on half of the equation for relating genetic change to time.
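As a concrete illustration of that calculation, the Python sketch below applies the oldest and simplest correction for hidden hits, the Jukes-Cantor formula, which assumes that all sites and all kinds of change are equally likely. Real dating studies use far more elaborate models, so this is only a sketch of the idea. The 92 differences come from the garter snake example; the gene length of 1,140 sites is an assumed figure supplied for illustration, as are the function names.

```python
import math

def jukes_cantor_distance(n_diff, n_sites):
    """Estimated substitutions per site under the Jukes-Cantor model.

    n_diff:  observed number of differing sites
    n_sites: number of sites compared
    """
    p = n_diff / n_sites  # observed proportion of differing sites
    if p >= 0.75:
        # the formula is undefined once sequences look as different as random ones
        raise ValueError("sequences too divergent for this correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

def hidden_substitutions(n_diff, n_sites):
    """Estimated substitutions beyond those directly observed."""
    estimated_total = jukes_cantor_distance(n_diff, n_sites) * n_sites
    return estimated_total - n_diff

# 92 observed differences; the 1,140-site length is an assumption for illustration.
print("observed differences:", 92)
print("estimated hidden substitutions:", round(hidden_substitutions(92, 1140), 1))
```

For closely related species like these, the correction adds only a handful of substitutions. For a much more divergent pair, say 600 differences in 1,000 sites, the same formula implies more hidden substitutions than observed ones.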
Today, the Rhyolite bottle house sits locked and empty. Tourists, most of them on their way into or out of Death Valley, circle it, snapping photos under the baking desert sun. However, the house wasn’t built just as a curiosity; it actually functioned as a dwelling for a few years and, later, as a trinket shop. All those beer and whiskey bottles really were good for something (apart from holding beer and whiskey). The same might be said for the unexpected information that came out of the big pile of DNA sequences, the outcome of Kary Mullis’s insight (and Watson and Crick’s discovery of the structure of DNA, and Fred Sanger’s invention of an efficient DNA sequencing method, and the work of countless others). The pile of sequences and the evolutionary models that turned those sequences into amounts of genetic change generated a flood of molecular dating studies. For biogeography, these studies have been critical, providing evidence of timing, of what happened when. In my view, this evidence has been the key to extricating biogeography from the intellectual cul-de-sac created by the vicariance school.
However, as noted earlier, many scientists still have doubts about the validity of molecular clock analyses, and not all of these people are hard-core vicariance advocates. For instance, a friend of mine, an evolutionary biologist whom I think of as both moderate and reasonable, refers to such analyses succinctly as “bullshit.” As the foundation for a new view of biogeography, “bullshit” doesn’t really work. Before we go on, then, we need to confront the criticisms of molecular dating. We need to establish that, in the results of these analyses, we’re dealing with a functional bottle house, not a dangerous pile of broken glass.
22 At one time, the age of the universe estimated from its apparent rate of expansion was considerably younger than the estimated age of the Earth, indicating that something was seriously amiss with at least one of these estimates. Present knowledge suggests that, in fact, both estimates were too young at the time, but the estimate of the age of the universe was far too young.
23 The chronicle tends to be more difficult to reconstruct as one delves into the more distant past. However, problems of an uncertain chronicle also can arise for the very recent past, as in a criminal case in which the whereabouts of a person at a particular time are critical, but hard to establish. That sort of example illustrates that what is often important is the accurate placement of an event in time relative to other events.
24 The early molecular clock studies were based on amino-acid sequences in proteins rather than on base-pair sequences in DNA.
25 A complication is that many organisms, including humans, have nonfunctional sequences derived from mitochondrial DNA that have been incorporated into the nuclear genome. When that is the case, using PCR on DNA samples that include both the mitochondrial and nuclear genome will often generate sequences from both the targeted mitochondrial DNA and the untargeted nuclear copies corresponding to the same segment. If, as is often true, the nuclear copies are quite different from the original mitochondrial DNA, this phenomenon can strongly distort the results.