Ancient Genetics to Ancient Genomics
NEXT-GENERATION SEQUENCING
In 2005, 454 Life Sciences Corporation—a biotechnology company in Branford, Connecticut—announced the innovation of next-generation sequencing (NGS).1 Created in part by Jonathan Rothberg, the founder of the company, along with several dozen researchers, NGS was the result of years of work dedicated to developing a more time-efficient and cost-effective DNA sequencing technology. The rise of large-scale sequencing projects, including whole-genome sequencing, and the need for a machine that could automate the workload motivated their pursuit. The Human Genome Project, which was initiated and completed just before the advent of NGS, offers a case in point. That project, begun in 1990 and finished in 2003, was an unprecedented international endeavor involving thousands of scientists, over a decade of research, and just under $3 billion to determine the order of nearly 3 billion base pairs of DNA that together make up the human genome.
In contrast, NGS made whole genome sequencing much easier by dramatically increasing the speed of production while decreasing the total cost. Indeed, the beauty of this state-of-the-art technology was in its unmatched throughput. NGS could produce close to one billion sequences in a single run over the course of just a few days. With this technology, it was possible to efficiently sequence an entire genome, a process that includes identifying all the DNA that makes up an organism and determining the exact order of all the bases or letters (A, G, T, and C) in each strand of DNA. In 2007, for example, scientists sequenced the genome of James Watson—one of the co-discoverers of the helical structure of DNA—in a mere two months and for less than $1 million, “a 1,000-fold improvement over the cost of the decade-long Human Genome Project.”2 Practitioners have called NGS nothing less than a “paradigm shift” into a new era of technological innovation and scientific possibility.3
NGS has been used as a broad term to characterize a number of high-throughput sequencing technologies, a variety of machines that use parallel platforms to sequence more than 1 million short reads of DNA (50–400 base pairs) at one time. The technology has had a profound impact on the field of genomics more generally, revolutionizing both the scale and scope of research that is possible. Although NGS was not developed to aid the search for DNA from fossils, researchers quickly recognized its benefits and potential to transform their own work. There were several platforms available (varying in chemistry and technology), but two instruments in particular became widely used in ancient DNA research during this period: Roche (454) GS FLX and Illumina (Solexa) Genome Analyzer.4 Because ancient DNA is often damaged and fragmented, researchers have been constrained by short DNA sequences. However, NGS favors short sequences. Thus, what was once a disadvantage for ancient DNA research has now become an advantage. Overall, the technology has enabled practitioners to generate a much higher quantity and quality of data in a fraction of the time and cost when compared to previous sequencing approaches.
Most obviously, the difference between the pre-NGS and post-NGS era was the ability to generate a handful of sequences as opposed to billions of sequences. Previously, the study of ancient DNA, because of its degraded and damaged state, had been limited to the study of mitochondrial DNA and occasionally nuclear DNA. Mitochondrial DNA is the most accessible form because of its abundance in animal and plant cells, which means there is a higher likelihood that at least some genetic material would be preserved and could then be extracted. Mitochondrial DNA, inherited from the maternal line, is informative but offers only a partial picture of an organism’s genetic history. However, mitochondrial DNA in combination with nuclear DNA, inherited from both parents, offers a more complete picture. With NGS, it became possible to easily sequence any and all DNA from a sample, thus making it theoretically possible to recover the entire genome of long-dead creatures at nearly the same rate and cost as those of living species.
Two separate studies—issued just six months apart—exemplified the drastic impact that NGS could have on the field of ancient DNA research in particular. In the first study, published in 2005 by Science, researchers recovered nearly 27,000 base pairs of ancient genomic data from two 40,000-year-old cave bears. The study—by James P. Noonan and Eddy Rubin (both at the U.S. Department of Energy Joint Genome Institute and Lawrence Berkeley National Laboratory in California) and conducted in collaboration with colleagues including Michael Hofreiter and Svante Pääbo at the MPIEVA in Leipzig—used a direct-cloning technique that did not involve the traditional PCR amplification of targeted sequences as had previously been common in the field of ancient DNA research. The technique they developed was different and intended to circumvent issues associated with PCR while generating a higher quantity and quality of DNA from their specimen of study. With this technique, they were able to sequence more than just the genome of the ancient cave bears. They were able to sequence its metagenome, meaning the collective genomic sequences of all the organisms associated with a single sample. Here, the metagenome was a mixture of the ancient DNA sequences of the organism of interest, in this instance the cave bear, plus any DNA sequences from other organisms and the external environment that had come in contact with the specimen. At the time of publication in 2005, their results represented the largest data set of ancient DNA sequences from an extinct species, and the study overall showcased the potential to access the entire genome of ancient organisms.5
This feat, however, was overshadowed less than a year later. In a second study, also published by Science, Hendrik Poinar—recently appointed head of the Ancient DNA Centre at McMaster University in Ontario, Canada—and colleagues capitalized on the recent availability and advantages of NGS to successfully sequence 13 million base pairs of ancient genomic data from a 28,000-year-old woolly mammoth. To generate this amount of data, the team used a technique referred to as shotgun sequencing. In the pre-NGS era, PCR and Sanger sequencing were used to target a specific DNA sequence. In contrast, NGS (in combination with shotgun sequencing) made it possible to sequence all the available DNA in a sample. In this process, DNA is randomly broken into numerous short overlapping strands, then cloned, sequenced, and finally reassembled. In this case, when the sequences were reassembled, scientists found that the mammoth remains contained much more than just mammoth DNA. Scientists sequenced a total of 28 million base pairs of DNA of which 13 million base pairs were identified as authentically ancient mammoth DNA. The remaining data, approximately 15 million base pairs, was environmental, bacterial, or unidentified DNA.6 The difference between the two papers in terms of data output was clear. The recovery of 13 million base pairs of genomic data from the woolly mammoth, compared with the approximately 27,000 base pairs of genomic data from the extinct cave bear, was an impressive 480 times increase in yield.7 “The change was massive,” said one scientist, “absolutely massive” (Interviewee 15).
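The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal Python sketch that recomputes the non-mammoth remainder, the share of authentic mammoth DNA, and the yield increase over the cave bear study. The variable names are illustrative, and the inputs are the rounded values given in the text, so the results are approximate.

```python
# Rounded figures taken from the text; variable names are illustrative.
cave_bear_bp = 27_000              # cave bear study (2005)
mammoth_total_bp = 28_000_000      # all DNA sequenced from the mammoth sample
mammoth_authentic_bp = 13_000_000  # identified as authentic mammoth DNA

# DNA that was environmental, bacterial, or unidentified.
other_bp = mammoth_total_bp - mammoth_authentic_bp

# Share of the sequenced DNA that was authentically mammoth.
authentic_share = mammoth_authentic_bp / mammoth_total_bp

# Yield of the mammoth study relative to the cave bear study.
fold_increase = mammoth_authentic_bp / cave_bear_bp

print(f"Non-mammoth DNA: {other_bp:,} bp")
print(f"Authentic share: {authentic_share:.1%}")
print(f"Yield increase: about {fold_increase:.0f}-fold")
```

With these rounded inputs, a little under half of the sequenced DNA was mammoth, and the ratio between the two studies comes out at roughly 480, in line with the figure given in the text.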
In a review paper, Alan Cooper—formerly at Oxford and at the time recently appointed to the Australian Centre for Ancient DNA at the University of Adelaide—emphasized how NGS could affect the field. To make his point, Cooper spotlighted three near back-to-back articles, published within six weeks of each other, all claiming the independent and near complete sequencing of the mammoth genome. For Cooper, this series of studies conveniently captured what he saw as the past, present, and future of ancient DNA research in regard to its technological potential.8 Although each study accomplished similar achievements, they did so through distinctly different techniques. The first paper, by geneticist Evgeny I. Rogaev and colleagues, used PCR, while the second paper, by Johannes Krause and co-authors, used a multiplexing method, a variation of PCR that simultaneously amplifies multiple targets, as opposed to just one.9 The third was, of course, Poinar and colleagues’ landmark paper reporting the recovery of the mammoth metagenome—nearly 28 million base pairs of genomic sequences—in just one experiment by using the new high-throughput sequencing technology developed by 454 Life Sciences.10 For Cooper, the first two represented the past and present state of the field, respectively, while the third provided the possibility for a technical transition into the future. Collectively, these papers and the diverse methods employed provided a conceptual and technological snapshot of the discipline’s history and conceivable future. “This is an exciting time,” wrote Cooper, “as the opportunities by the new parallel sequencing system will allow researchers to contemplate large-scale studies of ancient genomes, and promise to finally release the full potential of a[ncient] DNA to reveal evolution in action.”11 Given the potential, a number of researchers in the ancient DNA community eagerly embraced the new technology.
For this community specifically, NGS offered an opportunity to overcome some of their most persistent technological challenges, namely the low quantity of data and the problem of contamination. To be clear, NGS did not remove the possibility of contamination, but it did reframe the problem. Ancient samples indeed had contaminating sequences, but practitioners were able to calculate the amount of contamination, permitting them to increase confidence in DNA authenticity. They did this by reading the sequence data and looking for molecular signatures of chemical degradation, usually postmortem damage (changes that occurred in the organism after death) that were characteristic of authentically ancient DNA. Further sophisticated computational techniques were also needed to estimate the amount of contamination. The availability of more data thanks to NGS, combined with scientists’ ability to recognize and analyze patterns indicative of DNA degradation, made it possible for researchers to determine which sequences were from the organism itself and which belonged to the external environment. Although lab and specimen handling protocols remained stringent (in order to prevent further unnecessary contamination), ancient DNA researchers could, in a sense, rest in the fact that they could actually estimate amounts of contamination. “So, now it’s not only a question of having controls,” explained an interviewee. “You can actually look at your data and determine whether you have a contamination problem or not, right?” (Interviewee 7). According to a second scientist, the somewhat newfound freedom from contamination concerns was of great consequence for the research community: “I remember fighting at conferences [about] if these sequences [were] feasible or not; if it was contamination or not.” But with NGS, the focus and fighting over results had changed: “This is not really an issue anymore because people have contamination, but they calculate it away. [Laughs]” (Interviewee 13).
The newfound ability to more easily differentiate between authentically ancient DNA and contemporary contaminating DNA was a major advantage of NGS. Indeed, the community’s near decade-long debate over the cause of the Black Death, as evidenced through ancient genetic data, was one of the most salient examples of how NGS would change the discipline in terms of contamination concerns. The Black Death—one of the most destructive pandemics in human history—killed millions of people across Europe in the mid-fourteenth century. Although there was much historical speculation about its biological cause, definitive evidence had yet to be found. In 2000, a team of scientists—including Didier Raoult and Michel Drancourt from the University of the Mediterranean in Marseille, France—tried to answer the question with ancient DNA. The team sampled human remains from a mass burial site in the south of France. From these remains, they recovered DNA sequences for a particular bacterium, Yersinia pestis. According to their paper, published in the Proceedings of the National Academy of Sciences, they had found the cause of the Black Death, ultimately solving a six-hundred-year-old mystery.12
At this time, however, five years before the invention of NGS, the field was still at the height of controversy regarding contamination. It was not long until another team challenged Raoult and colleagues’ conclusions. A group led by Thomas Gilbert—a postdoctoral researcher working with Cooper, who was then at Oxford—attempted to extract and identify the bacteria from more than a hundred samples taken from mass burial pits across Europe dating to the plague. Despite their seemingly comprehensive sampling, they failed to replicate positive results.13 One researcher recalled the debate that ensued: “So, you had Didier saying, ‘We found it!’ And then Tom would say, ‘You didn’t find it!’ ‘We found it!’ ‘You didn’t find it!’ And there’s probably ten years of publications going back and forth about this” (Interviewee 27).
Several years later, another group—independent of either side of the debate—gathered its own samples to be sequenced using the newfound availability of NGS.14 Kirsten Bos, a doctoral student supervised by Hendrik Poinar in Ontario, collaborated with colleagues to sample DNA from teeth found in a plague pit in central London. In combination with NGS, they applied a specialty technique—targeted capture—that allowed them to pinpoint sequences of interest while leaving behind other genetic material. Krause, another author on the publication and now a professor at the University of Tübingen, had learned this technique while a graduate student in Pääbo’s lab in Leipzig. Not only did they obtain evidence for the bacteria, Yersinia pestis, but they were able to sequence the genome. In doing so, their study settled the dispute: Yersinia pestis was the cause of the Black Death. One researcher remembered the mark this case left on the field: “Once they had the whole genome then there was no question, right? It completely ended the debate. . . . It dropped like a bomb on the community—like a huge bomb. [Laughs]” For scientists at the time, it was a pivotal moment. “It was also one of the earliest demonstrations of [how] next-generation sequencing is going to completely change this game,” recalled this researcher. “We’re in a different era, and it just shut that whole thing down” (Interviewee 27).
However, in the years between Cooper and Poinar’s article in 2000 and the innovation of NGS in 2005, ancient DNA researchers were not waiting idly for the next best technology to come along. A handful of practitioners were persistent in their search for DNA from fossils, seeking—despite the odds—to find more efficient approaches to recovering greater quantities of and better quality genetic material from ancient specimens.15 Researchers realized that a handful of sequences from one or two specimens was of limited value. They needed more fossils, more DNA. They needed confidence of authenticity too. This shift signified a growing awareness among researchers that if the search for DNA from fossils was to become relevant to the evolutionary biology community, they would need to answer questions on the population level, not just the individual level. In an effort to achieve this, some scientists set out to recover and analyze DNA from a wide range of ancient samples.
One of the earliest examples exhibiting a move in this direction was a paper by Jennifer A. Leonard, Robert K. Wayne, and Alan Cooper in which they successfully obtained sequences from seven permafrost-preserved brown bear specimens dating back to the Ice Age.16 With this data, they demonstrated that the present distribution of brown bears in terms of demography and geography was distinctly genetically different from its past. Ancient DNA showed a side of the story that was indiscernible with inferences from modern genetic material alone. One scholar observed, “The big change in terms of moving into population genetics—away from phylogenetics—[was] to provide population genetics with a time scale which it had never had before.” According to this interviewee, the paper presented a “big conceptual breakthrough” (Interviewee 32). Another early example of this transition was a study led by Beth Shapiro, a doctoral researcher with Cooper at Oxford.17 This study was conceptually and technologically important for its large number of samples, use of statistical demographic modeling, and conclusions that contested previous assumptions about bison evolution and extinction. Overall, this study—among others similar to it—demonstrated the potential to obtain larger amounts of ancient genetic data in order to test hypotheses about the evolution and extinction of past populations, as well as its impact on conservation biology and understandings of climate change.18
NGS was in part so important to the field because it presented practitioners with a chance to much more easily and efficiently transform the search for DNA from fossils into a credible practice capable of addressing bigger questions within evolutionary biology. Given its advantages, NGS surpassed Sanger sequencing, the primary sequencing technology since the late 1970s, and it soon overshadowed PCR, the once state-of-the-art technique that sparked the field of ancient DNA research into existence but now held it back in light of technical limitations. “I used to joke that I was a retired ancient DNA researcher,” said one scientist, “but then the big game changer, without a shadow of a doubt, has been ultra-high-throughput sequencing or next-generation sequencing. And it has completely rescued the field.” Comparing the eras of PCR versus NGS, this same scientist emphasized the variance in data output between the technological paradigms: “PCR allowed ancient genes, NGS has allowed ancient genomes” (Interviewee 21).
NEANDERTHAL GENOME
In July 2006—shortly after the introduction of NGS and its initial application to a number of ancient DNA studies—Pääbo with the MPIEVA, along with 454 Life Sciences, announced they would be the first to attempt to sequence the entire Neanderthal genome. They planned to sequence the first genome of our extinct and archaic ancestor, and they would do it in just two years’ time. All of this was announced for the first time through a highly orchestrated press conference and press release.19 In his memoir, Pääbo remembered the press conference as an “electrifying event”; the “room was full of journalists,” and “media from across the globe” tuned in online.20 It seemed that everyone wanted to hear what scientists and the public had once thought impossible. From the outset, the Neanderthal Genome Project was a substantial media production. Furthermore, what was broadcast as a big event was going to be an even bigger effort. The fact that an accomplished yet careful and conservative practitioner such as Pääbo would both initiate and publicly advertise a venture of this magnitude was evidence of his confidence in NGS to help them deliver this extraordinary achievement.
The thrill of such an unprecedented endeavor was followed by the stress of work to be done in a short time span. Pääbo was acutely aware of the pressure he had placed on himself and his lab. “Now I had really stuck my neck out,” he noted in his memoir, “publicly promising to sequence the Neanderthal genome.” Indeed, the stakes were high: “If we succeeded, it would clearly be my biggest achievement to date; but if we failed, it would be a very public embarrassment, almost surely a career-ending one.” He admitted that succeeding would not be as easy as he made it sound. Indeed, a mere two months before the project’s official announcement, Pääbo had presented his plans to fellow scientists at the Cold Spring Harbor Laboratory’s Annual Symposium on Genome Biology. At that time, he and colleagues had just sequenced nearly 1 million base pairs of Neanderthal DNA. But a million base pairs, although a real feat, was far from what they needed to eventually reconstruct the entire genome, which would include almost 3 billion base pairs. The present data represented only about 0.03 percent of the whole genome.21 Nonetheless, he claimed that it could, and would, be done. As far as Pääbo was concerned, in principle it was possible.
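The scale of the gap can be checked with simple arithmetic. The minimal Python sketch below uses the rounded figures from the text (about 1 million base pairs sequenced, about 3 billion in the whole genome); the variable names are illustrative.

```python
# Rounded figures taken from the text; variable names are illustrative.
sequenced_bp = 1_000_000    # Neanderthal DNA sequenced at the time
genome_bp = 3_000_000_000   # approximate size of the whole genome

# Portion of the genome covered, as a fraction and as a percentage.
fraction = sequenced_bp / genome_bp
percent = fraction * 100

# How many times more data a single pass over the genome would need.
remaining_fold = genome_bp / sequenced_bp

print(f"Fraction of genome: {fraction:.4f}")
print(f"As a percentage: {percent:.2f}%")
print(f"Data still needed: about {remaining_fold:,.0f} times more")
```

Expressed as a fraction, the data in hand covered roughly 0.0003 of the genome, which is about 0.03 percent; a single pass over the whole genome would have required some 3,000 times more sequence.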
Pääbo’s colleagues and collaborators felt the gravity of the situation too. Not only was the project a technical and financial challenge but the media attention magnified the pressure to perform and perform well. “The pressure we had,” explained one interviewee involved with the project, “was a self-inflicted pressure that Svante had created by announcing that we would publish the genome in two . . . years or something crazy.” Pääbo’s lab was a powerhouse institution in the field, but even with its technical expertise and financial access, it was underequipped to reach the goal. “We didn’t even have the material to do it,” recalled the same scientist. When the project was proposed, its attainment really rested on the idea that there would be technical and methodological improvements, hopefully sooner than later (Interviewee 12). They needed more money, more machines, well-developed techniques, and, most importantly, well-preserved fossils with Neanderthal DNA.
The Neanderthal Genome Project, although a unique effort on its own, was not an isolated idea. Rather, it was the product of major technological advances coupled with widespread interest in whole genome sequencing projects.22 The Human Genome Project, for example, was a herculean effort requiring an exceptional amount of talent, money, technology, and resources. Leading up to its launch and throughout its duration, the project was advertised by scientists, reporters, and politicians alike as a “Holy Grail” for understanding life itself.23 Additionally, the Neanderthal Genome Project was also the product of various scientific, conceptual, and technical developments seeking to study the evolution and extinction of Neanderthals through DNA.24 After Pääbo’s lab at the University of Munich and Mark Stoneking’s lab at Pennsylvania State University were the first to successfully sequence Neanderthal DNA, they explained how they found no evidence that Neanderthals and our ancient human ancestors had interbred thousands of years ago—but they also noted that this conclusion could not be definitively determined by mitochondrial DNA alone; they needed genomic data and lots of it.25
NGS offered an opportunity to sequence a higher quantity of Neanderthal DNA with the possibility of yielding better quality data too at a much lower cost in terms of time and fossil material. Initially, Pääbo and Eddy Rubin—a biophysicist turned geneticist at Berkeley—agreed to collaborate since they had recently worked together on the sequencing of several thousand base pairs of ancient genomic data from forty-thousand-year-old fossil cave bear remains. For this collaboration, however, they would be working on a fossil species that was much rarer. The MPIEVA sent Berkeley an extract from a 38,000-year-old Neanderthal fossil that came from the Vindija Cave in northern Croatia with the intention that their respective labs would attempt to sequence authentic Neanderthal DNA. From the outset, however, Pääbo and Rubin disagreed on exactly how they would go about extracting and sequencing the DNA. Rubin was set on indirect sequencing via recent advances in traditional bacterial cloning methods, while Pääbo insisted on direct sequencing via NGS, arguing that this approach would permit them to sequence more DNA using less fossil material. At first, they agreed to disagree. Rubin’s lab employed its indirect sequencing approach, recovering 36,000 base pairs of Neanderthal DNA. Meanwhile, Pääbo’s lab used its direct sequencing approach with NGS and recovered nearly 750,000 base pairs of Neanderthal DNA.26 Neither lab had yet sequenced the whole genome, but they succeeded in recovering a partial draft and planned to publish the results in anticipation of more to come.
Soon, however, Rubin’s and Pääbo’s differences in their approaches turned to discord. Not only had Rubin and Pääbo implemented very different methods but their respective methods resulted in a sizeable difference in the amount of data obtained. It became clear to both that they would have to publish separately.27 On November 16, 2006, Nature published the MPIEVA’s research led by Richard E. Green—a bioinformatician and recent postdoctoral researcher in Pääbo’s lab. The next day, Science published Berkeley’s findings, led by James P. Noonan in Rubin’s lab. As back-to-back publications in world-renowned journals, the disparities between the papers were obvious. Not only did each group use different methods but they also arrived at clearly different conclusions. Data from Rubin’s lab provided no evidence for the genetic contribution of Neanderthal DNA to modern humans. Conversely, results from Pääbo’s lab suggested a significant amount of admixture between the two. “The conclusions of the studies are pretty much completely opposite,” recalled a researcher. “One of them says there’s no mixing with modern humans, one says there’s a lot of mixing with modern humans. And the weird thing is they both analyzed the same bone. So, it wasn’t even two different Neanderthals” (Interviewee 6). Soon after publication, a team of different researchers—independent of either lab—reanalyzed both data sets in light of their contrasting conclusions. In the end, they found evidence of contamination in the Neanderthal DNA sequenced in Pääbo’s lab. Specifically, they found evidence of modern human DNA, which explained why that lab’s conclusions supported admixture between Neanderthal and modern humans, thus conflicting with the data and conclusions from Rubin’s lab.28
According to Pääbo’s memoir, he and his lab in Leipzig had worried about contamination in their own findings, so much that they considered rewriting, even retracting, their paper awaiting publication in Nature. To be sure of their findings, they sent their data to Rubin’s lab for comparison. Indeed, Rubin’s lab confirmed that Pääbo’s lab had a level of contamination in their results based on differences in sequences. As far as Pääbo was concerned, the differences could be from bacterial contamination or even the result of genetic mutations. In response, Pääbo’s lab frantically sequenced and analyzed the results again and was able to measure the likelihood of contamination by comparing a set of Neanderthal DNA sequences with those of modern humans. Based on the fragments they reanalyzed, they determined that they indeed had recovered authentic sequences from the Neanderthal specimen under study and that the level of contamination was low. Accordingly, they reasoned that the differences in sequences their lab, as well as Rubin’s lab, had seen were perhaps the result of other unknown factors and not direct evidence of contamination. Thus, Pääbo’s lab decided to publish anyway.29 They would get these new results out and analyze the anomalies later.30
The community of ancient DNA researchers heavily debated the publications and their differing conclusions. Not only did one of the publications appear to suffer from problems of contamination but the lead on this study was Svante Pääbo, a symbol of conservatism regarding the risk of contamination in ancient DNA studies. Over the years, he had made a name for himself and his lab against contamination. Now it appeared that he and his lab had published results with knowledge of contamination, or at least knowledge of possible contamination. “There are rumors even that he submitted it knowing it was contaminated,” recalled one researcher. “And one of the supporting statements for that is the fact that Eddy Rubin is not on Svante’s paper but that Svante is on Eddy’s paper, suggesting that Eddy withdrew himself from Svante’s paper because he knew there was something wrong with it. . . . But the interesting thing is that Svante never published an errata on that and it’s kind of weird given it’s the standard behavior” (Interviewee 6).
Pääbo himself noted the tension, especially in his collaboration with Rubin. Pääbo and Rubin had previously disagreed on the approaches they would use to sequence the Neanderthal genome, and after they published their papers, it became quite clear (at least to Pääbo) that if Rubin was not going to collaborate with him toward the Neanderthal genome, he would compete for it. According to Pääbo, Rubin was after the same Neanderthal bones, from the same individuals or institutions that they had both worked with together for years.31 In his memoir, Pääbo quoted Rubin in an interview with a media reporter from Wired, which had Rubin saying: “I need to get more bone. . . . I’ll go to Russia with a pillowcase and an envelope full of euros and meet with guys who have big shoulder pads. Whatever it takes.”32 Motivated by fear that Rubin would publish the entire genome first, Pääbo impressed on his lab the need to complete the project as soon as possible.33
In 2010, more than a decade after the first Neanderthal sequences were recovered and four years following the initial announcement of the Neanderthal Genome Project, the MPIEVA at long last published a first complete draft of the Neanderthal genome.34 They were first to the finish line. The project, conducted by over fifty scientists at a cost of approximately 5 million Euros, successfully sequenced more than 4 billion base pairs of Neanderthal DNA obtained from three different individuals.35 What made the project and the paper so impressive, however, was the scientists’ analysis of the data and their interpretation of its implications for understanding human evolution. Pääbo enlisted David Reich, a population geneticist from Harvard University, to help make sense of all the data. Indeed, Reich played a lead role in the project’s overall success. Data alone would not be enough; the tools for its analysis were a critical component of the project. It was the combination of genomic data generated by Pääbo’s lab and statistical methods developed by Reich’s lab that allowed them to detect signals of admixture between humans and Neanderthals. In other words, the data analysis suggested clear and extensive evidence that early humans had interbred with their archaic ancestors the Neanderthals before they went extinct nearly forty thousand years ago.
Crucially, the evidence for admixture seemed to suggest that Neanderthals only interbred with a particular human population, those early peoples who had traveled out of Africa into Europe. By comparing the Neanderthal genome with modern human genomes across the world, scientists determined that Neanderthals shared more similarities with present-day non-African populations than with present-day African populations. Neanderthal DNA made up a small percentage (1–4%) of the genomes of a specific population (present-day Eurasians). In other words, humans today of European or Asian descent, but not African descent, have bits of Neanderthal DNA in their own DNA. “The next time you’re tempted to call someone a Neanderthal,” reported National Geographic, “you might want to take a look in the mirror.”36 Sure enough, some humans may have more in common with our extinct Neanderthal cousins than previously imagined.
Pääbo expected the Neanderthal Genome Project and the announcement of its findings to have a significant impact on the archeological and anthropological communities, but he claimed, however naively, not to have anticipated the public reaction. Indeed, the surprising conclusion about interbreeding with Neanderthals generated a massive amount of media attention. According to Pääbo’s memoir, his paper published in Science, for example, attracted attention from the creationist community, a conservative fundamentalist religious group in the United States, who reinterpreted the results as evidence in favor of their own views about Neanderthals’ relations to humans and creation.37 Several random women wrote to Pääbo speculating that their own husbands were in fact living, breathing Neanderthals in the modern age. Playboy even spotlighted the research in a four-page spread titled, “Neanderthal Love: Would You Sleep with This Woman?”38 These reactions were hardly surprising given the enduring public interest in the science of ancient DNA research and the study of human evolution. The Neanderthal Genome Project was packaged, pitched, and even intentionally pursued within this context, and with an awareness of its scientific significance, as well as its news value.
GENOME REVOLUTION
The search for ancient genomes, facilitated by the technical convenience of NGS and its ability to produce genomic data quickly and relatively cheaply, has ushered in a race among researchers to be the first to sequence whole genomes from a variety of specimens including ancient plants, animals, and diseases. Some researchers have also set their sights on recovering genomic information from ancient humans including Paleo-Eskimos, Aboriginal Australians, and famous historical figures like King Richard III.39 In the search for ancient human genomes, scientists have used this data to shed light on the behavior of our early ancestors, including Mesolithic and Neolithic hunter-gatherers, while also exploring transformations in human cultural practices such as milk consumption, which have directly impacted our evolution in terms of selection for lactase persistence.40 Much work has examined our interactions with animals through time by interrogating genetic signals for domestication in pigs, cattle, and dogs on large global and temporal scales.41 Using NGS, scientists are seeking to reach farther back in time to learn more about our extinct and archaic ancestor the Neanderthal. With this genome-wide data, scientists were able to estimate the extent to which early humans and Neanderthals had interbred before the latter’s extinction.42 Adding to the excitement, practitioners sequenced the first genomic data from a Denisovan, a formerly unknown extinct hominin species whose identity as a distinct archaic human species was uniquely obtained from DNA extracted from a small finger bone, as no extensive fossil record exists.43 According to scientists and journalists, these works—among others—represent a revolution in our understanding of human history in terms of our origin, evolution, and migration across the globe.44
For some scientists, this adaptation of high-throughput sequencing technologies to the search for DNA from fossils, and the massive amount of data that could be produced from it, as well as the resulting grandiose conclusions that researchers could draw from it, suggested an overcoming of previous limitations and a maturation of the field. At the same time, however, the field seemed to be coming full circle, back to an era of exploration and hype. This race for the first or the oldest genome (as well as the race to sequence the most genomes), and the accompanying media attention surrounding these high-profile publications in Nature and Science, shared striking similarities to the search in the 1990s for the first or oldest DNA. In one way, this hype took form through scientists’ newfound confidence not only in the technology of NGS to generate a higher quality and quantity of genomic data from ancient specimens but also in their ability to overcome previous contamination concerns. In another way, hype also took form through scientists’—as well as media reporters’—projections that the field had come of age and into a new role as an authority on human evolutionary history as told through DNA. As ancient DNA researchers explored the potential afforded by next-generation sequencing technologies, these two kinds of hype colored the direction of the discipline and the public’s perception of it.
Although the introduction of NGS to ancient DNA research did not wholly sweep away the criteria of authenticity as it related to PCR and the decades of debate around the issue, its incorporation into the field did bring a fundamental restructuring of the practice regarding the research questions scientists could ask and the types of resources needed to answer them. There were three specific ways in which NGS altered the nature of the search for DNA from fossils. First, its obvious utility, and scientists’ ability to adapt it to their unique pursuit for DNA from fossils, changed the field in terms of scale and scope of data production.45 High-throughput sequencing technologies could produce an astounding amount of genomic information that required both large amounts of data storage and newfound skills in data analysis. “Processing is completely different because before I could still look at each sequence by eye and edit them by hand, but now we have . . . billions of sequences and you have to do everything by bioinformatics,” said an evolutionary biologist. “So, that has changed completely” (Interviewee 15).
Second, this massive increase in data required researchers to learn or seek specialized mathematical, statistical, and computational skills in order to analyze the data and answer questions about evolutionary history. According to another evolutionary biologist, “It’s the people who are going to analyze it all that are going to end up with all the work and all the fame and fortune” (Interviewee 25).
Finally, all of this changed the conversation around contamination. Indeed, this shift from ancient genetics to ancient genomics moved the debate from one about data contamination to a focus on data production: “At the moment, we are not discussing the authenticity of the results much anymore,” confided one paleogeneticist. “At the moment, we are rather discussing the correct filters that you have to apply to your data set and how to handle these huge amounts of data” (Interviewee 13). According to practitioners, the increased ability to sequence genomes rapidly superseded their aptitude to analyze the data. It even sometimes superseded the questions that they could ask of the data. One interviewee, for example, observed, “People are going over the top because they can—just sequencing the living crap out of absolutely everything. So, we’re in this kind of exploration phase again, where it’s like, ‘Grab as much data as you possibly can, hire a great bioinformaticist, and then start asking questions in the resulting data sets’ ” (Interviewee 22). Together, these changes suggested that scientists found themselves facing a new phase of exploration.
However, the transition from the PCR to the NGS era was not easy. Doing so required both extensive expertise in genetics and bioinformatics and substantial financial resources for sequencing equipment. Even scientists who made the move felt the difficulty in doing so. “It took us a few years, and we’re a genetics department,” an interviewee explained. “Whereas if someone is in an anthropology or archaeology department, it’s quite a different story. It’s become, I think, impossible for somebody to transfer from an archaeology or anthropology discipline to this field” (Interviewee 21). As a result, many labs were left behind while others forged ahead. “The kits are expensive, the primers are expensive, and it’s all very new,” remarked an archeogeneticist. “It was really scary to a lot of labs, and a lot of labs haven’t made that transition because it’s expensive and it involves the development of a completely new tool set” (Interviewee 27). A lab’s decision to transition to NGS-based methodologies was a serious commitment because it was a big intellectual and financial risk.
Nonetheless, some labs made the move successfully, and a select handful made it to the top. Svante Pääbo’s lab at the MPIEVA in Leipzig has been one of them. Eske Willerslev, a former postdoctoral researcher with Alan Cooper, at the University of Copenhagen in Denmark is another. In Copenhagen, research has been further bolstered through the work of two intensely productive labs, led respectively by Thomas Gilbert and Ludovic Orlando. Together with Willerslev’s lab they make up the Center for GeoGenetics. Indeed, the center’s research output, coupled with Willerslev’s media-savvy personality, has made these labs internationally famous. Additionally, David Reich—geneticist and collaborator with Pääbo on the Neanderthal Genome Project—developed his own ancient DNA lab at Harvard University in Cambridge, Massachusetts. Although a much more recent recruit, he quickly became a powerful researcher, even competitor, in the field. “Some labs have struck way ahead,” one interviewee said. “You know who they are. They’re Leipzig, Copenhagen, and Harvard. They’re the big productive labs” (Interviewee 21). More recently, Johannes Krause—a former doctoral student of Pääbo’s—has emerged as a researcher at the forefront of the field from his recently appointed position as director of archaeogenetics at the Max Planck Institute for the Science of Human History in Jena, Germany.
A shared feature among these labs was the serious financial and institutional support they enjoyed, which enabled them to attract international talent and deliver large-scale, high-impact research. As a consequence, these labs also enjoyed good rapport with leading scientific journals, from Science and Nature to Cell and Proceedings of the National Academy of Sciences of the United States of America (PNAS), which in turn brought more prestige, along with further access to money and fossil samples. In the process, the heads of these labs became well-accustomed to the media spotlight, having been extensively interviewed and profiled by global media outlets and having established their labs as scientific powerhouses in the field of ancient DNA research, particularly in the study of human evolutionary history.46 More than that, the work coming out of these labs has claimed to entirely rewrite our understanding of human history. A New York Times Magazine article specifically described Pääbo’s, Reich’s, and Krause’s collective influence on the field of ancient DNA research as a “state-of-the-art oligopoly.”47 Indeed, over the past ten years, some of the biggest and boldest claims in the field of ancient DNA research and human evolution have come from this handful of practitioners.
In 2018, Reich published a fairly comprehensive, albeit contentious, book on the evolution of ancient and modern human populations as told by cutting-edge genome-wide data from the field of ancient DNA research. In this book—Who We Are and How We Got Here—Reich offers a personal account of his own professional research, as well as that of colleagues, that argues for the power of genetic evidence to tell a new and better story about human history. “Ancient DNA and the genome revolution,” Reich claims, “can now answer a previously unresolved question about the deep past: the question of what happened—how ancient peoples related to each other and how migrations contributed to the changes evident in the archeological record.” He suggests that archeologists should be equally excited by this new source of data: “Ancient DNA should be liberating to archeologists because with answers to these questions in reach, archeologists can get on with investigating what they have always been interested in, which is why these changes occurred.”48 As far as he is concerned, the information from ancient genomic data has done much more than inform our view of human history. He believes it has transformed, and will continue to revolutionize, our understanding of who we are, how we got here, and how we relate to one another today.
Although remarkable, the “genome revolution” was, and continues to be, exceedingly controversial.49 As far as some archeologists were concerned, some geneticists were entirely too overenthusiastic about the explanatory power of genetic evidence, to the point they would exclude or downplay other forms of data from established disciplines like archeology, linguistics, and history. Sure enough, some archeologists felt some geneticists tended to embrace, intentionally or unintentionally, a reductionist mindset in terms of their choice of data (molecular data) as unsurpassed evidence for understanding human evolutionary history. Alexandra Ion, an archeologist at the University of Cambridge, pointed out the problem with the idea of ancient genetic or genomic data being a “holy grail” in the sense that this kind of data can always provide novel or better answers to old archeological questions. As an example, she drew on the case of King Richard III, who famously died in battle in the fifteenth century but whose remains and their whereabouts were left uncertain. In 2012, more than five hundred years after his death, researchers excavated a skeleton from underneath a car parking lot in Leicester, England, setting them off on a journey to identify a body that just might belong to the late king. In recounting the events, Ion outlined the ways in which researchers negotiated, and the media presented, the value of different lines of evidence from the genetic to the osteological, archeological, and historical. This multidisciplinary team of researchers knew their ability to extract and sequence genomic data from the ancient skeleton played a large role in identifying Richard III’s body, but they also knew their confidence in this data depended on its correspondence to other evidence.
Although researchers may have recognized this, the media touted DNA as the real definitive proof, the evidence that solved the mystery. Ion expanded on this idea of negotiating evidence by drawing on more substantial works that used ancient molecular data to shed light on the Neolithic Revolution, a major period of transformation as people transitioned from a lifestyle as hunter-gatherers to farmers. She questioned whether the genetic data from the “hard sciences” was really being successfully integrated with the historical and cultural contexts of interest to archeologists (and with their traditional sources of evidence) in the so-called soft sciences. To be sure, archeologists were receptive to new methods and data, and many have forged strong, beneficial relationships with geneticists. Yet the idea that genetic data can always be appropriately integrated with archeological and historical evidence is a problem. As Ion argues, truly interdisciplinary research, when it comes to genetics and human history, is not easily achieved.50
The newfound access to ancient human DNA on a large scale and its increasing application to questions about human history have archeologists, as well as other scholars in the humanities, up in arms for a number of reasons. The issues and arguments are multifaceted, and the sides that geneticists and archeologists find themselves on are not wholly binary.51 Given this, there are a number of worries about the hype around the genome revolution, hype that has particularly affected ancient DNA research’s disciplinary development.
One issue was not so much that archeologists denied the value of genetic data to illuminate answers to historical questions. Rather, the issue was overzealous confidence in genetic data to single-handedly answer big questions about human history through oversimplified and grandiose narratives about the past. In the broadest sense, some scholars—such as archeologists Rachel J. Crellin and Oliver J. T. Harris—identify this as the classic nature-culture binary, which they argue has informed much of ancient DNA research. They suggest that this not only is an inadequate understanding of the world but also leads geneticists, and even some archeologists, to favor genetic data, thus “placing archaeology and material culture in a secondary and subservient position.”52
On one level, archeologists have been troubled by what seems to be geneticists’ unbridled confidence in genetic evidence that competes with their own disciplinary methods and ways of knowing the past, be it material culture or ritual practices as documented through the archeological record. They argue that geneticists, archeologists, anthropologists, and historians alike should understand how their methods and data can complement rather than compete with one another. Their view is that DNA adds to the discussion while disciplines like archeology, history, and linguistics provide the context for the discussion in the first place.
On another level, archeologists, as well as historians, are even more concerned that geneticists’ infringement on their territory may bring up unwelcome and outdated oversimplifications of human history. “Some archaeologists, however, worry that the molecular approach has robbed the field of nuance,” writes Ewen Callaway in another Nature article. “They are concerned by sweeping DNA studies that they say make unwarranted, and even dangerous, assumptions about links between biology and culture.”53 Anthropologist Michael L. Blakey goes so far as to accuse genetics of biological determinism, namely the reduction of all cultural and societal phenomena to biological or genetic causes.54 Such sentiments are far from a lone case of data envy but have much to do with larger cultural, societal, and political issues.
Although there are a number of research papers that illustrate such concerns for archeologists, Reich’s book was a prime case in point and an easy target given both his prominent position in the field of ancient DNA research and the amount of praise he bestows on the field. In his book, Reich argues that ancient genomic research has the potential to study and discuss race on a scientific basis without necessarily being racist. He denies that his work is a form of scientific racism and instead argues that genetics actually transcends the social or cultural category of race concepts while dealing only with the biological facts of it.55 Other scholars have pushed back on what they view to be a naive perception that scientists, even with the best of intentions, can simply separate the biological and cultural.56 Some have argued more strongly, directly accusing Reich and colleagues of an outright racist ideology despite claims they avoid it.57 Indeed, historians of science have pointed out the problems with such attempts to divorce the biological from the cultural because there are always underlying assumptions, known or unknown, regarding issues of race, gender, ethnicity, and identity.58 As the historian of science Jenny Reardon puts it, “As much as biologists have tried over the last several decades to constrict race to apolitical scientific purposes, the use of race is never neutral. It is always tied to questions with political and social salience.”59
Further, historians of science, along with archeologists, have been arguably more alarmed by some geneticists’ attempts to show how social-cultural categories easily correspond to genetic or other biological categories. They are alarmed when phenomena they perceive as cultural or social are reduced to something biological and when the biological explanation is given priority over all others. Archeologists have viewed much of the work in the field of ancient genomics, or at least the way geneticists talk about it, as intentionally or unintentionally resurrecting many concepts such as biological determinism that are not only outdated but morally problematic.60 Recently, a host of archeologists, historians, and other scholars have highlighted the problems of trying to match biological with sociocultural concepts, especially as they relate to genetic ancestry companies, and the many epistemological and political risks that come with such practices.
To add to the controversy’s complexity, archeologists have also been increasingly worried about the ways in which ancient DNA research is being conducted. Indeed, a large part of this angst over the use of DNA to answer big questions about human history has stemmed from the fact that labs like Reich’s are growing into large-scale industrial operations managing the production and distribution of ancient DNA data. In his book, Reich is open about his objective to make “ancient DNA industrial” by transforming his lab into an “American-style genomics factory.”61 This science-turned-business philosophy has rubbed some researchers the wrong way. At one extreme, critics accuse Reich’s lab of biocolonialism. Maria C. Ávila Arcos, a population geneticist at the International Laboratory for Human Genome Research in Mexico, notes that Reich’s objective to industrialize the science carries insensitive undertones: “When one considers the social and historical context of the human populations that will be studied—many of which have been historically marginalized, colonized, and exploited—this statement becomes problematic.” According to Ávila Arcos, “Such intentions could easily be perceived as a continuation of exploitation or biocolonialism.” On this point, she argues that Reich’s own use of an “unfortunate analogy further highlights the problem.” She quotes Reich, who wrote in his book, “We are . . . like explorers in the late eighteenth century, sailing to every corner of the globe.” As Ávila Arcos explains, “During the era to which Reich refers, European adventurers indeed collected samples from around the world, but these specimens were usually taken without the consent of, or regard for, the communities to whom they rightfully belonged.”62
Others have argued that this colonialist attitude extends to more than the human populations being sampled and studied. In a feature published in the New York Times Magazine, writer Gideon Lewis-Kraus revealed that the big labs—Pääbo at Leipzig, Reich at Harvard, and Krause at Jena—are rumored to exercise power over some of the choicest human fossils.63 These labs, Lewis-Kraus explains, enjoy access to money, technology, fossils, and top-tier scientific journals that makes it hard for smaller labs to compete or pressures them to collaborate. Consequently, many native fossils are said to be outsourced to the researchers of these bigger labs. According to Lewis-Kraus, one scientist told him, “Certain geneticists see the rest of the world as the 19th-century colonialists saw Africa—as raw material opportunities and nothing else.” This has contributed to an “atmosphere” of “anxiety and paranoia.”64 In fact, in his own conversations with archeologists and geneticists, Lewis-Kraus said that nearly all scientists asked for anonymity because they were concerned about professional backlash.
The search for DNA from ancient and extinct organisms has always been a high-profile research practice, and the recent rush to sequence ancient human genomes and rewrite human evolutionary history has only exacerbated the attention afforded to it by the media. According to interviewees, some feel the vast amount of data, the conclusions being made from this data, and the ever increasing celebrity status of the field are perhaps moving too fast for the field’s own good. One interviewee likened the present state of the discipline to its early phase of research in the 1990s: “This research discipline has developed the way that all science—new scientific disciplines—develop, in that you have an initial, wonderful discovery, you have lots of hype and high expectations, and then you come down to it with a bump, and then you do the hard work of working out what it all means and what you can really do; what is realistic and what isn’t. And that may take the next ten to twenty years of that research discipline.” For this scientist, the community might currently be experiencing a second hype cycle: “I think with these next-generation sequencing techniques we have to do it all again; come down to it with a bump, and sort out what we can and can’t do. So, I think it’s cyclical” (Interviewee 5). What is distinctly different, however, about this phase of the discipline’s development is that the implications of this sort of hype, and the consequences of such failed or misaligned expectations, are more serious than ever. In scientists’ explicit confidence in their ability to rewrite human evolutionary history is the much more implicit claim that they can address the multifaceted political, cultural, and national identities woven into the history of people moving across the world and mixing with one another in the process. Indeed, the ethical stakes for this type of hype are profound.
SECOND HYPE CYCLE
The study of ancient DNA data had previously been limited to the study of mitochondrial DNA and sometimes nuclear DNA. Recently, however, the potential to sequence whole genomes via high-throughput sequencing technologies has allowed researchers to produce an increased amount of higher quality data (from several sequences to billions of sequences) that permits them to more accurately quantify contamination and therefore guarantee DNA authenticity. It has also allowed them to study the entire genomic makeup of an organism, similar to how modern genomes are analyzed, and this has provided more detailed answers to questions regarding phenotype, adaptation, and evolution, together with documenting when migration and gene flow events have occurred. As a result, researchers have recently reported that the “field” has “entered the new era of genomics and has provided valuable information when testing specific hypotheses related to the past.”65
In interviews with ancient DNA researchers, some have suggested that the innovation of NGS in the early 2000s has ushered in a second hype cycle, much like the first hype cycle that the field experienced in the 1990s with the advent of PCR. Specifically, some feel the race for the first or oldest genomes is reminiscent of the race for the first or oldest DNA from ancient and extinct organisms. “I think a lot of the whole genome stuff,” said one interviewee, “is just being driven by ‘We’re the first person to sequence the genome of extinct species X.’ . . . And it’s almost like the very early days of ancient DNA when you could get a Nature paper by saying, ‘Ancient DNA recovered from extinct thylacine or quagga or Egyptian mummy or mammoth or whatever.’ ” As this scientist further explained, “It didn’t really matter what the answer was. It was just the fact that you could do it. And I think that’s possibly what’s driving a lot of the ancient DNA community at the moment—is just again being the first to do something, not necessarily answering an intelligent question” (Interviewee 25).
Indeed, there seem to be parallels between the early years of ancient DNA research’s disciplinary development in the heyday of the PCR era and the field’s current optimism, with specific attention to the rhetoric of revolution surrounding the study of ancient humans across the world and over the centuries. One researcher, for example, said, “Several really big names in ancient DNA, they jumped onto the human train. I guess if they had all decided to work on megafauna, there would have been a bunch of papers already. It’s coming, as soon as Science and Nature get tired of yet another ancient human genome paper. That’s going to be the second wave in the next three to four years. You’re going to start seeing all these extinct animals; their genomes sequenced, their population data sequenced.” This researcher continued, “Scientists know that there are certain types of analyses that the media would go more crazy about than others, right? So, if you have the choice between . . . sequencing the genome of some random ancient human person and sequencing the genome of Richard III . . ., you’d go for sequencing Richard’s genome because you know the media are going to go ape shit. You know ultimately that’s going to lead to a higher likelihood of landing a grant” (Interviewee 38).
Furthermore, some feel the vast amount of data, the conclusions being made from this data, and the ever increasing celebrity status of the field are perhaps moving too fast for the field’s own good. A leading scientist, for example, presented this perspective: “We have entered into another phase . . . where everybody thinks it’s just so fucking amazing, right? . . . I think they will be super surprised in ten years from now—five or ten years from now—in terms of a lot of those claims need to be modified! And I think that we haven’t by any means understood the limitations of what we are actually doing with genomics. And I think, you know, to be honest, I’m so surprised how the ancient genomics era has just been taken in by the anthropological community without questioning anything.” The ancient genomics era was similar to the ancient genetics phase of the 1990s in terms of its exciting potential and exploratory nature. However, according to this interviewee, there were distinct differences too: “You can say the problem is not, I think, so much from the contamination. The problem is another kind now. . . . Now, it’s the data analysis, really. It’s the way you do the data analysis and it’s the interpretations you are taking from that data analysis. . . . I can see already now that there’s issues there, and I’m sure that there will be more to come. [Laughs] I think people will be pretty shocked” (Interviewee 7).
This is especially a problem because of the highly publicized nature of this line of research and the understandable tendency of media reporters to tell a clean and simple story that often does not do justice to the complexity of the research. As Ion argues in the case of King Richard III, for example, the media emphasized the role of DNA in identifying the skeleton over other lines of evidence that were equally, if not more, important. Anna Källén, an archeologist at Stockholm University, and colleagues found something similar in their analysis of the scientific research article and subsequent popularization of the famous Birka “warrior”—a skeleton discovered in a tenth-century burial chamber. When originally uncovered in the late 1870s, researchers assumed the skeleton belonged to a man, because the remains were buried with warrior equipment dating back to the Viking Age. More than a century after this initial discovery, scientists used ancient DNA data to determine that the skeleton was in fact female. In Källén and colleagues’ study of media coverage on the recent findings, they found that the conclusions were communicated to the public by drawing on popular narratives and current political debates.66 Likewise, archeologists Catherine Frieman (Australian National University) and Daniela Hofmann (University of Bergen) demonstrate how ancient DNA research on human population migrations across Europe has been exploited by far-right groups with racist, nationalistic, and political agendas. As Frieman and Hofmann argue, the blame for such misappropriation of ancient DNA research cannot be placed on one person or group, be it the media or the public. Nonetheless, scientists have a role to play in actively engaging with the implications (intended or unintended) of their research, especially given the intense press and public attention that accompanies it.67
Over the past three decades, the discipline has developed into what some see, despite this exploratory or experimental phase, as a more established practice in evolutionary biology. While there is certainly continuity regarding the interplay between science and the media from the PCR to the NGS eras, namely scientists’ need for the press to maintain momentum to continue to be competitive in the field, there does seem to be a distinct difference. The hype around the search for ancient DNA is far less about resurrecting dinosaurs today, although this narrative certainly colors media reports and public discourses on new discoveries in the field. Rather, scientists’ newfound technological and financial capacity to sequence entire genomes of archaic human specimens has led to expectations on another level: the ability of scientists to use ancient DNA data to rewrite our understanding of human evolutionary history. While the media plays a role in the hype, ancient DNA researchers are the ones explicitly promoting this promise. Ancient DNA researchers are indeed making more and farther-reaching claims about human origins, history, migrations, and admixture. Further, these claims also encompass centuries of historical, sociological, and cultural controversies over political and cultural identity. The hype around scientists’ work as it relates to the insights ancient DNA data can provide for the study of human history has sweeping consequences for the broader public, sometimes with undesirable and potentially dangerous implications, including the empowerment of outdated racist, nationalist, and political agendas.