CHAPTER NINE

Celebrity as Strategy

AN ANSWER IN SEARCH OF A QUESTION

Throughout the history of ancient DNA research, the practitioners working in this new field employed a number of strategies to aid their exploration of it. In their pursuit of DNA from fossils, scientists engaged in a data-driven strategy in terms of pursuing valuable samples, state-of-the-art technologies and techniques, and the molecular information (DNA) that could be recovered from accessible samples using those existing technologies. At the same time, they engaged in a question-driven strategy by asking and answering questions about the theoretical preservation and potential extraction of DNA from fossil material, and asking historical or biological questions about the organisms or populations under investigation. But researchers also adopted a further research strategy: a celebrity-driven strategy. With this strategy, scientists capitalized on publicity opportunities and created their own opportunities for attention too in order to communicate the excitement as well as value of this new line of research to professional, public, and political audiences. In doing so, scientists used the celebrity that surrounded the science (often by way of the charismatic creatures that were sampled and the potential to recover ancient genetic data from them) to drive the scientific and technological development of ancient DNA research in much the same way that the more traditional processes of data collecting or hypothesis testing did.

In marking the anniversary of the field’s birth, the organizers of the “Ancient DNA: The First Three Decades” conference (Erika Hagelberg, Michael Hofreiter, and Christine Keyser) noted that the search for DNA from fossils had evolved beyond studies that were “purely technical” or “one-off historical puzzles.” Today, ancient DNA researchers were “addressing a growing number of important scientific questions.”1 In other words, ancient DNA research was no longer solely technology-driven and sample-driven. At least according to these practitioners, the field had matured as scientists were asking and answering more scientifically significant questions on a greater scope and scale. This change was the consequence of the introduction of NGS in the early 2000s and a subsequent shift away from the field’s technological dependence on PCR and Sanger sequencing. Indeed, for a number of interviewees, NGS freed researchers from PCR’s constraints, thus allowing them to focus on the biological questions rather than the technological limitations: “For the first time in history,” remarked one researcher, “I think we’re not driven at all by the technology because the technology is permissive today. We are driven by the question we can answer with the technology. Well, it’s not really that yet, but it’s close to it” (Interviewee 8). According to another interviewee, it was more than the technology-driven nature of the practice that had passed: “I think that we are question-driven rather than sample-driven.” For this interviewee, “now that all the low-hanging fruit have been picked it’s more question-driven” (Interviewee 43). While scientists certainly recognized the role that technology and samples had played in the practice of ancient DNA research, they viewed this shift to a more question-driven approach as a mark of maturity. Maturity was desirable to scientists, especially amid credibility concerns.

In fact, reflecting on the field’s thirty-year history, many other interviewees portrayed the practice in its early days as a primarily data-driven practice in terms of technology and samples. One interviewee even characterized the search for DNA from fossils as an answer looking for a question rather than a question looking for an answer (Interviewee 2). For some, the field was full of studies in which the answers, the DNA, seemed to supersede the questions that could be asked of the DNA. To be clear, this phrase, “an answer in search of a question,” was a shorthand for studies that some scientists saw as being data-driven in approach. Such phrasing presupposed that ancient DNA sequences held relevant molecular information about the organisms they came from and their evolutionary history. Although true, actually accessing and then understanding the data required scientists to make sense of the DNA by analyzing, interpreting, and appropriately applying it to questions in evolutionary biology. In other words, the “answers” were not explicit in the sequences themselves. Scientists had to examine the data in order to make meaning of it. Nonetheless, scientists in the field often seemed to be data-driven in their approach in terms of the samples, the technology, and the molecular information that could be recovered from accessible samples using existing technologies.

This data-driven strategy, as other interviewees argued, was often coupled with a celebrity-driven approach to the search for ancient DNA. “They may have a research question,” explained one paleobiologist, “but sometimes it’s even pre-getting-a-research-question. It’s like, ‘Let’s study these. Let’s see if there’s DNA in these fossils.’ ” This particular practitioner shared the following story about the priorities that went into their decision-making process for initiating research projects: “I remember one occasion when one of the well-known ancient DNA researchers said to me, ‘What species should I study?’ . . . It got to the point to where there would be Ph.D. students and you could see the supervisor thinking, ‘What species hasn’t anyone done yet? Nobody’s done musk ox. Ok. You do musk ox.’ Without a very clear question.” This data-driven approach seemed to be partly propelled by press and public interest in charismatic creatures that would yield data but also be likely to lead to high-impact publications: “I’ve seen several examples of ‘let’s blitz this species.’ We give a Ph.D. student this species. They collect fossils from all over, they do the DNA, they draw up trees, and then they start to ask questions. . . . And then the supervisor is usually then looking for a high-impact angle,” this interviewee said, laughing. “It is a slightly odd way of doing science” (Interviewee 3). This interviewee, as well as others, critically portrayed the science in its early years as a data-driven and even celebrity-driven practice.

To be clear, however, the search for DNA from fossils, even in its early era as a data-driven and celebrity-driven science, was in fact a question-driven one too. In the early studies, for example, researchers were indeed driven by questions—questions about the theoretical preservation and potential extraction of DNA from ancient and extinct organisms. And answering these questions was no small feat. Throughout the 1980s and 1990s, scientists confronted extreme technical challenges as they sought to discover what was possible regarding the preservation and extraction of DNA from ancient skins, tissues, and even bone. But for some studies, the questions also took on a biological bent. For example, the 1984 quagga study was initiated as a theoretical and technical challenge, but the specimen, Equus quagga, was specifically selected in order to test a hypothesis about the evolutionary history of an extinct species, one that was previously inconclusive based on fossil data alone.2 Likewise, researchers working on one of the early studies to try to extract DNA from insects in ancient amber chose a particular termite specimen, Mastotermes electrodominicus, in order to test hypotheses of insect evolution and extinction.3 In these cases, there was a biological question but it was secondary to the technical achievement of recovering DNA from the fossil in the first place.

In the early days, the technical question almost had to take precedence, because the biological or historical question could not be answered without it. But precedence did not entail exclusion. In practice, data-driven, celebrity-driven, and question-driven approaches were not incompatible. Rather, they often went hand in hand. Such a mixed-method approach was not only possible but pragmatic. It was realistic, even inevitable. The use of one strategy did not exclude another. Nor was any one strategy employed equally by all practitioners, or to the same degree or frequency throughout the discipline’s development. In other words, practitioners were prudent in choosing a combination of research strategies according to their circumstances, objectives, and the perceived pressures at the time.

While some scientists argued that they were entering into a more question-driven (and, in their view, a more mature) research era, other evidence, including other interviewee quotes, suggests otherwise. Indeed, a handful of interviewees argued that ancient DNA researchers were still very much driven by the available samples and technology. According to several scientists, ancient DNA activity has continued to be an answer looking for a question. In 2015, for example, a team of researchers, including Morten E. Allentoft and Eske Willerslev in Copenhagen, sequenced 101 ancient human genomes ranging in date from A.D. 700 to 3000 B.C. with the ultimate goal of testing hypotheses about evolution and migration during a time when new tools and traditions had surfaced and spread across Eurasia.4 Though question-driven, the project was also very much propelled by the samples and technology. Researchers went over the top to generate more genomes than necessary simply because they could. Reporting for Nature, Ewen Callaway specifically spotlighted this research and took note of its data-driven approach. Callaway quoted Allentoft: “ ‘We could have stopped at 80,’ says Allentoft. But ‘we thought, “Why the hell not? Let’s go above 100.” ’ ” With NGS, the issue was no longer too little data but rather too much data. On this particular point, Callaway quoted Greger Larson at Oxford University: “ ‘It’s an interesting time, because the technology is moving faster than our ability to ask questions of it,’ says Larson, whose lab has also amassed around 4,000 samples from ancient dogs and wolves to chart the origins of domestic dogs. ‘Let’s just sequence everything and ask questions later.’ ”5 The trend continues today.6

Although there is certainly continuity between ancient DNA’s data-driven past and present, the primary difference is that the situation has shifted from scientists having too little data to having too much data.7 One practitioner, for example, commented on the impact this had on the research process: “We got some new genomes and it wasn’t question driven anymore. We didn’t have a look at those genomes because they were the key to a question, but [because] they were good samples and we could get whole genomes” (Interviewee 13). For a second scientist, this approach was a general consequence of new opportunities afforded by available technology as well as samples: “I think whenever a new technology comes on board there’s a lot of ‘Ta-da! Hey, we analyzed this stuff with this new technology.’ And it’s really driven by the labs that have access to the technology and the samples” (Interviewee 30).

Despite the fact that some of the earliest researchers were question-driven in their approach to the search for DNA from fossils, numerous interviewees in their memories of their history characterized this early research as being primarily data-driven. They tried to draw a line between what they viewed as a data-driven past and a more question-driven present in order to place some temporal and methodological boundaries around the evolution of their practice. In other words, this language of an answer looking for a question, rather than a question looking for an answer, was an extended episode of retrospective boundary-work. This language was a way in which scientists sought to compose a narrative of ancient DNA activity, intentionally or unintentionally, by drawing a line between its emergence and what some scientists see now as its more or less established status today. They engaged in this sort of retrospective boundary-work because they were concerned about their credibility within evolutionary biology, something that was challenged by both contamination concerns and what some viewed as disproportionate or undeserved media attention around sensational publications.

Crucially, the issue was not about whether ancient DNA research was science or non-science but whether it was a credible or non-credible approach to the study of evolution. The answer was far from simple as ancient DNA activity was tied up in a long history of scientific and popular expectations. Given that the search for DNA from ancient and extinct organisms had evolved into a discipline on a public platform, scientists felt they could not solely rely on technology or methodology in terms of protocols or verification as a way to draw lines between what they saw as reliable or less reliable work. They felt their public profile required a public response about the proper practice of ancient DNA research.8 As a result, practitioners created criteria in the lab in response to contamination concerns, but they also built boundaries via rhetoric, especially through their memories of their history, in response to celebrity concerns.

Overall, demarcation mattered for scientists because the act of setting themselves and their work aside as reliable and rigorous signified relevance within evolutionary biology more broadly. Scientific maturity indicated authority, and this mattered for ancient DNA researchers coming out of a thirty-year history of credibility contests over contamination and celebrity concerns. To this end, the point is not to judge whether this data-driven and celebrity-driven approach was, or indeed still is, a positive or negative phenomenon in the world of ancient DNA research. Rather, the point is to highlight the fact that scientists practiced science in a way that was influenced by a want and need for both data and publicity, and that they themselves interpreted these influences as affecting the production of knowledge and their scientific status within evolutionary biology.

Scientists’ efforts to distinguish credible from less credible research were not unusual. In the history and philosophy of science, demarcation is a well-known topic of debate.9 In fact, the demarcation issue has a long history of heated discussions about the more or less correct ways, and even wrong ways, of practicing science and what gets to count as proper science.10 The famous philosopher Karl Popper argued that science could be distinguished from non-science by the fact that a given hypothesis could be tested and proved false. Popper’s criterion of falsifiability, also known as the testability or refutability of a hypothesis or theory, still holds strong as a benchmark for demarcating science from non-science, as well as good science from bad science.11 In fact, hypothesis-testing as a criterion for proper scientific practice has often been given a privileged position over other methods of inquiry.12 Indeed, with the rise of big data across the scientific disciplines, and what many refer to as a new mode of data-driven scientific research, the place of and preference for hypothesis-driven inquiry has come back into the spotlight.13

QUESTION IN SEARCH OF AN ANSWER

Philosophers today are increasingly interested in this phenomenon of data-driven science. In 2012, for example, a group of scholars approached this topic with an intent to identify its characteristics as well as its causes and consequences for the production of scientific knowledge. They were also interested in trying to understand the role that hypotheses and theories played in this sort of methodology. According to their studies, they found that data-driven sciences often value the process of induction from given data as a legitimate approach to scientific inference and the role of technology as a means of analyzing and then extracting significant patterns from the data. In all of this, philosophers have asked a further question of this particular phenomenon: does this data-driven approach constitute a novel approach to scientific inquiry or does it share similarities with past research practices?

In his commentary on a selection of these papers, Bruno J. Strasser looked for overall similarities and differences among data-driven sciences over the past few centuries.14 As far as he is concerned, early natural history “wonder cabinets” (collections of oddities from geological, historical, or religious relics) were not so dissimilar from “electronic databases” of today. According to Strasser, “Renaissance naturalists were no less inundated with new information than our contemporaries.” Indeed, “The expansion of travel, epitomized by the discovery of the New World, exposed European naturalists to new facts that did not fit into the systems of knowledge inherited from the Greeks and Romans.”15 Through a study of Carl Linnaeus (an eighteenth-century physician and botanist from Sweden), scholars Staffan Müller-Wille and Isabelle Charmantier specifically spotlighted the data-driven nature of natural history in terms of the various strategies Linnaeus employed to organize and analyze what can be called an “information overload” of new species data. First, these authors draw attention to the fact that Linnaeus used new tools, such as dichotomous diagrams, files, and indexes, to help control the amount of data under study. Interestingly, however, these tools, initially intended to control the amount of data, facilitated an influx of data. The authors point out that in the midst of this data deluge, Linnaeus attempted to make sense of the information by generating a hypothesis, namely the genus concept as a distinct category, as a further means for organizing the information, then classifying and comparing organisms accordingly.16 Strasser puts the point this way: “In other words, Linnaeus may have been driven by his data, but his approach was not exclusively data-driven.”17 This example showcases ways in which data-driven inquiry may not be so new to contemporary scientific and technological practices. Rather, natural history has a long tradition of producing, then dealing with, information overload. Further, natural historians, like contemporary scientists, were also open to pursuing mixed research methods such as data-gathering and hypothesis-testing to make sense of the world around them.

Past and present data-driven practices have distinct differences. For Strasser, three features set the contemporary data-driven sciences apart from former natural history practices: (1) the data analysis today is done by researchers from disciplinary backgrounds different from the individuals who produced it, (2) the data analysis depends on the use and understanding of statistical tools, and (3) the data is primarily generated from inside the lab and not the field, as was typical of previous natural history practices. Strasser also attempts to explain why so many scientists view data-driven inquiry today as uniquely overwhelming and even revolutionary: “To conclude, it is mainly because the experimental sciences took the upper hand over natural history in the late nineteenth century and have since come to dominate the public perception of science that data-driven research is now perceived as a novel feature of twenty-first century science.” Yet, historically minded case studies demonstrate that this data deluge is nothing new: “Natural history had been ‘data-driven’ for many centuries before the proponents of postgenomics approaches and systems biology began to claim the radical novelty of their methods.”18

In her book Data-Centric Biology: A Philosophical Study, Sabina Leonelli makes an additional observation about the data-driven sciences of today. According to Leonelli, data-driven sciences are not necessarily interesting because they are data-driven but because of the social, organizational, and institutional structuring required to produce, then analyze, the massive amounts of data that are a part of the practice.19 In her case study, she focuses on plant system databases in model organism biology to map out what she calls data-journeys. Here, she is interested in outlining the ways in which researchers across the board work together to collect, integrate, analyze, and share various sources of data that eventually will be used for different scientific purposes. Leonelli’s focus on data-driven science is much more about the process than the product.

Likewise, ancient DNA researchers seem to be making moves to operate under similar large-scale organizational systems. Ancient DNA researchers, in light of new whole-genome sequencing technologies and techniques, face new challenges and are trying to generate a new kind of institutional infrastructure in response. Indeed, ancient DNA researchers are finding that a whole host of resources are required to go from a sample to a sequence to meaningful scientific analysis in a reasonable time frame. In response to opportunities offered by technology, some scientists are building large-scale, business-like operations that oversee the production and distribution of ancient DNA data. Svante Pääbo’s lab at the Max Planck Institute for Evolutionary Anthropology, Eske Willerslev’s lab at the Center for GeoGenetics in the Natural History Museum of Denmark, and David Reich’s lab in the Department of Genetics at Harvard University are examples of the industrial operation that ancient DNA research can require. Much of this go-big-or-go-home approach to the search for DNA from fossils is of scientists’ own doing.

Most recently and obviously, new whole-genome sequencing technologies have pushed ancient DNA researchers to seek additional skills in statistics, bioinformatics, and population genetics in order to analyze the massive amounts of data that can now be extracted from hundreds of samples. Crucially, ancient DNA researchers are much more than a user community of the machinery. They are committed to developing new methods that can be used in the lab to optimize the extraction and sequencing processes. In other words, although new technologies and techniques are critical to ancient DNA activity and the extent to which data can be made available and analyzed, practitioners do not just draw on developments in other fields but instead are active in adapting these innovations for their own purposes. Ancient DNA requires manipulation and management of data because the nature of ancient DNA is not the same as that of modern DNA. The extraction, sequencing, and analysis of degraded and damaged DNA require a specialist skill set to understand the biochemistry of DNA damage and to correctly infer how differences between sequences relate to differences among individuals and populations over time.20

This shift to ancient DNA research as an industrial operation highlights the changing ways the field has become or might be even more data-driven in the future. Over a thirty-year period, the practice evolved from an initial effort to try to extract DNA from fossil material with the technology of PCR to a big-risk, big-reward initiative as practitioners set out to maximize the amount of genomic data that could be produced using NGS. In fact, some labs sought to turn the science of ancient DNA research into a truly large-scale, industrial, and automated process.21

Data-driven approaches have been and continue to be a principal part of the search for DNA from fossils. Traditionally, philosophers of science and scientists themselves have tried to divide scientific inquiry into data-driven versus hypothesis-driven. This binary view, however, is changing as scholars bring attention to the fact that data-driven research is often pursued in combination with other modes of scientific inquiry. For example, a number of scholars have addressed the role of exploratory experimentation in the data-driven sciences.22 In his case study of systems biology, Ulrich Krohs makes the point that all science does not need to be hypothesis-driven at all times, nor is it even practical. Krohs, arguing against the classical conception that the goal of experimentation is to test hypotheses, suggests that “other modes of experimentation,” such as the “searching mode of exploratory experimentation” as well as “data driven research,” can be considered as “serious epistemic strategies, besides, and in combination with, hypothesis driven research.”23 Other philosophers also argue for the need to make more room for an interplay of approaches when it comes to understanding the process and practice of science. As philosopher of biology Maureen A. O’Malley writes, “It is possible that theory-driven hypothesis testing has been conceived of by scientists, science funders and philosophers in a way that does not exist in practice (and never has), and that it is closer to and involves more interplay with exploratory experimentation as well as natural history experimentation than we have tended to think.”24 As the history of ancient DNA research has demonstrated, a celebrity-driven strategy—often in combination with data- and question-driven approaches too—has been a frequent and fruitful driver of the field.

The Neanderthal Genome Project is just one exemplary case of this. In 2006, shortly after the introduction of NGS and its initial application to a number of ancient DNA studies, Pääbo and the MPIEVA in Leipzig, along with 454 Life Sciences Corporation, announced they would be the first to attempt to sequence the entire Neanderthal genome. Through an orchestrated press conference and press release, they announced they would accomplish the task in a mere two years.25 From the outset, the Neanderthal Genome Project was a substantial, and quite intentional, media production that played on the technology, celebrity, and research impact of the findings eventually to be produced. This was not unprecedented. For example, science studies scholar Stephen Hilgartner argues for the increasingly intense “media-orientation” of genome researchers during the days of the Human Genome Project (HGP). For Hilgartner, “science-media coupling” was “strategic interaction.” He suggests that these genome researchers turned to the media in the face of competition in the race to sequence the human genome. Indeed, they were very conscious of their behavior and orientation toward the media. As Hilgartner explains, “HGP leaders, for their part, arguably did what the managers of any enterprise would” regarding their decisions to “react strategically to emerging events” and “tailor media messages that would defend their legitimacy.”26

SCIENCE IN PRACTICE

In their recollections of the history of ancient DNA research, scientists attempted to draw a distinction between what they saw as a data-driven and celebrity-driven phase of research versus a more question-driven methodology. The sometimes derogatory or dismissive comments by some interviewees about earlier practitioners, or even practitioners today, as scientists merely chasing samples, technology, and even celebrity, can be viewed as an extended episode of boundary-work, an attempt to create rhetorical and epistemological distinctions between the field’s past and present. Their views of how others in the field utilized certain strategies, either exclusively, appropriately, or inappropriately, are a matter of opinion about what they viewed to be the proper processes and practices of science, whatever those may be. In voicing these views, the individuals interviewed were aligning themselves with one scientific approach over another. According to some of them, being question-driven rather than sample-, technology-, or celebrity-driven was a hallmark of scientific maturity.

Although practitioners’ boundary-building was sociologically important for establishing their identity and authority within the scientific community, this boundary-work was also naive given that philosophers today no longer necessarily see a distinction between data-driven and hypothesis-driven research, and scientists do not actually practice science in such a binary way. Instead, more recent philosophical viewpoints argue for the need to make room for more forms of inquiry in scientific practice. In line with this, a celebrity-driven strategy—as clearly and consistently utilized by ancient DNA researchers themselves—can be considered a “serious epistemic strategy” that practitioners, as well as editors and funders, employ when making choices about research agendas, publication acceptance, and grant funding. Certainly, not all research was guided by (or need be guided by) its potential to attract popular interest, but in this history, the celebrity that surrounded the science of ancient DNA research was a crucial consideration behind researchers’ decisions which influenced their process of data-gathering and hypothesis-testing.

This either-or characterization by interviewees of a data- versus question- versus celebrity-driven approach to the search for DNA from fossils is to some extent misleading. In fact, ancient DNA researchers used a variety of approaches, sometimes foregrounding the appeal of celebrity over the question under study, other times prioritizing the technology over the celebrity. In reality, scientists, ancient DNA researchers included, use multiple approaches simultaneously and iteratively, weighing the achievability of research results against access to technology, samples, and funding, and against the prestige and publicity they can gain. Indeed, at times in the discipline’s development, especially with the introduction of a new technology, scientists prioritized the use of this technology to the extent that it sometimes did more explanatory work than celebrity did. At other times, scientists emphasized or downplayed the celebrity of their work, all depending on what they wanted to accomplish. Regardless, ancient DNA researchers were opportunistic and pragmatic, even if some found it somewhat disagreeable.