The 1953 discovery of the structure of deoxyribonucleic acid (DNA) by Francis Crick and James Watson ushered in a new era in the biological sciences. Just at the moment that the evolutionary synthesis had become the core theoretical construct in biology, a revolution in molecular biology was quickly emerging. Watson and Crick discovered DNA’s double-helical structure by building on discoveries by biochemists including Oswald Avery, Maclyn McCarty, and Colin MacLeod (whose work showed that nucleic acids, not proteins, constituted genes),1 Erwin Chargaff (whose greatest find was that in all organisms the ratio of the bases adenine to thymine, and of guanine to cytosine, is always 1:1),2 and Rosalind Franklin3 and Maurice Wilkins (whose work in X-ray crystallography helped conceptualize the double-helical structure4 of DNA).5
Until the 1960s, molecular and evolutionary biology developed largely independently of one another. Molecular and evolutionary biologists were trained in different scientific traditions (an emphasis on biochemistry versus organismal biology) with different visions of what biology should be. By the early 1960s these divisions had hardened, even between colleagues in the same department. For example, at Harvard in the early 1960s, according to historian Michael Dietrich, James Watson, then a professor in the biology department, “helped polarize the department into what he believed were those working on the ‘cutting edge’ of biology in molecular biology and those languishing with the concerns of ‘classical’ biology such as evolution and systematics.” The divisions between the two groups of biologists ran deep. The evolutionary biologist Theodosius Dobzhansky once called molecular biology a “glamour field” that was “intellectually shallow.” But Dobzhansky, drawing on his own intellectual legacy as a unifier, sought to reconcile these two distinct, “though overlapping and complementary,” approaches to biological research. Dobzhansky proposed a compromise that sought to understand both approaches as “essential for understanding the unity and diversity of life at all levels of interpretation.” On the one hand, molecular biology was shaped by a Cartesian or reductionist approach, which understood biological phenomena in the context of chemistry and physics. On the other hand, a Darwinian approach sought to understand biological phenomena in terms of the “adaptive usefulness of structures and processes to the whole organism and to the species of which it is part.” Ultimately, though, as Dobzhansky famously noted, “nothing makes sense in biology except in the light of evolution.”6
However, by the mid-1960s, a new generation of evolutionary biologists—including Richard Lewontin and Jack Hubby at the University of Chicago, working on protein electrophoresis, as well as those experimenting with molecular data to make possible “a quantitative estimate of the ‘genetic distance’ between different species”—fostered significant collaborations between the fields.7 But throughout the remaining decades of the twentieth century, despite such bridges, the two approaches remained relatively distinct. The molecular biologist Michel Morange argues that, “although both disciplines use the word ‘gene,’ they have not sought to bring their two meanings closer, or even to confront them. For the molecular biologist, a gene is a fragment of DNA that codes for a protein. For the population geneticist, it is a factor transmitted from generation to generation, which by its variations can confer a selective advantage (positive or negative) on the individuals carrying it.”8
With the announcement of the Human Genome Project in 1989, it seemed as if molecular biology had prevailed. Genomics, an approach to biology that utilizes molecular technologies to sequence regions of, or an entire complement of, an organism’s DNA (known as a genome), promised to revolutionize science and medicine. According to a 1988 National Research Council report, the primary goals of the Human Genome Project were (1) to construct a map and sequence of the human genome; (2) to develop technologies to “make the complete analysis of the human and other genomes feasible” and to use these technologies and discoveries to “make major contributions to many other areas of basic biology and biotechnology”; and (3) to focus on genetic approaches that compare human and nonhuman genomes, which are “essential for interpreting the information in the human genome.”9
Many in the academic community, including both natural and social scientists, feared that the reductionism so central to both the epistemological and ontological approaches of the Genome Project would reignite a form of biological determinism not seen since the days of social Darwinism and eugenics.10 Some also feared that racial science would emerge with renewed vigor in the genomic age.11 To attempt to address these fears at the outset of the Human Genome Project, James Watson called for at least 3 percent of the Genome Project’s annual budget to go toward the study of ethical, legal, and social implications of genome research, as well as the formulation of policies to address such issues. Thus was born the Ethical, Legal, and Social Implications (ELSI) Research Program of the Human Genome Project. The bioethicist Eric Juengst hoped that ELSI would “help optimize the benefits to human welfare and opportunity from the new knowledge, and to guard against its misuses.”12 Since its official formation in 1990, ELSI has funded more than $100 million in research grants.
It remains too early in the genomic revolution to tell what the long-term impact of genomics on the state of the biological race concept will be. The June 2000 White House ceremony announcing the completion of the human genome’s draft sequence suggests, however, that the race issue is not going away. At that event, President Clinton was flanked by genome sequencers Francis Collins and Craig Venter. Collins, head of the National Human Genome Research Institute, and Venter, then president of Celera Genomics, had fiercely competed to complete separate draft sequences. But in a spirit of cooperation brokered by the White House, the two scientists together offered their genomic gift to the world, one that is enhancing our understanding of human biology and, in turn, helping public health and medical professionals prevent, treat, and cure disease.13
On that day, Venter and Collins also emphasized that their draft genome sequences were confirming what many natural and social scientists had been arguing for decades, that human genetic diversity cannot be captured by the race concept and that all humans have genome sequences that are 99.9 percent identical. At the White House ceremony Venter said, “The concept of race has no genetic or scientific basis.”14 A year later, Collins wrote, “Those who wish to draw precise racial boundaries around certain groups will not be able to use science as a legitimate justification.”15 These conclusions were not novel. Research conducted by population and evolutionary biologists, anthropologists, and historians has shown since the 1930s that racial typologies are not good markers of human genetic diversity,16 that human populations differ from one another genetically in the relative frequency of alleles,17 and that the concepts of race and racism have identifiable and changing histories.18
Yet since the completion of the draft sequences and the statements on race by Venter and Collins, many still hold fast to the belief that race is, in fact, a biologically meaningful classification.19 There have been two general approaches to justifying use of the race concept in genomics. One is exemplified by Neil Risch, a University of California, San Francisco, statistical geneticist and genetic epidemiologist. Risch believes that “identifying genetic differences between races and ethnic groups…is scientifically appropriate.” He argues that race is essential to help determine “differences in treatment response or disease prevalence between racial/ethnic groups” and strongly supports the “search for candidate genes that contribute both to disease susceptibility and treatment response, both within and across racial/ethnic groups.”20 This approach is typical of scientists who use the race concept in genomic research and who claim that technological and methodological improvements allow them to examine human diversity with increasing precision, free of any social prejudices about human difference. Critical of this approach are natural and social scientists who insist that the race concept is a flawed, inaccurate way to measure human genetic diversity and that it is inseparable from social prejudices about human difference.
A second and more common approach to the race concept among genome scientists is to conflate these two seemingly contradictory viewpoints. And this is the very paradox of the genomic age when it comes to race: scientists say that race does not accurately capture human genetic diversity, yet at the same time, some of those same scientists claim that race is a useful proxy to best capture that genetic diversity—a proxy that is especially useful in clinical settings. By 2005, for example, Francis Collins had shifted from criticisms of the race concept to advocating the need to study how genetic variation and disease risk are correlated with what he called “self-identified race, and how we can use that correlation to reduce the risk of people getting sick.”21 This paradox is embedded in the practice of genomics in the twenty-first century. Self-reported racial identity remains an essential variable used at all stages of genetic research.22
The first major controversy related to the race concept and the Human Genome Project was the Human Genome Diversity Project (HGDP), which was proposed in the early 1990s by leading evolutionary biologists and population geneticists as a “resource that is aimed at promoting worldwide research on human genetic diversity, with the ultimate goal of understanding how and when patterns of diversity formed.” Project organizers believed that the genetic information garnered from it would likely “prove useful to several areas of biomedical research.” Data from the HGDP, it was hoped, could help estimate the incidence of recessive genetic diseases around the world, help identify the genetic variants that contribute to disease, and examine the “contributions of environmental factors to complex human disease.”23
In order to accomplish its goals, the project collected DNA samples from thousands of populations across the globe, including the world’s remaining indigenous populations. And it was these sampling methods and the language used to describe indigenous groups that fueled opposition to the project in some corners of academia and among indigenous rights groups. Even though project organizers were decidedly antiracist—early on, some project scientists sought to sever ties with the historic use of the race concept in scientific studies of human populations by abandoning the category of race in their analysis, proposing to use categories of group and population instead24—critics accused HGDP scientists of racism and colonialism. Non-HGDP scientists were openly critical of the organization of the project, particularly of its framework for studying human populations. Although the HGDP saw as part of its mission to improve understanding of the diversity of non-Europeans (so as not to fall into the trap of seeing Europeans as genetically diverse and non-European groups as genetically homogeneous), the use of terms like “tribe” and “indigenous group” by project scientists left critics, including anthropologist Alan Swedlund, worried that the project might promote racism. Swedlund called the project “21st-century technology applied to nineteenth-century biology.”25 At the same time, indigenous groups worried that they were perceived as human fossils that needed to have their DNA sampled before they disappeared.26
The problem, as the sociologist of science Jenny Reardon has argued, is that the population genetics–based race concept can be both a social and a scientific idea, and that therefore not all HGDP scientists “shared the same understanding of racial categories.” Some “believed that scientists could continue to use racial categories as long as they properly limited their use” to scientific and medical research. Some project scientists accepted race only as a sociocultural concept. Other scientists advocated using traditional racial categories in conjunction with a more specific understanding of the population under study. And, finally, some participating scientists held contradictory views about the race concept, using it in some contexts and not in others.27
More recently, in 2002 the International HapMap Project set out to “determine the common patterns of DNA sequence variation in the human genome” in order to develop “a map of these patterns across the genome.” The map, using DNA samples from “populations with ancestry from parts of Africa, Asia and Europe,” would be used to determine “the genotypes of one million or more sequence variants, their frequencies and the degree of association between them.” The HapMap Project, organizers hope, “will allow the discovery of sequence variants that affect common disease, will facilitate development of diagnostic tools, and will enhance our ability to choose targets for therapeutic intervention.”28
HapMap organizers have insisted, much as HGDP scientists did, that the project is not about measuring racial differences in the hope of uncovering disease-related information (and then associating certain racial groups with certain diseases) but rather represents the belief that human genetic variation is key to understanding the distribution of disease across human populations.29 The HapMap’s website, for example, asserts that “the information emerging from the Project is helping to demonstrate that common ideas about race emerge largely from social and cultural interactions and are only loosely connected to biological ancestry.”30 The website’s “Guidelines for Referring to the HapMap Populations in Publications and Presentations” even warns that describing study populations “in terms that are too broad could result in inappropriate over-generalization.” This may, in turn, “erroneously lead those who interpret HapMap data to equate geography (the basis on which populations were defined for the HapMap) with race (an imprecise and mostly socially constructed category).”31
Still, despite these calculated efforts to separate human genetic diversity research from past racial science, some HapMap critics believe that the project risks recapitulating typological approaches to race by relying on geographic ancestries—the very European, Asian, and African categories that are central to the data-collection methods of the HapMap. As legal scholar Jennifer Hamilton has pointed out, such “taxonomies of geographical ancestry reflect familiar divisions that map rather neatly onto earlier racial taxonomies (e.g., Negro, Caucasian, Mongol; Africanus, Europeanus, Asiaticus).”32
The emerging area of personalized medicine provides a third example of the intersection of race and science in the genomic age. This new field claims that the best treatments are individualized ones based on an individual’s genome. A goal of personalized medicine is that people will have their own genomes sequenced, and that an analysis of that data will provide doctors with information both about disease risk and about an individual’s pharmacogenomic profile (how one’s genes influence responses to drugs). We know, for example, that a group of genes known as cytochrome P450 (the CYP family of genes) plays an important role in the metabolism of most clinically used drugs.33 We also know that differences, or polymorphisms, in the sequence of these genes can alter the clinical responses to drugs. Some of these variations can lead to toxic reactions, while others can reduce a drug’s efficacy. Identifying these differences is thus a clinically useful and sometimes lifesaving tool. The challenge, however, lies in determining individual pharmacogenomic profiles, and this is where the race concept again intersects with the genomic age. Because the technology to do this rapidly and economically is not yet available in most cases, researchers have been turning to racial and ethnic profiles as a proxy for estimating individual risks. For example, one allele (or gene variant) of CYP2C9, a gene whose product mediates metabolism of the anticoagulant drug warfarin, shows up at higher frequency in what one study refers to as Black and Caucasian populations but is extremely rare in East Asian populations.34 The CYP2C9 pharmacogenomic literature is rife with such studies examining alleles in racial, ethnic, and national groups, including, for example, “Swedes,” “African Americans,” “Han Chinese,” and “inner-city Hispanics.”35
As Dobzhansky has taught us, populations have varying allele frequencies in all their genes. That concept was the basis of the shift from a typological (or fixed) understanding of racial differences to an evolutionary synthesis understanding of racial difference, which understood population-level differences in terms of variation of allele frequencies between racial groups. Remember also that Dobzhansky argued that how we choose to arrange these groups was a methodological decision rather than a reflection of the natural order of things. This basic fact of population and evolutionary biology has led some to be highly critical of the race-based approach to personalized medicine.
To highlight how gene frequencies vary both within and between populations, Craig Venter and his colleagues used the published gene sequences of two self-identified Caucasian men, himself and James Watson, to show how two white individuals could have very different pharmacogenomic profiles. Venter, for example, carries a form of CYP2D6 that makes him an “extensive metabolizer” of the drugs that gene is known to help metabolize, including codeine, antipsychotics, and antidepressants. Watson, on the other hand, has a variant of that gene that makes him an “intermediate metabolizer” of this class of drugs. From a race-based pharmacogenomics perspective, Venter’s genetic profile was predictable. Watson’s allele, however, is rare among self-identified Caucasians but seen at high frequency in East Asian populations. Had race alone been used to assess his ability to metabolize this class of drugs, Watson would not have received the proper care. Venter and his colleagues argue that this and other examples illustrate why the best choice is looking directly at an individual’s “genomic sequence instead of relying on a patient’s appearance or self-identified ethnicity.” Venter and his team also point out that gene variants do not always behave as expected: “Although these complications may be due to other genetic differences, cultural factors such as diet and environment can also influence drug response.”36
What is interesting in the context of the history of the race concept, however, is how these examples (the HGDP, the HapMap Project, and pharmacogenomics) illustrate similar patterns of racial research in the genomic age. That is, scientists readily claim that the race concept is a reasonable proxy for genetic diversity even as they recognize its limited utility as a classificatory tool. More recently, two meetings sponsored by the National Human Genome Research Institute in 2007 and 2008 sought to address the current challenges of studying human genetic diversity, particularly given the burgeoning interest in the relationship between self-identified race and health disparities. At the 2007 conference, “Frontiers in Population Genomics Research Meeting,” one speaker called for scientists to “eliminate reliance on the construct of race” and for population geneticists to “engage populations that adequately reflect human diversity.”37 The 2008 meeting, “Understanding the Role of Genomics in Health Disparities: Toward a New Research Agenda,” was jointly sponsored by the National Human Genome Research Institute, the National Cancer Institute, and the National Center on Minority Health and Health Disparities of the National Institutes of Health (NIH). Attendees, representing many NIH institutes as well as natural and social scientists with a research interest in disparities, made several recommendations based on two days of lectures and discussions.
In an apparent attempt to, at the very least, address how the race concept is used in scientific studies, the group recommended that NIH Funding Opportunity Announcements “should require that investigators justify their use of racial and ethnic categories relative to the questions that they are asking and the methods they are using in particular research.” The group also recommended the development of “statistical tools to analyze complex populations as a single unit based on genomic/molecular profile (or characteristics) rather than stratify by race and ethnicity.” The report remains unpublished and it is unclear what, if any, impact its recommendations will have.38 Despite the best intentions of such reports, it has been over fifty years since Dobzhansky issued his challenge to the field in 1962—he said then that “the problem that now faces the science of man is how to devise better methods for further observations that will give more meaningful results”—and we are still struggling with the meaning of race in biology.39
A more immediate problem is that for now, despite some efforts to find useful solutions to these challenges, the NIH still reifies racial categories in its grant applications; scientists are required to describe recruitment strategies for human subjects that emphasize traditional racial classifications. Scientists working at or funded by the NIH are mandated to report race based on the U.S. Census categories and following the standards set forth by the White House Office of Management and Budget Directive No. 15, which “defines minimum standards for maintaining, collecting and presenting data on race and ethnicity for all Federal reporting.”40 The “Targeted/Planned Enrollment Table,” a fixture in all NIH grant applications that include human subjects, divides humanity into five racial groups: “American Indian/Alaska Native,” “Asian,” “Native Hawaiian or Other Pacific Islander,” “Black or African American,” and “White.” Although the form was recently updated to include categories for “more than one race” and “unknown or not reported,” it still reinforces the most antiquated notions of this fundamentally flawed concept, making it more difficult to explore the subtlety and complications inherent to human genetic diversity.41 So long as the NIH is obligated to follow Office of Management and Budget rules, the system will perpetuate this racial paradigm.
While the NIH meetings were a call to action for the field, some have more directly attempted to address Dobzhansky’s challenge to devise better methods to understand human diversity. The molecular epidemiologist Timothy Rebbeck and bioethicist Pamela Sankar at the University of Pennsylvania, for example, acknowledge that “much of the persistent controversy over the use of the terms ‘ethnicity,’ ‘ancestry,’ or ‘race’ may be attributable to the imprecision of their use.” They believe that defining the terminology of race on a study-by-study basis may be a way to simultaneously embrace the inconsistencies of the race concept and find meaning in its specific context. Yet ultimately, they recognize that as a measure, race “may have utility in increasing study efficiency or reducing confounding,” but “an important future direction for research will be to develop new measures that correlate with SIRE [self-identified race or ethnicity] that may better reflect the complex nature of this variable.”42
If these struggles over race at the outset of the genomic age can teach us anything about the history of the race concept, it is that even a generation of scientists reared in the wake of the evolutionary synthesis and population genetics (who had been trained to reject typology as a component of their analysis of populations) still struggle with utilizing a concept that has such a contradictory and sometimes awful history. These scientists—contradictions and all—indeed reflect and are working in the traditions of Dobzhansky and the evolutionary synthesis.