2
E pluribus unum

What we call the beginning is often the end
And to make an end is to make a beginning.
The end is where we start from.
T. S. Eliot, ‘Little Gidding’ (Four Quartets)

The study of human diversity was, until the twentieth century, limited to variation that could be observed with the naked eye. Such variation was the subject of countless studies by Broca, Galton and the biometricians in Europe and America, and this era marked a ‘collection’ phase of physical anthropology – the early stages of a new field of scientific enquiry, when there is no unifying theory with which to analyse the data accumulated. There was only one problem with the growing mass of data on human morphological variation – there was no simple correspondence between the newly rediscovered laws of heredity and the characters being measured. While there is certainly a genetic component to human morphology, it is clear that dozens – probably hundreds – of separate genes control this variability. Even today, the underlying genetic causes have yet to be deciphered. Consider Broca’s craniometric studies: if a particular bump on the skull is found in two unrelated individuals, does it necessarily represent the same genetic change? Are the bumps really the same characteristic, and thus representative of a true genetic relationship, or do they simply resemble each other superficially – by chance? It was impossible to know.

Genetic variation was critical for the study of human diversity because it is genetic change that actually produces evolution. At its most basic level, evolution is simply a change in the genetic composition of a species over time. Thus in order to assess how closely related individuals are – in particular whether they form a single species – it is important to know something about their genes. If the genes are the same, then they are the same species. What physical anthropology desperately needed was a collection of varying traits – known as polymorphisms, from the Greek for ‘many forms’ – with a simple pattern of inheritance. These could then be used to study human diversity in an effort to categorize it. Some traits like this were already known, particularly diseases like haemophilia. The problem with disease-causing polymorphisms was that they were simply too rare to be of any use in classification. Common, genetically simple polymorphisms were critical.

These arrived in 1901, when Karl Landsteiner noticed an interesting reaction upon mixing the blood from two unrelated people: some of the time it clumped together, forming large clots. This coagulation reaction was shown to be heritable, and it constituted the first demonstration of biochemical diversity among living humans. This experiment led to the definition of human blood groups, which would soon be applied to transfusions all over the world. If your doctor tells you that you have type A blood, this is actually the name given by Landsteiner to the first blood group polymorphism over a century ago.

Building on Landsteiner’s insight, a Polish couple named Hirszfeld began to test the blood of soldiers in Salonika during the First World War. In a 1919 publication, they noted different frequencies of blood groups among the diverse nationalities thrown together by the hostilities – the first direct survey of human genetic diversity. The Hirszfelds even formulated a theory (accepted by some to this day) in which the A and B blood groups represent the traces of ‘pure’ populations of aboriginal humans, each composed entirely of either A or B individuals. These pure races later became mixed through migration, leading to the complicated patterns of A and B seen in their study. They failed to explain how the two races may have arisen, but given that group A was thought to have originated in northern Europe, while B was a southern marker at highest frequency in India, the implication was that there must have been two independent origins of modern humans.

In the 1930s an American named Boyd and an Englishman named Mourant, building on the work of the Hirszfelds, began to test blood samples from around the world. Over the next thirty years these two men and their colleagues would examine thousands of people, from hundreds of populations, both living and dead. Boyd and his wife (like the Hirszfelds, another of the marital duos in population genetics) even went so far as to test American and Egyptian mummies, establishing the ancient nature of the ABO polymorphisms. In 1954 Mourant drew together the rapidly expanding body of blood group data in the first comprehensive summary of human biochemical diversity, The Distribution of the Human Blood Groups – a seminal work that became the standard text of experimental human population genetics for the next twenty years. This was the beginning of the modern era of human genetics.

While the Hirszfelds clearly felt that their data on blood groups supported a racial classification that had become blurred by recent migration, and Carleton Coon later used them to support his theories of discrete subspecies, no one had actually tested the genetic data to see if there was any real indication of racial subdivision. This obvious analysis was finally carried out in 1972 by a geneticist whose primary research interest, oddly enough, was fruit flies – not humans.

Using the data collected by Mourant and others, Richard Lewontin, then a professor at the University of Chicago, performed a seemingly trivial study of how human genetic variation sorted into within- versus between-group components. The question he was trying to answer, objectively, was whether there was any indication in the genetic data of a distinct subdivision between human races. In other words, he was directly testing the hypotheses of Linnaeus and Coon about human subspecies. If human races showed significant differences in their patterns of genetic diversity, then Linnaeus and Coon must be right.

Lewontin describes the development of the analysis:

The paper was written in response to a request … to contribute an article to the new journal Evolutionary Biology. I had been thinking at that time about diversity measures … not in the context of population genetics, but in the context of ecology. I had to take a very long bus trip to Bloomington, Indiana, and I had long had the habit, when going on trains and buses, of writing papers. I needed to write this paper, so I went on the bus trip with a copy of Mourant and a table of p ln p [a mathematical table used for calculating the diversity measure].

On this bus trip, he began what would become one of the landmark studies in human genetics. In the analysis, Lewontin used as his model the new science of biogeography (the study of animal and plant geographic distributions) because he thought this was analogous to what he was doing with humans – looking for geographic subdivisions in order to define race. In fact, unsure of how to define a ‘race’ objectively, he divided humans largely along geographical lines – Caucasians (western Eurasia), Black Africans (sub-Saharan Africa), Mongoloids (east Asia), South Asian Aborigines (southern India), Amerinds (Americas), Oceanians and Australian Aborigines.

The surprising result he obtained was that the majority of the genetic differences in humans were found within populations – around 85 per cent of the total. A further 7 per cent served to differentiate populations within a ‘race’, such as the Greeks from the Swedes. Only 8 per cent were found to differentiate between human races. A startling conclusion – and clear evidence that the subspecies classification should be scrapped. Lewontin says of the result:

I had no expectation – I honestly didn’t. If I had any prejudice, it probably was that the between-race difference would have been a lot larger. This was reinforced by the fact that, when my wife and I were in Luxor [Egypt], years before it was overrun with tourists, she got in a discussion with a guy in the lobby. He was talking to her as if he knew her. She kept saying ‘I’m sorry, sir, you’ve mistaken me for someone else.’ Finally he said, ‘Oh, I’m sorry madam – you all look alike to me.’ That really had a big effect on my thinking – they really are different from us, and we’re all alike.

But the result was there in the statistical analysis, and it has been confirmed by many other studies over the past three decades. The exact size of the small proportion of genetic variation that distinguishes between human populations has been debated endlessly, but the fact remains that a small population of humans still retains around 85 per cent of the total genetic diversity found in our species. Lewontin likes to give the example that if a nuclear war were to happen, and only the Kikuyu of Kenya (or the Tamils, or the Balinese …) survived, then that group would still have 85 per cent of the genetic variation found in the species as a whole. A strong argument indeed against ‘scientific’ theories of racism – and clear support for Darwin’s assessment of human diversity in the 1830s. It really was a case of ‘out of many, one’, as the title of this chapter says in Latin. But does this mean that the study of human groups is meaningless – can genetics really tell us anything about human diversity?
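Lewontin’s apportionment can be sketched in a few lines of modern code. The fragment below is not his original analysis – the allele frequencies are invented purely for illustration – but it shows the logic: compute a Shannon-style diversity measure (the ‘p ln p’ of his bus-trip table) for the pooled species and for each population, and see how much of the total is already present within the groups.

```python
import math

def shannon_diversity(freqs):
    """Diversity measure H = -sum(p * ln p) over a set of allele frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

# Invented allele frequencies for one two-allele polymorphism in four populations.
populations = {
    'pop1': [0.55, 0.45],
    'pop2': [0.60, 0.40],
    'pop3': [0.45, 0.55],
    'pop4': [0.50, 0.50],
}

# Diversity of the pooled 'species': average the allele frequencies, then measure.
pooled = [sum(f[i] for f in populations.values()) / len(populations) for i in range(2)]
h_total = shannon_diversity(pooled)

# Average diversity found within a single population.
h_within = sum(shannon_diversity(f) for f in populations.values()) / len(populations)

print(f'within-group share:  {h_within / h_total:.1%}')
print(f'between-group share: {(h_total - h_within) / h_total:.1%}')
```

With the real blood-group frequencies collected by Mourant and others, the within-group share works out at around 85 per cent – Lewontin’s result.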

Forcing the issue

For the next step on our journey, we need to cover some basic population genetics. The theory of how genes in a population behave over time is fairly complicated, and makes use of many related branches of quantitative science. Statistical mechanics, probability theory and biogeography have all contributed to our understanding of population genetics. But many of the theoretical frameworks are based on a few key concepts that can be understood by anyone, reflecting the relative simplicity of the forces involved.

The most basic force is mutation, and without it polymorphism would not exist. By mutation I mean a random change in a DNA sequence – these occur at a rate of around thirty per genome per generation. Looking at it another way, each person alive today is carrying around thirty completely novel mutations that distinguish them from their parents. Mutations are random because they arise as copying mistakes during the process of cell division, with no particular rhyme or reason as to where those mistakes might occur – our genomes do not appear to favour certain types of mutation based on what the effect might be. Rather, we are like Heath Robinson engineers, forced to make use of what we are given in the mutational lottery. The blood group variants discovered by Landsteiner originated as mutations, as do all other polymorphisms.

The second force is known as selection, in particular natural selection. This is the force that Darwin got so excited about, and it has certainly played a critical role in the evolution of Homo sapiens. Selection acts by favouring certain traits over others by conferring a reproductive advantage on their bearers. For example, in cold climates animals with thick fur would have an advantage over hairless ones, and their offspring would be more likely to survive. Selection is certainly what has made us the sentient, cultured apes we are today. It is what produced the important traits of speech, bipedalism and opposable thumbs. Without natural selection we would still be very similar to the relatively unsophisticated ape-like ancestor we would encounter if we could go back in time 5 million years or so.

The third force is known as genetic drift. This is a rather specialized term for something we have an innate sense of – the tendency of small samples to reflect a biased view of the population from which they are drawn. If you flip a coin 1,000 times, you expect to get around 500 heads and 500 tails. If, on the other hand, you flip the coin only 10 times, it is quite likely that you will get something other than a 5–5 outcome – perhaps 4–6 or 7–3. This random fluctuation in a sampled group is due to the small number of individual events in the sample. If we think of people as genetically sampled ‘events’, and assume that the population from which we will draw the sample for the next generation is created anew in the present generation (as is the case for living organisms), then you can see that small population sizes can lead to drastic changes in gene frequency within only a few generations. In the case of our coin flippers, getting a result of 7–3 would be reflected in the likelihood of flipping that number in the next generation, with a 70 per cent chance of getting heads and 30 per cent of getting tails. It’s like a ratchet, because the probability change in the previous generation affects the probability in the subsequent generations. In the coin-flipping analogy, we’ve gone from a frequency of 50 per cent to 70 per cent in a single generation – a pretty rapid change. Clearly, drift can have a huge effect on gene frequencies in small populations.
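The coin-flipping intuition is easy to turn into a simulation. The sketch below is illustrative only, with arbitrary population sizes: it follows a neutral variant that starts at 50 per cent, resampling the population each generation. In a tiny population the frequency lurches about and can be lost or fixed within a few generations, while in a large one it barely moves.

```python
import random

def drift(freq, pop_size, generations):
    """Track one neutral variant as a finite population is resampled each generation."""
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the next generation is an independent 'coin flip'
        # weighted by the variant's current frequency.
        copies = sum(1 for _ in range(pop_size) if random.random() < freq)
        freq = copies / pop_size
        trajectory.append(round(freq, 2))
    return trajectory

random.seed(1)
print(drift(0.5, 10, 20))    # small population: large, erratic swings
print(drift(0.5, 1000, 20))  # large population: frequency stays close to 0.5
```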

The combination of these three forces has produced the dizzying array of genetic patterns we see today – and the vast diversity we see in human populations. Their action has also produced the small percentage of human variation that distinguishes between human groups. That much was known by the middle of the twentieth century. But simply recognizing the existence of human diversity at a biochemical level, and knowing something about the way genes behave in populations, didn’t really say much about the details of human evolution and migration. Enter an Italian physician with a historical bent and a talent for mathematics, who came to the field influenced by a new way of thinking about bacteria and flies.

The Italian job

Luigi Luca Cavalli-Sforza had started his career in Pavia as a medical student. He soon left medicine to devote himself to genetics research, first on bacteria and later on humans. At university he had studied under the famous Drosophila geneticist Buzzati-Traverso, who was an adherent of the Dobzhansky school of genetics. Theodosius Dobzhansky had also been Richard Lewontin’s PhD supervisor, and the story therefore begins to show a common thread. The main theme of Dobzhansky’s research was the study of genetic variation, in particular large-scale chromosomal rearrangements in fruit flies. He pioneered techniques in genetic analysis, and his laboratory in New York was to be the epicentre of a revolution in biology during the mid-twentieth century. Dobzhansky and his students advocated a new view of genetic variation in which there was no division into an optimized ‘wild type’ (the normal form of the organism, created through a long period of natural selection) and a quirky ‘mutant’, invariably disadvantaged in some way. This was too simplistic, they thought – primarily because there was simply too much variation to account for if most of the mutants were carrying a suboptimal genetic package. If, instead, one thought of variation as the normal state of species, then evolution suddenly made much more sense. There was a previously unrecognized reservoir of different types on which evolution could act – favouring some in one case, losing them in another.

So, with a thorough background in the seemingly disparate fields of fruit-fly variation and medicine, Cavalli-Sforza began to conduct studies of blood polymorphisms – later termed ‘classical’ polymorphisms by geneticists – in an effort to assess the relationships among modern humans. This work was begun in the 1950s, a heady time for the field of genetics. The structure of DNA had just been deciphered by Crick and Watson, and the application of the methodology of the physical sciences promised a revolution in biology. Like most geneticists, Cavalli-Sforza made use of the rapidly developing techniques of biochemistry to assay variation. But unlike many of them, he was also comfortable with the application of mathematics – particularly its most pragmatic branch, statistics. The dizzying variety of the data being generated by studies of polymorphisms needed a coherent theoretical framework to make it understandable. And statistics was about to ride to the rescue.

Imagine a group of anything that exhibits variation – the different colours of stones in a streambed, snail-shell size, fruit-fly wing length, or human blood groups. At first glance these variations seem random and disconnected. If we have multiple sets of such objects, then the picture seems more complex still – even chaotic. What does this variation reveal about the mechanism by which the diversity was generated?

The knee-jerk reaction of most biologists in the 1950s to any pattern of diversity in nature was that selection was the root cause. Human diversity was no exception, as the eugenicists made quite clear. In part this stemmed from the widespread belief in ‘wild types’ and ‘mutants’. The wild type could encompass any trait – size, colour, nose shape, or any other ‘normal’ characteristic of the organism. This was reinforced by the fact that genetic diseases (which were clearly ‘abnormal’) were some of the first variants recognized in humans, setting the stage for a worldview in which people were categorized as fit or unfit according to a Darwinian evolutionary struggle. However, in the 1950s Motoo Kimura, a Japanese scientist working in the United States, began to do some genetic calculations using methods originally derived for analysing the diffusion of gases, formalizing work carried out by Cavalli-Sforza and others. This work would eventually lead the field out of the ‘mutant’ morass.

Kimura noticed that genetic polymorphisms in populations can vary in frequency owing to random sampling errors – the ‘drift’ mentioned above. What was exciting in his results was that drift seemed to change gene frequencies at a predictable rate. The difficulty with studying selection was that the speed with which it produced evolutionary change depended entirely on the strength of selection – if the genetic variant was extremely fit, then it increased in frequency rapidly. However, it was virtually impossible to measure the strength of selection experimentally, so no one could make predictions about the rate of change. In our coin-flipping example, if heads is one variant of a gene and tails is another, then the increase in frequency from 50 per cent to 70 per cent in a single ‘generation’ would imply very strong selection favouring heads. Clearly, though, this isn’t the case – heads increased to 70 per cent for reasons that had nothing to do with how well adapted it was.

Kimura’s insight was that most polymorphisms appear to behave like this – that is they are effectively free from selection, and thus they can be treated as evolutionarily ‘neutral’, free to drift around in frequency due entirely to sampling error. There has been great debate among biologists about the fraction of polymorphisms that are neutral – Kimura and his scientific followers thought that almost all genetic variation was free from selection, while many scientists continue to favour a significant role for natural selection. Most of the polymorphisms studied by human geneticists, though, had probably arrived at their current frequencies because of drift. This opened the door to a new way of analysing the rapidly accumulating data on blood group polymorphisms. But before that could happen, the field needed to make a quick detour through the Middle Ages.

‘Ock the Knife’

William of Ockham (c.1285–1349) was a medieval scholar who must have been a nightmare to be around. Ockham believed literally in Aristotle’s statement that ‘God and nature never operate superfluously, but always with the least effort’, and took every opportunity to invoke his interpretation of this view in arguments with his colleagues. Ockham’s razor, as it became known, was stated quite simply in Latin: Pluralitas non est ponenda sine necessitate (plurality is not to be posited without necessity). In its most basic form, Ockham’s statement is a philosophical commitment to a particular view of the universe – a view that has become known as parsimony. In the real world, if each event occurs with a particular probability, then multiple events occur with multiplied probabilities and, overall, the complex events are less likely than the simple ones. It is a way of breaking down the complexity of the world into understandable parts, favouring the simple over the absurd. I may actually fly from Miami to New York via Shanghai – but it is not terribly likely.

This may seem trivial when applied to my travel schedule, but it is not so obvious when we start to apply it to the murky world of science. How do we really know that nature always takes the most parsimonious path? In particular, is it self-evident that ‘simplify’ is nature’s buzzword? This book is not the forum for a detailed discussion of the history of parsimony (there are several references in the bibliography where the subject is discussed in great detail), but it seems that nature usually does favour simplicity over complexity. This is particularly true when things change – like when a stone drops from a cliff to the valley below. Gravity clearly exerts itself in such a way that the stone moves directly – and rather quickly – from the high to the low point, without stopping for tea in China.

So, if we accept that when nature changes, it tends to do so via the shortest path from point A to point B, then we have a theory for inferring things about the past. This is quite a leap, since it implies that by looking at the present we can say something about what happened before. In effect, it provides us with a philosophical time machine with which to travel back and dig around in a vanished age. Pretty impressive stuff. Even Darwin was an early adherent – Huxley actually scolded him on one occasion for being such a stick-in-the-mud about his belief that natura non facit saltum (nature doesn’t make leaps).

The first application of parsimony to human classification was published by Luca Cavalli-Sforza and Anthony Edwards in 1964.* In this study they made two landmark assumptions which would be adopted in each subsequent study of human genetic diversity. The first was that the genetic polymorphisms were behaving as Kimura had predicted – in other words, they were all neutral, and thus any differences in frequency were due to genetic drift. The second assumption was that the correct relationship among the populations must adhere to Ockham’s rule, minimizing the amount of change required to explain the data. With these key insights, they derived the first family tree of human groups based on what they called the ‘minimum evolution’ method. In effect, this means that the populations are linked in a diagram such that the ones with the most similar gene frequencies are closest together, and that overall the relationship among the groups minimizes the total magnitude of gene frequency differences.
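The flavour of such an analysis can be conveyed with a simple distance-based clustering – a sketch, not Cavalli-Sforza and Edwards’ actual algorithm, and the gene frequencies below are invented. Populations with the most similar frequencies are joined first, and each joined group is represented by its average frequencies, so the arrangement that requires the least total change in gene frequency emerges step by step.

```python
import math

# Invented frequencies of two blood-group alleles in four hypothetical populations.
freqs = {'pop_A': [0.10, 0.60], 'pop_B': [0.15, 0.55],
         'pop_C': [0.40, 0.30], 'pop_D': [0.45, 0.25]}

def distance(p, q):
    """Euclidean distance between two populations' gene-frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

profiles = dict(freqs)
while len(profiles) > 1:
    # Find and join the two groups with the most similar frequencies.
    x, y = min(((a, b) for a in profiles for b in profiles if a < b),
               key=lambda pair: distance(profiles[pair[0]], profiles[pair[1]]))
    print('join:', x, '+', y)
    merged = [(a + b) / 2 for a, b in zip(profiles.pop(x), profiles.pop(y))]
    profiles[f'({x},{y})'] = merged
```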

Cavalli-Sforza and Edwards looked at blood group frequencies from fifteen populations living around the world. The result of this analysis, laboriously calculated by an early Olivetti computer, was that Africans were the most distant of the populations examined, and that European and Asian populations clustered together. It was a startlingly clear insight into our species’ evolutionary history. As Cavalli-Sforza says modestly, the analysis ‘made some kind of sense’, based on their concept of how human populations should be related – European populations were closer to each other than they were to Africans, New Guineans and Australians grouped together, and so on. This was a reflection of similarities in gene frequencies, and since these frequencies changed in a regular way over time (remember genetic drift), it meant that the time elapsed since Europeans started diverging from each other was less than the time separating Europeans from Africans. The old monk had proven useful after 700 years – and anthropology had a way forward.*

With this new approach to human classification, it was even possible to calculate the dates of population splits, making several assumptions about the way humans had behaved in the past, and the sizes of the groups they lived in. This was first done by Cavalli-Sforza and his colleague Walter Bodmer in 1971, yielding an estimate of 41,000 years for the divergence between Africans and East Asians, 33,000 for Africans and Europeans and 21,000 for Europeans and East Asians. The problem was, it was uncertain how reasonable their assumptions about population structure really were. And crucially, it still failed to provide a clear answer to the question of where humans had originated. What the field needed now was a new kind of data.

Alphabet soup

Emile Zuckerkandl was an Austrian-born émigré working at the California Institute of Technology in Pasadena. He spent much of his scientific career tenaciously focused on one problem: the structure of proteins. Working with the Nobel Prize-winning biochemist Linus Pauling in the 1950s and 60s, Zuckerkandl studied the basic structure of the oxygen-carrying molecule haemoglobin – chosen because it was plentiful and easy to purify. Haemoglobin had another important characteristic: it was found in the blood of every living mammal.

Proteins are composed of a linear sequence of amino acids, small molecular building blocks that combine in a unique way to form a particular protein. The amazing thing about proteins is that, although they do their work twisted into baroque shapes, often with several other proteins sticking to them in a complex way, the ultimate form and function of the active protein is determined by a simple linear combination of amino acids. There are twenty amino acids used to make proteins, with names like lysine and tryptophan. These are abbreviated by chemists to single-letter codes – K and W in this case.

Zuckerkandl noticed an interesting pattern in these amino acid sequences. As he started to decipher haemoglobins from different animals, he found that they were similar. Often they had identical sequences for ten, twenty, or even thirty amino acids in a row, and then there would be a difference between them. What was fascinating was that the more closely related the animals were, the more similar they were. Humans and gorillas had virtually identical haemoglobin sequences, differing only in two places, while humans and horses differed by fifteen amino acids. What this suggested to Zuckerkandl and Pauling was that molecules could serve as a sort of molecular clock, documenting the time that has elapsed since a common ancestor through the number of amino acid changes. In a paper published in 1965, they actually refer to molecules as ‘documents of evolutionary history’. In effect, we all have a history book written in our genes. According to Zuckerkandl and Pauling, the pattern written in a molecular structure can even provide us with a glimpse of the ancestor itself, making use of Ockham’s razor to minimize the number of inferred amino acid changes and working back to the likely starting point (see Figure 1). Molecules are, in effect, time capsules left in our genomes by our ancestors. All we have to do is learn to read them.

Figure 1 The evolutionary ‘genealogy’ of two related molecules, showing sequence changes accumulating on each lineage.
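The clock arithmetic itself is straightforward. The sketch below counts the differences between two aligned protein fragments and converts them into a divergence time, assuming – purely for illustration – a fixed substitution rate; the sequences and the rate are invented, not Zuckerkandl and Pauling’s data.

```python
def divergence_time_myr(seq1, seq2, rate_per_site_per_myr):
    """Naive molecular clock: changes accumulate on both lineages after the split,
    so elapsed time = (differences per site) / (2 * rate)."""
    assert len(seq1) == len(seq2), 'sequences must be aligned to the same length'
    diffs = sum(a != b for a, b in zip(seq1, seq2))
    return (diffs / len(seq1)) / (2 * rate_per_site_per_myr)

species_1 = 'VLSPADKTNVKAAWGKVGAH'   # invented 20-residue fragments,
species_2 = 'VLSAADKTNVKGAWGKVGAH'   # differing at two positions
print(divergence_time_myr(species_1, species_2, rate_per_site_per_myr=0.005),
      'million years')
```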

Of course, Zuckerkandl and Pauling realized that proteins were not the ultimate source of genetic variation. This honour lay with DNA, the molecule that actually forms our genes. If DNA encodes proteins (which it does), then the best molecule to study would be the DNA itself. The problem was that DNA was extremely difficult to work with, and getting a sequence took a long time. In the mid-1970s, however, Walter Gilbert and Fred Sanger independently developed methods for rapidly obtaining DNA sequences, for which they would share a Nobel Prize in 1980. The ability to sequence DNA set off a revolution in biology that has continued to this day, culminating in 2000 with the completion of a working draft of the entire human genome sequence. DNA research has revolutionized the way we think about biology, so it isn’t surprising that it has had a significant effect on anthropology as well.

The crowded garden

So we find ourselves in the 1980s with the newly developed tools of molecular biology at our disposal, a theory for how polymorphisms behave in populations, a way to estimate dates from molecular sequence data and the burning question of how genetics can answer a few age-old questions about human origins. What the field needed now was a lucky insight and a bit of chutzpah. Both of these were to be found in the early 1980s in the San Francisco Bay area of northern California.

Allan Wilson was a New Zealand-born biochemist working at the University of California, Berkeley, on methods of evolutionary analysis using molecular biology – the new branch of biology that focused on DNA and proteins. Using the methods of Zuckerkandl and Pauling, he and his students had used molecular techniques to estimate the date of the split between humans and apes, and they had also deciphered some of the intricate details of how natural selection can tailor proteins to their environments. Wilson was an innovative thinker, and he embraced the techniques of molecular biology with a passion.

One of the problems that molecular biologists encountered in studying DNA sequences was the duplicate nature of the information. Inside each of our cells, what we think of as our genome – the complete DNA sequence that encodes all of the proteins made in our bodies, in addition to a lot of other DNA that has no known function – is really present in two copies. The DNA is packaged into neat, linear components known as chromosomes – we have twenty-three pairs of them. Chromosomes are found inside a cellular structure known as the nucleus. One of the main features of our genome is the astounding compartmentalization – like computer folders within folders within folders. In all there are 3,000,000,000 (3 billion) building blocks, known as nucleotides (which come in four flavours: A, C, G and T), in the human genome, and we need some way to get at all of the information it contains in a straightforward way. This is why we have chromosomes, and why they are kept squirrelled away from the rest of the cell inside the nucleus.

The reason we have two copies of each chromosome is more complicated, but it comes down to sex. When a sperm fertilizes an egg, one of the main things that happens is that part of the father’s genome and part of the mother’s genome combine in a 50 : 50 ratio to form the new genome of the baby. Biologically speaking, one of the reasons for sex is that it generates new genomes every generation. The new combinations arise, not only at the moment of conception with the 50 : 50 mixing of the maternal and paternal genomes, but also prior to that, when the sperm and egg themselves are being formed. This pre-sexual mixing, known as genetic recombination, is possible because of the linear nature of the chromosomes – it is relatively easy to break both chromosomes in the middle and reattach them to their partners, forming new, chimeric chromosomes in the process. The reason why this occurs, as with the mixing of Mum’s and Dad’s DNA, is that it is probably a good thing, evolutionarily speaking, to generate diversity in each generation. If the environment changes, you’ll be ready to react.

But wait, you might say, why are these broken and reattached chromosomes any different from the ones that existed before? They were supposed to be duplicates! The reason, quite simply, is that they aren’t exact copies of each other – they differ from each other at many locations along their length. They are like duplicates of duplicates of duplicates of duplicates, made with a dodgy copying machine that introduces a small number of random errors every time the chromosomes are copied. These errors are the mutations mentioned above, and the differences between each chromosome in a pair are the polymorphisms. Polymorphisms are found roughly every 1,000 nucleotides along the chromosome, and serve to distinguish the chromosomes from each other. So, when recombination occurs, the new chromosomes are different from the parental types.

The evolutionary effect of recombination is to break up sets of polymorphisms that are linked together on the same piece of DNA. Again, this diversity-generating mechanism is a good thing evolutionarily speaking, but it makes life very difficult for molecular biologists who want to read the history book in the human genome. Recombination allows each polymorphism on a chromosome to behave independently from the others. Over time the polymorphisms are recombined many, many times, and after hundreds or thousands of generations, the pattern of polymorphisms that existed in the common ancestor of the chromosomes has been entirely lost. The descendant chromosomes have been completely shuffled, and no trace of the original deck remains. The reason this is bad for evolutionary studies is that, without being able to say something about the ancestor, we cannot apply Ockham’s razor to the pattern of polymorphisms, and we therefore have no idea how many changes really distinguish the shuffled chromosomes. At the moment, all of our estimates of molecular clocks are based on the rate at which new polymorphisms appear through mutation. Recombination makes it look like there have been mutations when there haven’t, and because of this it causes us to overestimate the time that has elapsed since the common ancestor.

One of the insights that Wilson and several other geneticists had in the early 1980s was that if we looked outside of the genome, at a small structure found elsewhere in the cell known as the mitochondrion, we might have a way of cheating the shuffle. Interestingly, the mitochondrion has its own genome – it is the only cellular structure other than the nucleus that does. This is because it is actually an evolutionary remnant from the days of the first complex cells, billions of years ago – the mitochondrion is what remains of an ancient bacterium which was swallowed by one of our single-celled ancestors. It later proved useful for generating energy inside the cell, and now serves as a streamlined sub-cellular power plant, albeit one that started life as a parasite. Fortunately, the mitochondrial genome is present in only one copy (like a bacterial genome), which means that it can’t recombine. Bingo. It also turns out that, instead of having one polymorphism roughly every 1,000 nucleotides, it has one every 100 or so. To make evolutionary comparisons we want to have as many polymorphisms as possible, since each polymorphism increases our ability to distinguish between individuals. Think of it this way: if we were to look at only one polymorphism, with two different forms A and B, we would sort everyone into two groups, defined only by variant A or variant B. On the other hand, if we looked at ten polymorphisms with two variants each, we would have much better resolution, since the likelihood of multiple individuals having exactly the same set of variants is much lower. In other words, the more polymorphisms we have, the better our chances of inferring a useful pattern of relationships among the people in the study. Since polymorphisms in mitochondrial DNA (mtDNA) are ten times more common than in the rest of our genome, it was a good place to look.
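The gain in resolution from extra polymorphisms is simple arithmetic. Assuming, just for illustration, independent sites each carrying two equally common variants, the chance that two unrelated lineages happen to match at every site shrinks rapidly as sites are added:

```python
# The probability that two random lineages carry the same variant at one such site
# is 0.5**2 + 0.5**2 = 0.5; across n independent sites it is 0.5 ** n.
for n_sites in (1, 10, 100):
    print(f'{n_sites:>3} polymorphisms -> chance of a complete match: {0.5 ** n_sites:.2e}')
```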

Rebecca Cann, as part of her PhD work in Wilson’s laboratory, began to study the pattern of mtDNA variation in humans from around the world. The Berkeley group went to great lengths to collect samples of human placentas (an abundant source of mtDNA) from many different populations – Europeans, New Guineans, Native Americans and so on. The goal was to assess the pattern of variation for the entire human species, with the aim of inferring something about human origins. What they found was extraordinary.

Cann and her colleagues published their initial study of human mitochondrial diversity in 1987. It was the first time that human DNA polymorphism data had been analysed using parsimony methods to infer a common ancestor and estimate a date. In the abstract to the paper they state the main finding clearly and succinctly: ‘All these mitochondrial DNAs stem from one woman who is postulated to have lived about 200,000 years ago, probably in Africa.’ The discovery was big news, and this woman became known in the tabloids as Mitochondrial Eve – the mother of us all. In a rather surprising twist, though, she wasn’t the only Eve in the garden – only the luckiest.

The analysis performed by Cann and her colleagues involved asking how the mtDNA sequences were related to each other. In their paper they assumed that if two mtDNA sequences shared a sequence variant at a polymorphic site (say, a C at a position where the sequences had either a C or a T), then they shared a common ancestor. By building up a network of the mtDNA sequences – 147 in all – they were able to infer the relationships between the individuals who had donated the samples. It was a tedious process, and involved a significant amount of time analysing the data on a computer. What their results showed was that the greatest divergence between mtDNA sequences was actually found among the Africans – showing that they had been diverging for longer. In other words, Africans are the oldest group on the planet – meaning that our species had originated there.

Figure 2 Proof that modern humans originated in Africa – the deepest split in the genealogy of mtDNA (‘Eve’) is between mtDNA sequences from Africans, showing that they have been accumulating evolutionary changes for longer.

One of the features of the parsimony analysis used by Cann, Stoneking and Wilson to analyse their mtDNA sequence data is that it inevitably leads back to a single common ancestor at some point in the past. For any region of the genome that does not recombine – in this case, the mitochondrion – we can define a single ancestral mitochondrion from which all present-day mitochondria are descended. It is like looking at an expanding circle of ripples in a pond and inferring where the stone must have dropped – in the dead centre of the circle. The evolving mtDNA sequences, accumulating polymorphisms as they are passed from mother to daughter, are the expanding waves, and the ancestor is the point where the stone entered the water. By applying Zuckerkandl and Pauling’s methods of analysis, we can ‘see’ the single ancestor that lived thousands of years ago, and which has mutated over time to produce all of the diverse forms that exist today. Furthermore, if we know the rate at which mutations occur, and we know how many polymorphisms there are by taking a sample of human diversity from around the globe, then we can calculate how many years have elapsed from the point when the stone dropped – in other words, to the ancestor from whom all of the mutated descendants must have descended.
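Turning the ripples back into a date amounts to one division. The figures below are invented to show the arithmetic only – they are not the published mitochondrial estimates – but the logic is the one just described: observed differences per site, divided by twice the mutation rate, give the years back to the common ancestor.

```python
def years_to_common_ancestor(mean_pairwise_diffs, sites, rate_per_site_per_year):
    """Two descendant lineages diverge from their ancestor at twice the per-lineage rate."""
    diffs_per_site = mean_pairwise_diffs / sites
    return diffs_per_site / (2 * rate_per_site_per_year)

# Hypothetical figures: 12 differences on average between two 400-site stretches of
# mtDNA, with an assumed rate of 1e-7 changes per site per year.
print(years_to_common_ancestor(12, 400, 1e-7), 'years')   # -> 150000.0
```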

Crucially, though, the fact that a single ancestor gave rise to all of the diversity present today does not mean that this was the only person alive at the time – only that the descendant lineages of the other people alive at the same time died out. Imagine a Provencal village in the eighteenth century, with ten families living there. Each has its own special recipe for bouillabaisse, but it can only be passed on orally from mother to daughter. If the family has only sons, then the recipe is lost. Over time, we gradually reduce the number of starting recipes, because some families aren’t lucky enough to have had girls. By the time we reach the present century we are left with only one surviving recipe – la bouillabaisse profonde. Why did this one survive? By chance – the other families simply didn’t have girls at some point in the past, and their recipes blew away with the mistral. Looking at the village today, we might be a little disappointed at its lack of culinary diversity. How can they all eat the same fish soup?

Of course, in the real world, no one transmits a recipe from one generation to the next without modifying it slightly to fit her own tastes. An extra clove of garlic here, a bit more thyme there, and voilà! – a bespoke variation on the matrimoine. Over time, these variations on a theme will produce their own diversity in the soup bowls – but the recipe extinction continues none the less. If we look at the bespoke village today we see a remarkable diversity of recipes – but they can still be traced back to a single common ancestor in the eighteenth century, thanks to Ock the Knife. This is the secret of Mitochondrial Eve.

The results from the 1987 study by Cann and her colleagues were followed up by a more detailed analysis a few years later, and both studies pointed out two important facts: that human mitochondrial diversity had been generated within the past 200,000 years, and that the stone had dropped in Africa. So, in a very short period of time – at least in evolutionary terms – humans had spread out of Africa to populate the rest of the world. There were some technical objections to the statistical analysis in the papers, but more extensive recent studies of mitochondrial DNA have confirmed and extended the conclusions of the original analysis. We all have an African great-great … grandmother who lived approximately 150,000 years ago.

Darwin, in his 1871 book on human evolution The Descent of Man, and Selection in Relation to Sex, had noted that ‘in each great region of the world the living mammals are closely related to the extinct species of the same region. It is therefore probable that Africa was formerly inhabited by extinct apes closely allied to the gorilla and chimpanzee; and as these two species are now man’s nearest allies, it is somewhat more probable that our early progenitors lived on the African continent than elsewhere.’ In some ways this statement is incredibly far-sighted, since most nineteenth-century Europeans would have placed Adam and Eve in Europe or Asia. In other ways it is rather trivial, since apes originated in Africa around 23 million years ago, so if we go back far enough we are eventually bound to encounter our ancestors in this continent. The key is to give a date – and this is why the genetic results were so revolutionary.

Anthropologists such as Carleton Coon had argued for the origin of human races through a process of separate speciation events from ape-like ancestors in many parts of the world. This hypothesis became known as multiregionalism, and it persists in some anthropological circles even today. The basic idea is that ancient hominid, or humanlike, species migrated out of Africa over the course of the past couple of million years or so, establishing themselves in east Asia very early on, and then evolving in situ into modern-day humans – in the process creating the races identified by Coon. To understand why this theory was so widely accepted, we need to leave aside DNA for a while and rummage around in some old bones.

Dutch courage

Linnaeus named our species Homo sapiens, Latin for ‘wise man’, because of our uniquely well-developed intellect. Since the nineteenth century, however, it has been known that other hominid species existed in the past. In 1856, for instance, a skull was discovered in the Neander Valley of western Germany. In pre-Darwinian Europe the bones were originally thought to be the remains of a malformed modern human, but they were later recognized as belonging to a widespread and distinct species of ancestral hominid, christened Neanderthal Man after the site of its discovery. This was the first scientific recognition of a human ancestor, and provided concrete evidence that the hominid lineage had evolved over time. By the end of the nineteenth century the race was well and truly on to find other ‘missing links’ between humans and apes. And in 1890 a Dutch army doctor stationed in the East Indies hit the jackpot.

Eugène Dubois was obsessed with human evolution, and his medical appointment in the Far East was actually part of an elaborate plan to bring him closer to what he saw as the cradle of humanity. Born in 1858 in Eijsden, Holland, Dubois specialized in anatomy at medical school. By 1881, he had been appointed as an assistant at the University of Amsterdam, but he found academic life to be too confining and hierarchical. So, in 1887, he packed up his worldly belongings and convinced his wife to set off with him on a quest to find hominid remains.

Dubois believed that humans were most closely related to gibbons, apes found in the Indo-Malaysian archipelago. This was because of their skull morphology (lack of a massive, bony crest on the top and a flatter face than that found in other apes) and the fact that they sometimes walked erect on their hind legs – both reasonable enough pieces of evidence, he thought, to look for the missing link in south-east Asia. His first excavations in Sumatra yielded only the relatively recent remains of modern humans, orang-utans and gibbons, but when he turned his attention to Java his luck changed.

In 1890 Dubois was sifting through fossils recovered from a river-bank at Trinil, in central Java, when he found a rather odd skullcap. To him it looked like the remains of an extinct chimpanzee known as Anthropopithecus, although without benefit of a good anatomical collection for comparison (he was in a colonial outpost, after all) it was difficult to be certain. The following year, however, a femur recovered from the same location threw the specimen into a whole new light. The leg bone was clearly not from a climbing chimpanzee, but rather from a species that walked upright. His calculations of the cranial capacity, or brain size, of the new find, in combination with its upright stance, led him to make a bold leap of faith. He named the new species Pithecanthropus erectus, Latin for ‘erect ape-man’. This was the missing link everyone had been searching for.

The main objection to Dubois’ discovery – battled out in public debates and carefully worded publications over the next decade – was that there was very little evidence that the skull and femur (and a tooth that was later found at the site) had actually come from the same individual. They were excavated at different times, and the relationship between the soil layers from which they had been recovered was unknown. Later finds of Pithecanthropus did reveal the Trinil femur to be anomalous, and it seems likely that it actually belongs to a more modern human. The tooth may well be that of an ape. Despite this, and despite Dubois’ incorrect assertion that the remains proved that modern humans had originated in south-east Asia from gibbon-like ancestors, the discovery of the Trinil skullcap was a watershed event in anthropology. The Javanese ape-man was clearly a long-extinct human ancestor – one with a cranial capacity much lower than our own, but still far above the range seen in apes. Although he got it wrong in so many ways, Dubois had got it right where it counted.

The competition to find other hominid remains intensified in the early twentieth century, with the lion’s share of the activity focused on east Asia and Africa. The discovery of Pithecanthropus-like fossils in the 1920s and 30s at Zhoukoudian, China, showed that Dubois’ ape-man had been widespread in Asia. The uniting of the Zhoukoudian Sinanthropus (‘Peking man’) with Pithecanthropus (‘Java man’) in the 1950s provided the first clear evidence for a widespread, extinct species of hominid: Homo erectus. But the most amazing finds were to come from Africa, starting with the work of Raymond Dart in the 1920s.

In 1922 Dart was appointed Professor of Anatomy at the University of the Witwatersrand in South Africa. This must have come as a bit of a blow to the academically high-flying Australian (who was previously based in Britain), since ‘Wits’ at that time was a scientific backwater. Nonetheless, he set about building the foundation of an academic Department of Anatomy in the newly created university, which involved the establishment of a collection of anatomical specimens. He urged his students to send him material, and after one of them found a fossil baboon skull from a limestone quarry at Taung, Dart felt that he was on to something interesting.

Up to this point, most fossilized human remains had come from Europe and Asia: Neanderthal, Peking Man, Java Man – all were found outside Africa. In 1921, however, a Neanderthal-like skull was unearthed in Northern Rhodesia (now Zambia), proving that Africa had an ancient hominid pedigree as well. Dart was well aware of this when he contacted the owner of the Taung quarry to send him additional samples of material. What he found in the first crates to arrive in the summer of 1924 was, to his great delight, the oldest human fossil yet discovered.

As he painstakingly picked off the compressed rubbish accumulated over aeons in the Taung cave, Dart revealed an ape-like face staring back at him. Its small size and intact milk teeth immediately gave it away as a child’s skull, and Dart’s estimate of its cranial capacity revealed it to be well within the normal range found in modern apes – around 500 cubic centimetres. What was odd about the find was the size of the canine teeth – much smaller than those of apes – and the location of the foramen magnum, which serves as a conduit for the spinal cord in its connection to the brain: it was orientated downward in the fossil, as in modern humans, rather than backward, as is the case in apes. To Dart, both of these features indicated that the Taung baby, as it became known, was no ordinary simian. In a 1925 paper he asserted that the skull represented the remains of a new species, which he called Australopithecus africanus (‘African southern ape’), that walked upright and used tools. In Dart’s own words, the Southern Ape was ‘one of the most significant finds ever made in the history of anthropology’. It was the first clear evidence for a missing link between apes and humans in Africa, and it set off a tidal wave of human-origins research that was to culminate a few decades later in universal acceptance for the African origin of humanity. However, most of this work was to focus on a region a few thousand miles away, in eastern Africa.

The African Rift Valley is part of a massive line of intense geological upheaval formed by the action of great tectonic plates that make up the earth’s crust. Roughly 2,000 miles long, it stretches from Eritrea in the north to Mozambique in the south, and is most recognizable by the series of lakes along its length – Turkana, Victoria, Tanganyika and Malawi, among others. This longitudinal gash has been a cauldron of activity over the past 20 million years, with volcanoes, lakes, mountains and rivers coming and going at a brisk pace. For this reason, the accumulated layers of millions of years – soil, volcanic ash, lake sediments – are constantly being tossed about and exposed. When this happens in east Africa, interesting things often turn up – all you have to do is look for them.

Louis Leakey had grown up in Kenya. The son of English missionaries, and raised in a Kikuyu village, he had spent his life looking for fossil remains in the valleys and riverbeds of the Rift. In 1959 at Olduvai, in northern Tanzania, his search was to pay off. It was nearing the end of the field season and, with research funds running on empty, Louis and his wife Mary were preparing to return to Nairobi. On the way back to camp one evening Mary stumbled upon a skull exposed by a recent rockslide. After painstakingly excavating the fossil over the next three weeks, the Leakeys returned to their laboratory at the Kenyan National Museum. The detailed analysis of the remains revealed it to be an Australopithecus, the first to be found in east Africa. But the shocker came when the layer of sediment surrounding the skull was dated using the newly developed technique of isotopic analysis, which calculates age based on the rate of radioactive decay. The skull had been buried 1.75 million years ago. This nearly doubled the length of time that most scientists had allowed for human evolution. Yet here was a missing link, midway between apes and humans, dating from that time. The scientific world was amazed – and encouraged. The massive boost in funding that the Leakeys and their colleagues received in the wake of the Olduvai discovery enabled them to find many more Australopithecines in the Rift over the subsequent thirty years.

The discovery of the Southern Ape Man in east Africa pointed the way towards modern humans, but it was only when unequivocal members of our own genus, Homo, were discovered there in the 1960s and 70s that the African origin hypothesis became widely accepted. The earliest Homo erectus fossils yet discovered date from around 1.8 million years ago, and they were found in east Africa (the African variant of Homo erectus is sometimes given the name Homo ergaster). Recent discoveries in the medieval city of Dmanisi, in the former Soviet Republic of Georgia, show that they left Africa soon thereafter – perhaps reaching east Asia within 100,000 years. From this we can infer that all Homo erectus around the world last shared a common ancestor in Africa nearly 2 million years ago. But according to the Berkeley mitochondrial data, Eve lived in Africa less than 200,000 years ago. How can we reconcile the two results?

It’s all about timing

Let’s step back for a moment and consider the case objectively. The evidence for an African Genesis of Homo erectus is circumstantial – we see evolutionary ‘missing links’ in Africa, either uniquely or first. These include an unbroken chain of ancestral hominids stretching back more than 5 million years to the recently discovered chimpanzee-like ape Ardipithecus. But is this evidence sufficient to conclude that Africa was also the birthplace of our species? Perhaps, but fossils can be misleading. Imagine finding a perfectly preserved Neanderthal skeleton in south-western France, dated accurately to 40,000 years ago, and one of Australopithecus, in Africa, dated to 2 million years before. Of these two extinct hominids, separated by nearly 2 million years in time and by thousands of miles in place, which is actually more likely to be a direct ancestor of modern Europeans? Oddly enough, it is not the obvious choice. As we’ll see later in the book, modern Europeans are almost certainly not the descendants of Neanderthals (despite what you may think of your colleague in the office next door), while the Southern Ape is, surprisingly, more likely to be our direct ancestor. Stones and bones inform our knowledge of the past, but they cannot tell us about our genealogy – only genes can do this.

So, the answer to our question about dates – how to reconcile 200,000 and 2 million – is that Homo erectus, despite its clear similarity to us, did not evolve into modern Homo sapiens independently in the far corners of the earth. Coon was wrong. Rather, the conclusion from the mitochondrial data is that modern humans evolved very recently in Africa, and subsequently spread to populate the rest of the globe, replacing our hominid cousins in the process. It’s a ruthless business, and only the winners leave a genetic trace. Unfortunately, Homo erectus appears to have lost.

As we’ll see, other genetic data corroborates the mitochondrial results, placing the root of the human family tree – our most recent common ancestor – in Africa within the past few hundred thousand years. Consistent with this result, all of the genetic data shows the greatest number of polymorphisms in Africa – there is simply far more variation in that continent than anywhere else. You are more likely to sample extremely divergent genetic lineages within a single African village than you are in the whole of the rest of the world. The majority of the genetic polymorphisms found in our species are found uniquely in Africans – Europeans, Asians and Native Americans carry only a small sample of the extraordinary diversity that can be found in any African village.

Why does diversity indicate greater age? Thinking back to our hypothetical Provencal village, why do the bouillabaisse recipes change? Because in each generation, a daughter decides to modify her soup in a minor way. Over time, these small variations add up to an extraordinary amount of diversity in the village’s kitchens. And – critically – the longer the village has been accumulating these changes, the more diverse it is. It is like a clock, ticking away in units of rosemary and thyme – the longer it has been ticking, the more differences we see. It is the same phenomenon Emile Zuckerkandl noted in his proteins – more time equals more change. So, when we see greater genetic diversity in a particular population, we can infer that the population is older – and this makes Africa the oldest of all.

But does the placement of the root of our family tree in Africa mean that Coon was right, and Africans are frozen in some sort of ancestral evolutionary limbo? Of course not – all of the branches on the family tree change at the same rate, both within and outside of Africa, so there are derived lineages on each continent. That is the reason we see greater diversity within Africa – each branch has continued to evolve, accumulating additional changes. One of the interesting corollaries of inferring a single common ancestor is that each descendant lineage continues to change at the same rate, and therefore all of the lineages are the same age. The time that has elapsed between my mitochondrial DNA type and Eve’s is exactly the same as that of an African cattle herder, or a Thai boat captain, or a Yanomami hunter from Brazil – we are all the recent descendants of a single woman who lived in Africa less than 150,000 years ago.

This result raises the question of where Eve actually lived – where in Africa was the Garden of Eden? In one sense this is a red herring, since we know that there were many women alive all over Africa at this time. But, phrasing the question slightly differently, we can ask which populations in Africa retain the clearest traces of our genetic ancestors. Although the diversity within Africa has not by any means been sampled exhaustively, the picture that has emerged is that the oldest genetic lineages are found in people living in eastern and southern Africa. What we can infer from this is that these populations have maintained a direct mitochondrial link back to Eve, while the rest of us have lost some of these genetic signals along the way. We’ll pursue our search for Eden, using Adam as a guide, in the next chapter.

* Parsimony here is simply the application of methods which infer evolutionary history in such a way as to minimize complexity. It is not necessarily the method known as ‘maximum parsimony’ used by many population geneticists.

* Cavalli-Sforza and Edwards also developed other methods of analysing the relationship between populations on the basis of gene frequencies which rely less on absolute minimization of evolutionary change. Parsimony, however, is still widely used in the field.