Look at the five “words” below, knowing that they were written with an alphabet of 20 letters:
ILDIGDASAQELAEILKNAKTILWNGP
GLDIGPDSVKTFNDALDTTQJIIWNGP
GLDVGPKTRELFAAPIARAKLIVWNGP
GLDCGTESSKKYAEAVARAKQIVWNGP
GLDCGPESSKKYAEAVTRAKQIVWNGP
If I were to tell you the words were typed separately by five different monkeys, would you believe me? Not if you have taken more than a passing glance at them. “All five words end with WNGP,” you would point out to me, “and for monkeys hitting keyboards independently, this cannot be.” Actually it can. But the probability of such a coincidence is one in 655 billion billions. You would need a pretty large number of monkeys for five of them to have a reasonable chance of coming up with the same word ending. Surely, a more likely possibility is that the monkeys cheated. They copied!
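The figure quoted above is easy to check. The following is only a sketch, assuming each monkey strikes every key independently and with equal chance among the 20 letters; the function name `coincidence_odds` is made up for illustration. The first monkey may type any ending at all, and each of the other four must then reproduce its four letters, which gives one chance in 20 to the power 16, or about 6.55 × 10^20.

```python
ALPHABET_SIZE = 20  # number of letters in the alphabet, as stated in the text


def coincidence_odds(n_words: int, n_positions: int,
                     alphabet: int = ALPHABET_SIZE) -> int:
    """Return N such that the chance of the coincidence is 1 in N.

    Assumes independent, uniform typing. The first word is free; each of
    the remaining (n_words - 1) words must match it at every one of the
    n_positions, each match having probability 1/alphabet.
    """
    return alphabet ** (n_positions * (n_words - 1))


# Five words sharing the same four-letter ending.
ending_odds = coincidence_odds(n_words=5, n_positions=4)
print(f"one chance in {ending_odds:,}")
```

The same function applies to any number of words and matching positions, so long as the independence assumption is kept.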
Actually, the fraud is even more flagrant than appears at first sight. If you look more closely, you will see that four other letters, in addition to the terminal four, are the same in all five words (LD in positions 2 and 3, G in position 5, and I in position 22). This lowers the odds of a fortuitous coincidence to one in 429,500 billion billion billion billions. Trillions of planets like ours could not possibly provide enough monkeys. And this is not all. Five other letters are the same in four out of the five words (G in position 1, S in position 8, A in position 13, and AK in positions 19-20). Even more striking, the two last words have 25 out of 27 letters in common; they differ only in positions 6 and 17. There can be no doubt. If monkeys there were, they most certainly did not hit their typewriters’ keys at random.
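Readers who prefer not to count letters by eye can let a short program do it. This is a minimal sketch, not a standard tool: the helper name `column_agreement` is invented here, the five words are transcribed as printed (with letter 19 of the second word read as T), and the odds again assume independent, uniform typing over 20 letters.

```python
from collections import Counter

WORDS = [
    "ILDIGDASAQELAEILKNAKTILWNGP",
    "GLDIGPDSVKTFNDALDTTQJIIWNGP",  # letter 19 read as T
    "GLDVGPKTRELFAAPIARAKLIVWNGP",
    "GLDCGTESSKKYAEAVARAKQIVWNGP",
    "GLDCGPESSKKYAEAVTRAKQIVWNGP",
]


def column_agreement(words):
    """Map each 1-based position to the count of its most common letter."""
    return {
        pos: Counter(column).most_common(1)[0][1]
        for pos, column in enumerate(zip(*words), start=1)
    }


agreement = column_agreement(WORDS)
in_all_five = [pos for pos, count in agreement.items() if count == 5]
in_four_of_five = [pos for pos, count in agreement.items() if count == 4]
print("same in all five words:", in_all_five)
print("same in four of five:  ", in_four_of_five)

# Odds of that many positions agreeing across five independent words:
# the first word is free, the other four must match it at each position.
odds = 20 ** (len(in_all_five) * (len(WORDS) - 1))
print(f"one chance in {odds:.4e}")

# Positions at which the last two words differ.
last_two_differ = [
    pos for pos, (a, b) in enumerate(zip(WORDS[3], WORDS[4]), start=1) if a != b
]
print("last two words differ at:", last_two_differ)
```

With eight positions shared by all five words, the odds come to one in 20 to the power 32, roughly 4.295 × 10^41, which is the figure given above.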
The words shown are not inventions. They represent real things, fragments of molecules called proteins, which are very long chains of up to several hundred units called amino acids, of which 20 different kinds are used in the assembly of the chains. Each word represents the sequence of a 27-amino acid piece (each letter standing for a given kind of amino acid) present somewhere in the heart of a large protein molecule containing more than 400 amino acids. This protein is an enzyme, or biological catalyst, known as phosphoglycerate kinase, PGK for short. PGK is a key participant in one of the most fundamental processes that take place in living organisms, the conversion of sugar to alcohol (or lactic acid), which occurs in virtually all forms of life, whether microbes of various sorts, plants, molds, or animals (including humans).
Now comes the central piece of information, which explains why the words serve as an introduction to this book. The five structures shown belong to the PGKs of five widely different organisms. The first one belongs to Escherichia coli, or colibacillus, a common microbe that we all harbor in our gut. The others are from the wheat, fruitfly, horse, and human PGKs, respectively:
Colibacillus: ILDIGDASAQELAEILKNAKTILWNGP
Wheat: GLDIGPDSVKTFNDALDTTQJIIWNGP
Fruitfly: GLDVGPKTRELFAAPIARAKLIVWNGP
Horse: GLDCGTESSKKYAEAVARAKQIVWNGP
Human: GLDCGPESSKKYAEAVTRAKQIVWNGP
What our monkey parable has brought to light is that the similarities among the PGKs of our sample organisms could not possibly be due to chance. A possibility could be—this, no doubt, would be the “creationist” view—that the similarities betray the intervention of a “hidden hand.” But, in that case, why the differences? Why, for example, does the human sequence differ from the fruitfly sequence in twelve amino acids and from the horse sequence in only two? No, the explanation given above for the monkeys is the correct one. The sequences show similarities because they were copied. And, they show differences because occasional copying mistakes were made. Thus, two mistakes would have been made in the horse and human lineages, twelve in the human (or horse) and fruitfly lineages, since their respective PGKs started being copied separately. Or, as shown graphically:
Make the additional assumption that it took some 40 million years, on average, for one mistake to be made, and you get the following:
This, very roughly, is what paleontologists have long been telling us on the strength of fossil evidence. Humans and horses are derived from a common mammalian ancestor from which they diverged some 80 million years ago. The mammals themselves and the insects (the parent group of fruitflies) separated from a common ancestral form roughly 500 million years ago. What is new is that we can now estimate evolutionary times in terms of copying accidents (mutations) and that we can extend such estimates to lineages that have left no fossil remains. Also, we know how the copying takes place. It does not involve the protein molecules themselves, as suggested for simplicity’s sake; it involves the DNA genes that encode the amino acid sequences of the protein molecules. For the purpose of our argument, it amounts to the same thing.
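The reckoning behind these dates can be spelled out. The sketch below follows the simplification used in this Introduction and nothing more: it counts the positions at which two fragments differ and multiplies by the assumed 40 million years per mistake. The names `differences` and `divergence_myr` are illustrative, and real molecular dating rests on far more data and subtler models than a single 27-letter fragment.

```python
# PGK fragments for three of the five organisms, as listed in the text.
PGK = {
    "fruitfly": "GLDVGPKTRELFAAPIARAKLIVWNGP",
    "horse":    "GLDCGTESSKKYAEAVARAKQIVWNGP",
    "human":    "GLDCGPESSKKYAEAVTRAKQIVWNGP",
}

MYR_PER_MISTAKE = 40  # assumed average, in millions of years, from the text


def differences(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def divergence_myr(a: str, b: str) -> int:
    """Rough divergence time in millions of years under the text's assumption."""
    return differences(a, b) * MYR_PER_MISTAKE


for first, second in [("human", "horse"), ("human", "fruitfly"), ("horse", "fruitfly")]:
    n = differences(PGK[first], PGK[second])
    t = divergence_myr(PGK[first], PGK[second])
    print(f"{first} vs {second}: {n} differences, about {t} million years")
```

Two differences give 80 million years for humans and horses; twelve give 480 million years for mammals and fruitflies, close to the 500 million years read from the fossils.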
More will be said about this fascinating topic in Chapter 7. The main point, for the time being, and the reason for this Introduction, is that there is now overwhelming evidence that all known living beings are descendants through evolution from a single ancestral form of life. Many cogent reasons support this affirmation. Its most convincing proof is provided by the molecular sequencing results.1 Even the very limited data presented in this Introduction should suffice to demonstrate the kinship among the five organisms mentioned (which, it should be noted, include us and the colibacilli of our intestinal tract). All the other available data—and their number is ever increasing—have confirmed this kinship and extended it to every other organism so far investigated. This fact is now so well established that researchers would be overjoyed if even one exception could be found—whether on Earth or elsewhere—because it would point to a second, independent origin for life.