14

The Realm of Life

The most complex things in the Universe – Charles Darwin and nineteenth-century theories of evolution – The role of cells in life – The division of cells – The discovery of chromosomes and their role in heredity – Intracellular pangenesis – Gregor Mendel: father of genetics – The Mendelian laws of inheritance – The study of chromosomes – Nucleic acid – Working towards DNA and RNA – The tetranucleotide hypothesis – The Chargaff rules – The chemistry of life – Covalent bond model and carbon chemistry – The ionic bond – Bragg’s law – Chemistry as a branch of physics – Linus Pauling – The nature of the hydrogen bond – Studies of fibrous proteins – The alpha-helix structure – Francis Crick and James Watson: the model of the DNA double helix – The genetic code – The genetic age of humankind – Humankind is nothing special

The most complex things in the Universe

We are the most complicated things that we know about in the entire Universe. This is because, on the cosmic scale of things, we are middle-sized. As we have seen, small objects, like atoms, are composed of a few simple entities obeying a few simple laws. As we shall see in the next chapter, the entire Universe is so big that the subtleties of even objects as large as stars can be ignored, and the whole cosmos can be treated as a single object made up of a reasonably smooth distribution of mass-energy, again obeying a few very simple laws. But on scales where atoms are able to join together to make molecules, although the laws are still very simple, the number of compounds possible – the number of different ways in which atoms can join together to make molecules – is so great that a huge variety of different things with complicated structures can exist and interact with one another in subtle ways. Life as we know it is a manifestation of this ability for atoms to form a complex variety of large molecules. This complexity starts on the next scale up from atoms, with simple molecules such as water and carbon dioxide; it ends where molecules begin to be crushed out of existence by gravity, once we are dealing with the interiors of objects the size of large planets, and even atoms are entirely stripped of their electrons by the time we are dealing with objects the size of stars.

The exact size of a lump of matter needed to destroy the complexity on which life as we know it depends is determined by the different strengths of the electromagnetic and gravitational forces. The electrical forces that hold molecules together are 10³⁶ times stronger than the gravitational forces that try to crush molecules out of existence in a lump of matter. When atoms are together in a lump of matter there is no overall electric charge, because each atom is electrically neutral. So each atom is essentially on its own when it comes to withstanding gravity through the strength of QED. But the strength of the inward gravitational force on each atom in the lump of matter increases with the addition of every extra atom that is contributed to the lump. The amount of mass in a sphere of constant density is proportional to the cube of the radius, but the strength of the gravitational force falls off in accordance with an inverse square law, so in terms of the radius of a lump of matter, gravity at the surface ‘gains’ on electric forces in accordance with a two-thirds power. This means, since 36 is two-thirds of 54, that when 10⁵⁴ atoms are together in a single lump, gravity dominates and complicated molecules are broken apart.
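The arithmetic behind that last step can be checked in a few lines. The sketch below, in Python, simply takes the two figures quoted above (a ratio of ten to the power 36, and the two-thirds power) as given and solves for the number of atoms at which gravity catches up. The function and constant names are illustrative assumptions, and nothing here derives the two-thirds power itself.

```python
from fractions import Fraction

# Ratio of electric to gravitational strength quoted in the text: 10**36.
FORCE_RATIO_EXPONENT = 36

# Power with which gravity "gains" on electric forces as atoms are added
# (taken from the text as given, not derived here).
GAIN_POWER = Fraction(2, 3)


def critical_atom_exponent(ratio_exponent: int, gain_power: Fraction) -> Fraction:
    """Return x such that (10**x) ** gain_power == 10**ratio_exponent."""
    return Fraction(ratio_exponent) / gain_power


exponent = critical_atom_exponent(FORCE_RATIO_EXPONENT, GAIN_POWER)
print(f"Gravity catches up at about 10^{exponent} atoms")
```

Working with exact fractions rather than floating-point numbers keeps the exponent a whole number, which is all the argument needs.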

Imagine starting out with a set of objects made up of 10 atoms, 100 atoms, 1000 atoms and so on, with each lump containing ten times more atoms than the one before. The twenty-fourth object would be as big as a sugar cube, the twenty-seventh would be about the size of a large mammal, the fifty-fourth would be the size of the planet Jupiter and the fifty-seventh would be about as big as the Sun, where even atoms are destroyed by gravity, leaving a mixture of nuclei and free electrons called a plasma. On this logarithmic scale, people are almost exactly halfway in size between atoms and stars. The thirty-ninth object in our collection would be equivalent to a rock about a kilometre in diameter, and the realm of life forms like ourselves can reasonably be said to be between the sizes of sugar lumps and large rocks. This is more or less the realm investigated by Charles Darwin and his successors in establishing the theory of evolution by natural selection. But the basis for the complexity of life that we see around us on these scales depends on chemical processes going on at a slightly deeper level, where, we now know, DNA is the key component of life. The story of how DNA was identified as the key to life is the second great story of twentieth-century science, and, like the story of quantum physics, it began almost exactly with the dawn of the new century, although in this case there had been a neglected precursor to the new discoveries.
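The claim that people sit roughly halfway between atoms and stars on this scale is easy to test. The following Python sketch assumes a single atom counts as step zero and the Sun-sized lump as step 57, and uses only the landmark sizes quoted above; the names are illustrative.

```python
# Exponents n for lumps of 10**n atoms, with the sizes quoted in the text.
LANDMARKS = {
    24: "sugar cube",
    27: "large mammal",
    39: "rock about a kilometre across",
    54: "Jupiter",
    57: "Sun",
}

ATOM_EXPONENT = 0   # assumption: a single atom is 10**0 atoms
STAR_EXPONENT = 57  # the Sun-sized lump


def log_midpoint(low: int, high: int) -> float:
    """Midpoint of two sizes when both are measured as powers of ten."""
    return (low + high) / 2


midpoint = log_midpoint(ATOM_EXPONENT, STAR_EXPONENT)
nearest = min(LANDMARKS, key=lambda n: abs(n - midpoint))
print(f"midpoint exponent: {midpoint}, nearest landmark: {LANDMARKS[nearest]}")
```

The midpoint falls between the twenty-eighth and twenty-ninth objects, a step or two above the large mammal, which is what ‘almost exactly halfway’ amounts to here.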

Charles Darwin and nineteenth-century theories of evolution

From the time of the great debate stirred by the publication of the Origin of Species in 1859, understanding of the process of evolution by natural selection had at best marked time, and arguably went backwards, during the rest of the nineteenth century. One reason was the problem of the timescale required for evolution, which we have already mentioned, and which was only resolved in the twentieth century by an understanding of radioactivity. But although Darwin (and others) fought the case for the long timescale required by evolution, the strength of the case put forward by the physicists (in particular William Thomson/Lord Kelvin) put even Darwin on the defensive. The other, and even more important, reason was that Darwin and his contemporaries did not understand the mechanism by which characteristics are passed on from one generation to the next – the mechanism of heredity. That, too, would not become clear until well into the twentieth century.

Darwin’s own ideas about heredity were first presented to the world in 1868, in a chapter at the end of his book Variation of Animals and Plants under Domestication; they indicate the way many biologists thought at the time, although Darwin offered the most complete model. He gave it the name ‘pangenesis’, from the Greek ‘pan’, to indicate that every cell in the body contributed, combined with ‘genesis’, to convey the idea of reproduction. His idea was that every cell in the body contributes tiny particles (which he called ‘gemmules’) which are carried through the body and are stored in the reproductive cells, egg or sperm, to be passed on to the next generation. The model also incorporated the idea of blending inheritance, which says that when two individuals combine to produce offspring, the offspring represent a blend of the characteristics of the parents. To modern eyes, it is startling to see Charles Darwin himself promoting this idea, which implies that, for example, the children of a tall woman and a short man should grow up to some intermediate height. This runs completely against the basic tenet of evolution by natural selection, the requirement of variation among individuals to select from, since in a few generations, blending inheritance would produce a uniform population. The fact that Darwin even considered such an idea shows how far biologists were from a true understanding of inheritance at the time. 
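How quickly blending would erase variation can be illustrated with a toy simulation. This is only a sketch under stated assumptions: a hypothetical population of 1000 individuals with heights spread between 150 and 190 centimetres, random mating, and each child receiving exactly the average of its two parents' heights. None of these numbers comes from Darwin.

```python
import random
import statistics


def blend_generation(heights, rng):
    """Pick two parents at random for each child; the child gets their mean height."""
    return [
        (rng.choice(heights) + rng.choice(heights)) / 2
        for _ in range(len(heights))
    ]


def variance_by_generation(generations=10, population=1000, seed=0):
    """Track the spread of heights over successive blended generations."""
    rng = random.Random(seed)
    # Hypothetical starting population: heights in cm drawn uniformly from 150-190.
    heights = [rng.uniform(150.0, 190.0) for _ in range(population)]
    variances = [statistics.pvariance(heights)]
    for _ in range(generations):
        heights = blend_generation(heights, rng)
        variances.append(statistics.pvariance(heights))
    return variances


variances = variance_by_generation()
for generation, variance in enumerate(variances):
    print(f"generation {generation:2d}: variance {variance:8.3f}")
```

Under these assumptions the spread of heights roughly halves with each generation, so after ten generations almost nothing is left for selection to work on.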
It is against this background that we see Darwin’s many revisions of the Origin leaning more and more towards the Lamarckian position, while his opponents argued that evolution could not proceed by the series of tiny steps envisaged in the original version of natural selection, because intermediate forms (such as a proto-giraffe with a neck longer than that of a deer but too short for it to browse on treetops) would not be viable.1 Critics of Darwin, such as the splendidly named Englishman St George Jackson Mivart (1827–1900), suggested that evolution required sudden changes in body plan from one generation to the next, with a deer, in effect, giving birth to a giraffe. But they had no mechanism for this process either (except for the hand of God), and Darwin was at least on the right lines when he highlighted the importance of individual cells in reproduction, and even with his idea that the reproductive cells contain tiny ‘particles’ which carry information from one generation to the next.

The role of cells in life

The division of cells

The role of cells as the fundamental component of living things had only become clear at the end of the 1850s, the same time that Darwin was presenting his theory of evolution by natural selection to a wide audience. The realization was driven largely by improving microscopic instruments and techniques. Matthias Schleiden (1804–1881) proposed in 1838 that all plant tissues are made of cells, and a year later Theodor Schwann (1810–1882) extended this to animals, suggesting that all living things are made up of cells. This led to the idea (suggested by, among others, John Goodsir (1814–1867)) that cells arise only from other cells, by division, and it was this idea that was taken up and developed by Rudolf Virchow (1821–1902) in a book, Die Cellularpathologie, published in 1858. Virchow, then professor of pathology in Berlin, explicitly stated that ‘every cell is derived from a preexisting cell’, and applied this doctrine to his field of medicine, suggesting that disease is no more than the response of a cell (or cells) to abnormal conditions. In particular, he showed that tumours are derived from pre-existing cells in the body. This proved immensely fruitful in many ways, and produced an explosion of interest in the study of the cell; but Virchow put all of his theoretical eggs in one basket and was strongly opposed to the ‘germ’ theory of infection (he also rejected the theory of evolution by natural selection). This means that although he made many important contributions to medicine, served in the Reichstag (where he was an opponent of Otto von Bismarck) and worked on the archeological dig to discover the site of Homer’s Troy in 1879, he made no further direct contribution to the story we have to tell here.

The discovery of chromosomes and their role in heredity

The microscopic techniques available at the time were more than adequate to show the structure of the cell as a bag of watery jelly with a central concentration of material, known as the nucleus. They were indeed so good that in the late 1870s both Hermann Fol (1845–1892) and Oskar Hertwig (1849–1922) independently observed the penetration of the sperm into the egg (they worked with sea urchins, which have the invaluable property of being transparent), with two nuclei fusing to form a single new nucleus, combining material provided by (inherited from) both parents. In 1879, yet another German, Walther Flemming (1843–1915), discovered that the nucleus contains thread-like structures which readily absorb coloured dyes used by microscopists to stain cells and highlight their structure; the threads became known as chromosomes. Flemming and the Belgian Edouard van Beneden (1846–1910) independently observed, in the 1880s, the way in which chromosomes were duplicated and shared between the two daughter cells when a cell divided. August Weismann (1834–1914), working at the University of Freiburg, took up this line of study in the 1880s. It was Weismann who pointed to chromosomes as the carriers of hereditary information, stating that ‘heredity is brought about by the transmission from one generation to another of a substance with a definite chemical and, above all, molecular constitution’.2 He gave this substance the name ‘chromatin’, and spelled out the two kinds of cell division that occur in species like our own. 
During the kind of cell division associated with growth and development, all the chromosomes in a cell are duplicated before the cell divides, so each daughter cell obtains a copy of the original set of chromosomes; during the kind of cell division that produces egg or sperm cells, the amount of chromatin is halved, so that a full set of chromosomes is only restored when two such cells fuse to create the potential for the development of a new individual.3 It was Weismann who showed, by the early years of the twentieth century, that the cells responsible for reproduction are not involved with other processes going on in the body, and the cells that make up the rest of the body are not involved with the manufacture of reproductive cells, so that Darwin’s idea of pangenesis is definitely wrong, and the Lamarckian idea of outside influences from the environment directly causing variations from one generation to the next could be ruled out (not that this stopped the Lamarckians from arguing their case well into the twentieth century). The later discovery that radiation can cause what are now known as mutations, by directly damaging the DNA in the reproductive cells, in no way diminishes the power of Weismann’s argument, since these random changes are almost invariably deleterious, and certainly do not adapt the descendants of the affected organism more closely to their environment.

Intracellular pangenesis

At about the same time that Weismann was probing inside the cell to identify the chemical units that are the carriers of heredity, the Dutch botanist Hugo de Vries (1848–1935) was working with whole plants to gain an insight into the way characteristics are passed on from one generation to the next. In 1889, just seven years after Darwin had died, de Vries published a book, Intracellular Pangenesis, in which he tried to adapt Darwin’s ideas to the picture of how cells worked that was then beginning to emerge. Combining this with observations of how heredity works in plants, he suggested that the characteristics of a species must be made up from a large number of distinct units, each of them due to a single hereditary factor which was passed on from one generation to the next more or less independently of the others. He gave the hereditary factors the name ‘pangens’ (sometimes translated into English as ‘pangenes’), from Darwin’s term pangenesis; after the studies by Weismann (and others) which showed that the whole body is not involved in producing these hereditary factors, the ‘pan’ was quietly dropped, giving us the familiar modern term ‘gene’, first used by the Dane Wilhelm Johannsen, in 1909.

Gregor Mendel: father of genetics

In the 1890s, de Vries carried out a series of plant-breeding experiments in which he carefully recorded the way in which particular characteristics (such as the height of a plant or the colour of its flowers) could be traced down the generations. Similar studies were being carried out at the same time in England by William Bateson (1861–1926), who later coined the term ‘genetics’ to refer to the study of how heredity works. By 1899, de Vries was ready to prepare his work for publication, and while doing so carried out a survey of the scientific literature in order to place his conclusions in their proper context. It was only at this point that he discovered that almost all of the conclusions he had reached about heredity had been published already, in a seldom read, and even less frequently cited, pair of papers by a Moravian monk, Gregor Mendel. The work had actually been described by Mendel in two papers he read to the Natural Science Society in Brünn (as it then was; now Brno, in the Czech Republic) in 1865, and published a year later in its Proceedings. It is easy to imagine de Vries’s feelings when he made this discovery. Perhaps a little disingenuously, he published his own findings in two papers which appeared early in 1900. The first, in French, made no mention of Mendel. But the second, in German, gives almost fulsome credit to his predecessor, commenting that ‘this important monograph is so rarely quoted that I myself did not become acquainted with it until I had concluded most of my experiments, and had independently deduced the above propositions’,4 and summing up:

From this and numerous other experiments I drew the conclusion that the law of segregation of hybrids as discovered by Mendel for peas finds very general application in the plant kingdom and that it has a basic significance for the study of the units of which the species character is composed.

This was clearly an idea whose time had come. In Germany, Karl Correns (1864–1933), working along similar lines, had also recently come across Mendel’s papers, and was preparing his own work for publication when he received a copy of de Vries’ French paper. And in Austria, Erich Tschermak von Seysenegg suffered a similar fate.5 The overall result was that the genetic basis of heredity soon became firmly established, and each of the three rediscoverers of the basic principles involved gave due credit to Mendel as the real discoverer of the laws of heredity. This was certainly true, but the ready acknowledgement of Mendel’s priority should not be seen entirely as an act of selfless generosity – after all, with three people having a claim to the ‘discovery’ in 1900, it suited each of them to acknowledge a now-dead predecessor rather than get into an argument among themselves about who had done the work first. There is, though, an important historical lesson to be drawn from the story. Several people made similar discoveries independently at the end of the 1890s because the time was ripe and the groundwork had been laid by the identification of the nucleus and the discovery of chromosomes. Remember that the nucleus itself was only identified in the same year that the joint paper by Darwin and Wallace was read to the Linnean Society, 1858, while Mendel’s results were published in 1866. It was an inspired piece of work, but it was ahead of its time and made little sense on its own, until people had actually seen the ‘factors of heredity’ inside the cell, and the way in which they were separated and recombined to make new packages of genetic information. 
But although Mendel’s work, as it happens, had no influence at all on the development of biological science in the second half of the nineteenth century, it’s worth taking a brief look at what he did, both to counter some of the misconceptions about the man and to emphasize the really important feature of his work, which is often overlooked.

37. Gregor Mendel

Mendel was not some rural gardener in monk’s habit who got lucky. He was a trained scientist, who knew exactly what he was doing, and was one of the first people to apply the rigorous methods of the physical sciences to biology. Born on 22 July 1822 in Heinzendorf in Moravia (then part of the Austrian Empire) and christened Johann (he took the name Gregor when he joined the priesthood), Mendel was clearly an unusually intelligent child, but came from a poor farming family, which exhausted all its financial resources in sending the bright young man through high school (gymnasium) and a two-year course at the Philosophical Institute in Olmütz, intended to prepare him for university. With university itself beyond his financial means, in 1843 Mendel joined the priesthood as the only means of furthering his education, having been headhunted by the abbot of the monastery of St Thomas, in Brünn. The abbot, Cyrill Franz Napp, was in the process of turning the monastery into a leading intellectual centre where the priests included a botanist, an astronomer, a philosopher and a composer, all with high reputations outside the monastery wall. Abbot Napp was eager to add to his community of thinkers by recruiting bright young men with ability but no other prospects, and was introduced to Mendel by Mendel’s physics professor at Olmütz, who had previously worked in Brünn. Mendel completed his theological studies in 1848, and worked as a supply teacher at the nearby gymnasium and later at the technical college, although because of severe examination nerves he repeatedly failed the examinations that would have regularized his position.

Mendel showed such ability that in 1851, at the age of 29, he was sent to study at the University of Vienna, where Christian Doppler was professor of physics (to put the date in another context relevant to that city, Johann Strauss the younger was 26 in 1851). He was allowed only two years away from the monastery for this privileged opportunity, but crammed into that time studies of experimental physics, statistics and probability, the atomic theory of chemistry and plant physiology, among others. He did not take a degree – that was never the abbot’s intention – but returned to Brünn better equipped than ever for his role as a teacher. But this was not enough to satisfy his thirst for scientific knowledge. In 1856, Mendel began an intensive study of the way heredity works in peas,6 carrying out painstaking and accurate experiments over the next seven years which led him to discover the way heredity works. He had a plot of land 35 metres long and 7 metres wide in the monastery garden, a greenhouse, and all the time he could spare from his teaching and religious duties. He worked with about 28,000 plants, from which 12,835 were subjected to careful examination. Each plant was identified as an individual, and its descendants traced like a human family tree, in marked contrast to the way biologists had previously planted varieties en masse and tried to make sense out of the confusion of hybrids that resulted (or simply studied plants in the wild). Among other things, this meant that Mendel had to pollinate each of his experimental plants by hand, dusting pollen from a single, known individual plant on to the flowers of another single, known individual plant, and keeping careful records of what he had done.

38. A diagram illustrating an aspect of Mendel’s unsung paper on heredity, showing pea plants.

The Mendelian laws of inheritance

The key to Mendel’s work – the point which is often overlooked – is that he worked like a physicist, carrying out repeatable experiments and, most significantly, applying proper statistical methods to the analysis of his results, the way he had been taught in Vienna. What the work showed is that there is something in a plant that determines the properties of its overall form. We might as well call that something by its modern name, gene. The genes come in pairs, so that (in one of the examples studied by Mendel) there is a gene S, which results in smooth seeds, and a gene R, that results in rough seeds, but any individual plant will carry one of the possible combinations SS, RR or RS. Only one of the genes in a pair, though, is expressed in the individual plant (in the ‘phenotype’, as it is known). If the plant carries RR or SS, it has no choice but to use the appropriate gene and produce rough or smooth seeds. But if it carries the combination RS, you might expect half the plants to have rough seeds and half to have smooth seeds. This is not the case. The R is ignored and only the S is expressed in the phenotype. In this case, S is said to be dominant and R is recessive. Mendel worked all this out from the statistics, which in this case start from the observation that when RR plants (that is, plants from a line that always produces rough seeds) are crossed with SS plants (from a line which always produces smooth seeds), all of the offspring are RS and have smooth seeds; but when these hybrids are crossed with one another, 75 per cent of their offspring have smooth seeds and only 25 per cent rough. The reason, of course, is that there are two ways to make RS offspring (RS and SR), which are equivalent. So in this second generation the individuals are evenly distributed among four genotypes, RR, RS, SR and SS, of which only RR will have rough seeds. 
This is just the simplest example (and only looking as far as the ‘grandchildren’ of the original cross, whereas Mendel actually carried the statistics forward to later generations) of the kind of analysis Mendel (and later de Vries, Bateson, Correns, von Seysenegg and many other people) used in their studies. Mendel had shown conclusively that inheritance works not by blending characteristics from the two parents, but by taking individual characteristics from each of them. By the early 1900s, it was clear (from the work of people such as William Sutton at Columbia University) that the genes are carried on the chromosomes, and that chromosomes come in pairs, one inherited from each parent. In the kind of cell division that makes sex cells, these pairs are separated, but only (we now know) after chunks of material have been cut out of the paired chromosomes and swapped between them, making new combinations of genes to pass on to the next generation.
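The four equally likely combinations, and the three-to-one split in seed type, can be enumerated directly. The Python sketch below assumes that each parent passes on one of its two genes with equal probability and that S is dominant, as described above; the function names are illustrative.

```python
from collections import Counter
from itertools import product

DOMINANT = "S"  # smooth seeds; "R" (rough) is recessive


def cross(parent_a: str, parent_b: str) -> Counter:
    """Count offspring genotypes, taking one gene from each parent."""
    return Counter(a + b for a, b in product(parent_a, parent_b))


def phenotype(genotype: str) -> str:
    """A single copy of the dominant gene is enough to give smooth seeds."""
    return "smooth" if DOMINANT in genotype else "rough"


def phenotype_fractions(parent_a: str, parent_b: str) -> dict:
    """Fraction of offspring showing each seed type."""
    genotypes = cross(parent_a, parent_b)
    total = sum(genotypes.values())
    counts = Counter()
    for genotype, count in genotypes.items():
        counts[phenotype(genotype)] += count
    return {name: count / total for name, count in counts.items()}


print(cross("RS", "RS"))                # one each of RR, RS, SR and SS
print(phenotype_fractions("RS", "RS"))  # three-quarters smooth, one-quarter rough
print(phenotype_fractions("RR", "SS"))  # every offspring is a smooth-seeded hybrid
```

Keeping RS and SR as separate entries mirrors the counting argument in the text: they are the same plant as far as the seeds are concerned, but they are two distinct ways of arriving at it.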

Mendel’s discoveries were presented to a largely uncomprehending Natural Science Society in Brünn (few biologists had any understanding of statistics in those days) in 1865, when he was 42 years old. The papers were sent out to other biologists, with whom Mendel corresponded, but their importance was not appreciated at the time. Perhaps Mendel might have promoted his work more vigorously and ensured that it received more attention; but in 1868, Cyrill Franz Napp died, and Gregor Johann Mendel was elected Abbot in his place. His new duties gave him little time for science, and his experimental plant breeding programme was essentially abandoned by his forty-sixth year, although he lived until 6 January 1884.

The rediscovery of the Mendelian laws of inheritance at the beginning of the twentieth century, combined with the identification of chromosomes, provided the keys to understanding how evolution works at the molecular level. The next big step was taken by the American Thomas Hunt Morgan, who was born in Lexington, Kentucky, on 25 September 1866, and who became professor of zoology at Columbia University in 1904. Morgan came from a prominent family – one of his great-grandfathers was Francis Scott Key, who wrote the US National anthem; his father served for a time as US Consul at Messina, in Sicily; and one uncle had been a Colonel in the Confederate army. In an echo of the way Robert Millikan was sceptical about Einstein’s ideas concerning the photoelectric effect (and another shining example of the scientific method at work), Morgan had doubts about the whole business of Mendelian inheritance, which rested on the idea of hypothetical ‘factors’ being passed from one generation to the next. The possibility that these factors might be carried by chromosomes existed, but Morgan was not convinced, and began the series of experiments that would lead to him receiving the Nobel prize (in 1933) in the expectation of proving that the simple laws discovered by Mendel were at best a special case that applied only to a few simple properties of particular plants, and did not have general application to the living world.

The study of chromosomes

The organism Morgan chose to work with was the tiny fruit fly Drosophila. The name means ‘lover of the dew’, but it is actually the fermenting yeast, not dew, that attracts them to rotting fruit. In spite of the obvious difficulties of working with insects rather than plants, Drosophila have one great advantage for students of heredity. Whereas Mendel had to wait for a year to inspect the next generation of peas at each stage in his breeding programme, the little flies (each only an eighth of an inch long) produce a new generation every two weeks, with each female laying hundreds of eggs at a time. It was pure luck, though, that it turned out that Drosophila have only four pairs of chromosomes, which made Morgan’s investigation of how characteristics are passed from one generation to the next much easier than it might have been.7

One pair of those chromosomes has a particular significance, as in all sexually reproducing species. Although the individual chromosomes in most pairs are similar in appearance to one another, in the pair that determines sex there is a distinct difference in the shapes of these chromosomes, and from these shapes they are known as X and Y. You might think that there are three possible combinations that may occur in a particular individual – XX, XY and YY. But in females the cells always carry the XX pair, while in males the combination is XY.8 So a new individual must inherit an X chromosome from its mother, and could inherit either X or Y from its father; if it inherits an X from its father it will be female; if it inherits a Y it will be male. The point of all this is that Morgan found a variety of flies among his Drosophila that had white eyes instead of the usual red eyes. A careful breeding programme and statistical analysis of the results showed that the gene (a term which Morgan soon took up and promoted) affecting the colour of the eyes of the insect must be carried on the X chromosome, and that it was recessive. In males, if the variant gene (the different varieties of a particular gene are known as alleles) was present on the single X chromosome they had white eyes. But in a female, the relevant allele had to be present on both X chromosomes for the white-eye characteristic to show up in the phenotype.
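The pattern Morgan found can be reproduced by listing the possible offspring of a single cross. In this Python sketch the labels ‘W’ (the usual red-eye allele) and ‘w’ (the white-eye allele) are illustrative, and the Y chromosome is assumed to carry no eye-colour gene at all; the cross shown, a red-eyed female carrying one white-eye allele mated with a red-eyed male, is a hypothetical example rather than a record of any particular experiment.

```python
from collections import Counter
from itertools import product

# A sex chromosome is written as (kind, allele): ("X", "W"), ("X", "w") or
# ("Y", None). Assumption: the Y carries no eye-colour gene.


def eye_colour(chromosomes):
    """Red if any X carries the dominant allele, otherwise white."""
    alleles = [allele for kind, allele in chromosomes if kind == "X"]
    return "red" if "W" in alleles else "white"


def sex(chromosomes):
    """XY is male, XX is female."""
    return "male" if any(kind == "Y" for kind, _ in chromosomes) else "female"


def cross(mother, father):
    """Count offspring by sex and eye colour, one chromosome from each parent."""
    outcomes = Counter()
    for from_mother, from_father in product(mother, father):
        child = (from_mother, from_father)
        outcomes[(sex(child), eye_colour(child))] += 1
    return outcomes


carrier_mother = (("X", "W"), ("X", "w"))
red_eyed_father = (("X", "W"), ("Y", None))
print(cross(carrier_mother, red_eyed_father))
```

In this cross every daughter has red eyes, because she always receives a red-eye allele from her father, while half of the sons have white eyes, because their single X comes from their mother.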

This first result encouraged Morgan to continue his work in the second decade of the twentieth century, in collaboration with a team of research students. Their work established that chromosomes carry a collection of genes like beads strung out along a wire, and that during the process that makes sperm or egg cells, paired chromosomes are cut apart and rejoined to make new combinations of alleles. Genes that are far apart in the chromosome are more likely to be separated when this process of crossing over and recombination occurs, while genes that are close together on the chromosome only rarely get separated; this (and a lot of painstaking work) provided the basis for mapping the order of genes along the chromosome. Although a great deal more work of this kind remained to be done, using the improving technology of the later twentieth century, the time when the entire package of Mendelian heredity and genetics finally came of age can be conveniently dated to 1915, when Morgan and his colleagues A. H. Sturtevant, C. B. Bridges and H. J. Muller published their classic book, The Mechanism of Mendelian Heredity. Morgan himself went on to write The Theory of the Gene (1926), moved to Caltech in 1928, received that Nobel prize in 1933 and died at Corona del Mar in California on 4 December 1945.
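The logic of the mapping can be shown with three genes. In the Python sketch below the gene names and the recombination frequencies are invented for illustration; the only principle used is the one stated above, that genes further apart are separated more often, so the pair separated most often must lie at the two ends.

```python
from itertools import combinations

# Hypothetical recombination frequencies: the fraction of offspring in which
# the two genes were separated. Names and values are invented for illustration.
RECOMBINATION = {
    frozenset(("a", "b")): 0.10,
    frozenset(("b", "c")): 0.05,
    frozenset(("a", "c")): 0.15,
}


def order_three_genes(frequencies):
    """Order three genes: the pair separated most often sits at the two ends."""
    genes = sorted(set().union(*frequencies))
    outer = max(
        combinations(genes, 2),
        key=lambda pair: frequencies[frozenset(pair)],
    )
    middle = next(gene for gene in genes if gene not in outer)
    return [outer[0], middle, outer[1]]


print(order_three_genes(RECOMBINATION))
```

A real map involves many more genes and many more crosses, and measured frequencies do not add up as neatly as these invented ones, which is part of the painstaking work the text refers to.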

Evolution by natural selection only works if there is a variety of individuals to select from. So the understanding developed by Morgan and his colleagues, of how the constant reshuffling of the genetic possibilities provided by the process of reproduction encourages diversity, also explains why it is so easy for sexually reproducing species to adapt to changing environmental conditions. Asexual species do evolve, but only much more slowly. In human beings, for example, there are about 30,000 genes that determine the phenotype. Just over 93 per cent of these genes are homozygous, which means that they are the same on each chromosome of the relevant pair, in all human beings. Just under 7 per cent are heterozygous, which means that there is a chance that there are different alleles for that particular gene on the paired chromosomes of an individual person chosen at random. These different alleles have arisen by the process of mutation, of which more later, and sit in the gene pool having little effect unless they confer some advantage on the phenotype (mutations that cause a disadvantage soon disappear; that is what natural selection is all about). With some 2000 pairs of genes which come in at least two varieties (some have more than two alleles), that means that there are 2 to the power of 2000 (2²⁰⁰⁰) ways in which two individual people could be different from one another. This is such a spectacularly large number that even astronomical numbers (like those we will encounter in the next chapter) pale by comparison, and it means not only that no two people on Earth are genetically identical (except for twins who share the same genotype because they come from the same fertilized egg), but that no two people who have ever lived have been exactly the same as each other. That is some indication of the variety on which natural selection operates. 
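To get a feel for the size of that number, it can be written out in full by a computer. The Python sketch below uses only the figures quoted above (30,000 genes, just under 7 per cent of them variable, rounded to 2000) and counts the digits in 2 to the power of 2000; the variable names are illustrative.

```python
GENE_COUNT = 30_000
HETEROZYGOUS_FRACTION = 0.07  # "just under 7 per cent" in the text

# Upper estimate of the number of variable genes, before rounding down.
variable_genes_estimate = round(GENE_COUNT * HETEROZYGOUS_FRACTION)

VARIABLE_GENES = 2000  # the rounded figure used in the text

# Python integers have no fixed size, so the number can be computed exactly.
genotype_combinations = 2 ** VARIABLE_GENES
digits = len(str(genotype_combinations))

print(f"about {variable_genes_estimate} variable genes before rounding")
print(f"2 to the power of {VARIABLE_GENES} has {digits} digits")
```

Written out in full the number runs to just over 600 digits; for comparison, the lump of 10⁵⁷ atoms the size of the Sun corresponds to a number with only 58.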
After 1915, as the nature of chromosomes, sex, recombination and heredity became increasingly clear, the big question was what went on at a deeper level, within the nucleus and within the chromosomes themselves. The way to answer that question would involve the latest developments in quantum physics and chemistry as scientists probed the secrets of life at the molecular level; but the first steps on the path to the double helix of DNA had been taken in a distinctly old-fashioned way almost half a century before.

Nucleic acid

The person who took those first steps was a Swiss biochemist, Friedrich Miescher (1844–1895). His father (also called Friedrich) was professor of anatomy and physiology in Basle from 1837 to 1844, before moving on to Bern, and young Friedrich’s maternal uncle, Wilhelm His (1831–1904), held the same chair from 1857 to 1872. His was a particularly strong influence on his nephew, only 13 years his junior. Miescher studied medicine at Basle before going to the University of Tübingen, where he studied organic chemistry under Felix Hoppe-Seyler (1825–1895) from 1868 to 1869; he then spent a spell in Leipzig before returning to Basle. When His moved the other way in 1872, leaving Basle for Leipzig, his chair was divided into two, one for anatomy and one for physiology; young Miescher got the physiology chair, clearly partly as a result of literal nepotism. He stayed in the post until he died, of tuberculosis, on 16 August 1895, just three days after his fifty-first birthday.

Miescher went to work in Tübingen because he was interested in the structure of the cell (an interest encouraged by his uncle, and very much in the mainstream of biological research at the time); Hoppe-Seyler had not only established the first laboratory devoted to what is now called biochemistry, but was a former assistant of Rudolf Virchow, with a keen interest in how cells worked – remember that Virchow had laid down the doctrine that living cells are created only by other living cells scarcely ten years before Miescher went to Tübingen. After discussing the possibilities for his own first research project with Hoppe-Seyler, Miescher settled on an investigation of the human white blood cells known as leucocytes. These had the great advantage, from a practical if not an aesthetic point of view, of being available in large quantities from the pus-soaked bandages provided by a nearby surgical clinic. Proteins were already known to be the most important structural substances in the body, and the expectation was that the investigation carried out by Miescher would identify the proteins that were involved in the chemistry of the cell, and which were, therefore, the key to life. Overcoming the difficulties of washing the intact cells free from the bandages without damaging them, and then subjecting them to chemical analysis, Miescher soon found that the watery cytoplasm that fills the volume of the cell outside the nucleus is indeed rich in proteins; but further studies showed that there was something else present in the cell as well. After removing all the outer material and collecting large numbers of undamaged nuclei free of cytoplasm (something nobody else had achieved before), Miescher was able to analyse the composition of the nucleus and found that it was significantly different from that of protein. 
This substance, which he called ‘nuclein’, contains a lot of carbon, hydrogen, oxygen and nitrogen, like other organic molecules; but he also found that it contained a significant amount of phosphorus, unlike any protein. By the summer of 1869 Miescher had confirmed that the new substance came from the nuclei of cells and had identified it not only in leucocytes from pus but in cells of yeast, kidney, red blood cells and other tissues.

News of Miescher’s discovery did not create the sensation you might expect – indeed, it was a long time before anybody outside Hoppe-Seyler’s lab learned of it. In the autumn of 1869 Miescher moved on to Leipzig, where he wrote up his discoveries and sent them back to Tübingen to be published in a journal that Hoppe-Seyler edited. Hoppe-Seyler found it hard to believe the results and stalled for time while two of his students carried out experiments to confirm the discovery. Then, in July 1870 the Franco-Prussian war broke out and the general turmoil of the war delayed publication of the journal. The paper eventually appeared in print in the spring of 1871, alongside the work confirming Miescher’s results, and with an accompanying note from Hoppe-Seyler explaining that publication had been delayed due to unforeseen circumstances. Miescher continued his studies of nuclein after he became a professor at Basle, concentrating on the analysis of sperm cells from salmon. The sperm cell is almost all nucleus, with only a trace of cytoplasm, since its sole purpose is to fuse with the nucleus of a more richly endowed egg cell and contribute hereditary material for the next generation. Salmon produce enormous quantities of sperm, growing thin on their journey to the spawning grounds as body tissue is converted into this reproductive material. Indeed, Miescher pointed out that structural proteins from the body must be broken down and converted into sperm in this way, itself an important realization that different parts of the body can be deconstructed and rebuilt in another form. In the course of this work, he found that nuclein was a large molecule which included several acidic groups; the term ‘nucleic acid’ was introduced to refer to the molecules in 1889, by Richard Altmann, one of Miescher’s students. But Miescher died without ever knowing the full importance of what he had discovered.

Like virtually all of his biochemical colleagues, Miescher failed to appreciate that nuclein could be the carrier of hereditary information. They were too close to the molecules to see the overall picture of the cell at work, and thought of these seemingly relatively simple molecules as some kind of structural material, perhaps a scaffolding for more complicated protein structures. But cell biologists, armed with the new staining techniques that revealed chromosomes, could actually see how genetic material was shared out when cells divided, and were much quicker to realize the importance of nuclein. In 1885, Oskar Hertwig wrote that ‘nuclein is the substance responsible not only for fertilization but also for the transmission of hereditary characteristics’,9 while in a book published in 1896,10 the American biologist Edmund Wilson (1856–1939) wrote more fulsomely:

Chromatin is to be regarded as the physical basis of inheritance. Now chromatin is known to be closely similar to, if not identical with, a substance known as nuclein…a tolerably definite chemical compound of nucleic acid (a complex organic acid rich in phosphorus) and albumin. And thus we reach the remarkable conclusion that inheritance may, perhaps, be effected by the physical transmission of a particular compound from parent to offspring.

Working towards DNA and RNA

But it was to be a tortuous road before Wilson’s ‘remarkable conclusion’ was confirmed.

Progress down that road depended on identifying the structure of nuclein, and the basic building blocks of the relevant molecules (though not, as yet, the details of how the building blocks were joined together) were all identified within a few years of Miescher’s death – some even before he died. The building block which gives its name to RNA (and, in modified form, to DNA) is ribose, a sugar whose central structure consists of four carbon atoms linked with an oxygen atom in a pentagonal ring, with other atoms (notably hydrogen–oxygen pairs, OH) attached at the corners. These attachments can be replaced by other molecules, linking the ribose units to them. The second building block, which attaches in just this way, is a molecular group containing phosphorus, and is known as a phosphate group – we now know that these phosphate groups act as links between ribose pentagons in an alternating chain. The third and final building block comes in five varieties, called ‘bases’, known as guanine, adenine, cytosine, thymine and uracil, and usually referred to simply by their initials, as G, A, C, T and U. One base, it was later discovered, is attached to each of the sugar rings in the chain, sticking out at the side. The ribose pentagon gives the overall molecule its name, ribonucleic acid, or RNA; an almost identical type of molecule (only identified in the late 1920s) in which the sugar units each have one less oxygen atom (H where the ribose has OH) is called deoxyribonucleic acid (DNA). The one other difference between RNA and DNA is that although each of them contains only four of the bases, RNA contains G, A, C and U, while DNA contains G, A, C and T. It was this discovery that reinforced the idea that nuclein was nothing more than a structural molecule and held back the development of a proper understanding of its role in heredity.

The tetranucleotide hypothesis

The person who was most responsible for this misunderstanding was the Russian-born American Phoebus Levene (1869–1940), who was a founder member of the Rockefeller Institute in New York in 1905, and stayed there for the rest of his career. He played a leading part in identifying the way the building blocks of RNA are linked together, and was actually the person who eventually identified DNA itself, in 1929; but he made an understandable mistake which, thanks to his prestige and influence as a leading biochemist, had an unfortunately wide influence. When he was born (in the same year that Miescher discovered nuclein), in the little town of Sagor, Levene was given the Jewish name Fishel, which was changed to the Russian Feodor when his family moved to St Petersburg when he was two years old. When the family emigrated to the United States in 1891 to escape the latest anti-Jewish pogroms, he changed this to Phoebus in the mistaken belief that that was the English equivalent; by the time he found out that he should have chosen Theodore, there didn’t seem much point in changing it again. Levene’s understandable mistake resulted from analysis of relatively large amounts of nucleic acid. When this was broken down into its component building blocks for analysis, it turned out to contain almost equal amounts of G, A, C and U (the yeast cells used in this work yielded RNA). This led him to conclude that the nucleic acid was a simple structure made up of four repeating units, linked together in the way we have already described; it even seemed possible that a single molecule of RNA contained just one of each of the four bases. This package of ideas became known as the tetranucleotide hypothesis – but instead of being treated as a hypothesis and tested properly, it was accorded the status of dogma and accepted more or less without question by far too many of Levene’s contemporaries and immediate successors. 
Since proteins were known to be very complicated molecules made up from a large variety of amino acids linked together in different ways, this reinforced the idea that all the important information in the cell was contained in the structure of proteins, and that the nucleic acids simply provided a simple supporting structure that held the proteins in place. There is, after all, very little information in a ‘message’ that contains only one word, GACU, repeated endlessly. Even by the end of the 1920s, though, evidence was beginning to emerge that would lead to an understanding that there is more to nucleic acid than scaffolding. The first hint emerged in 1928, a year before Levene finally identified DNA itself.
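
One informal way to see how little a ‘message’ of endlessly repeated GACU can carry is to ask how far a general-purpose compression routine can squash it, compared with a sequence of the same length in which the four bases appear in no fixed order. The Python sketch below does this with the standard zlib module; the message length, the random seed and the helper name compressed_size are our own choices for illustration, and compressed size is only a crude stand-in for information content.

```python
import random
import zlib

BASES = "GACU"
LENGTH = 4000  # number of bases in each toy "message" (arbitrary choice)

# The tetranucleotide picture: one four-letter word repeated endlessly.
repeated = BASES * (LENGTH // len(BASES))

# A varied sequence of the same length; fixed seed so the run is repeatable.
rng = random.Random(0)
varied = "".join(rng.choice(BASES) for _ in range(LENGTH))

def compressed_size(message: str) -> int:
    """Return the size in bytes of the message after zlib compression."""
    return len(zlib.compress(message.encode("ascii"), 9))

repeated_size = compressed_size(repeated)
varied_size = compressed_size(varied)

print(f"repeated GACU: {repeated_size} bytes after compression")
print(f"varied bases:  {varied_size} bytes after compression")
```

The repeated sequence shrinks to a tiny fraction of the size of the varied one, because once the word GACU has been stated there is nothing more to say.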

The clue came from the work of Fred Griffith (1881–1941), a British microbiologist working as a medical officer for the Ministry of Health, in London. He was investigating the bacteria that cause pneumonia, and had no intention of seeking out any deep truths about heredity. But just as fruit flies breed faster than pea plants and therefore can, under the right circumstances, show how inheritance works more quickly, so microorganisms such as bacteria reproduce more swiftly than fruit flies, going through many generations in a matter of hours, and can show in a matter of weeks the kind of changes that would only be revealed by many years of work with Drosophila. Griffith had discovered that there are two kinds of pneumococci bacteria, one which was virulent and caused a disease that was often fatal, the other producing little or no ill effects. In experiments with mice aimed at finding information which might help the treatment of pneumonia in people, Griffith found that the dangerous form of the pneumococci could be killed by heat, and that these dead bacteria could be injected into mice with no ill effects. But when the dead bacteria were mixed with bacteria from the non-lethal variety of pneumococci, the mixture was almost as virulent to the mice as the pure strain of live virulent pneumococci. Griffith himself did not discover how this had happened, and he died before the true importance of his work became clear (he was killed in an air raid during the Blitz), but this discovery triggered a change of direction by the American microbiologist Oswald Avery (1877–1955), who had been working on pneumonia full time at the Rockefeller Institute in New York since 1913.

During the 1930s, and on into the 1940s, Avery and his team investigated the way in which one form of pneumococci could be transformed into another form in a series of long, cautious and careful experiments. They first repeated Griffith’s experiments, then found that simply growing a colony of non-lethal pneumococci in a standard glass dish (a Petri dish) which also contained dead, heat-treated cells from the virulent strain was sufficient to transform the growing colony into the virulent form. Something was passing from the dead cells into the living pneumococci, being incorporated into their genetic structure and transforming them. But what? The next step was to break cells apart by alternately freezing and heating them, then use a centrifuge to separate out the solid and liquid debris that resulted. It turned out that the transforming agent, whatever it was, was in the liquid fraction, not the insoluble solids, narrowing down the focus of the search. All of this work kept various people in Avery’s lab busy until the mid-1930s. It was at this point that Avery, who had previously overseen the work in his lab but not been directly involved in these experiments, decided on an all-out attack to identify the transforming agent, which he carried out with the aid of two younger researchers, the Canadian-born Colin MacLeod (1909–1972) and, from 1940, Maclyn McCarty (1911–), from South Bend, Indiana.

Partly because of Avery’s insistence on painstaking attention to detail, partly because of the disruption caused by the Second World War and partly because what they found was so surprising that it seemed hard to believe,11 it took until 1944 for Avery, MacLeod and McCarty to produce their definitive paper identifying the chemical substance responsible for the transformation that had first been observed by Griffith in 1928. They proved that the transforming substance was DNA – not, as had been widely assumed, a protein. But even in that 1944 paper, they did not go so far as to identify DNA with the genetic material, although Avery, now 67 (a remarkable age for someone involved in such a fundamental piece of scientific research), did speculate along those lines to his brother Roy.12

The Chargaff rules

The implications, though, were clear for those who had eyes to see, and in another passing on of the torch, the publication of the Avery, MacLeod and McCarty paper in 1944 stimulated the next key step, which was taken by Erwin Chargaff (1905–). Chargaff was born in Vienna, where he gained his PhD in 1928, the year of Griffith’s discovery, spent two years at Yale, then returned to Europe, where he worked in Berlin and Paris, before settling permanently in America in 1935 and spending the rest of his career at Columbia University. Accepting the evidence that DNA could convey genetic information, Chargaff realized that DNA molecules must come in a great variety of types, with a more complicated internal structure than had previously been appreciated. Using the new techniques of paper chromatography (familiar in its simplest incarnation from schoolday experiments where inks are spread out into their component colours as they travel at different speeds through blotting paper) and ultraviolet spectroscopy, Chargaff and his colleagues were able to show that although the composition of DNA is the same within each species they studied, it is different in detail from one species to the next (although still, of course, DNA). He suggested that there must be as many different kinds of DNA as there are species. But as well as this variety on the large scale, he also found that there is a degree of uniformity underlying this complexity of DNA molecules. The four different bases found in molecules of DNA come in two varieties. Guanine and adenine are each members of a chemical family known as purines, while cytosine and thymine are both pyrimidines. What became known as the Chargaff rules were published by him in 1950. They said that, first, the total amount of purine present in a sample of DNA (G + A) is always equal to the total amount of pyrimidine present (C + T); second, the amount of A is the same as the amount of T, while the amount of G is the same as the amount of C. 
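
The two rules amount to simple bookkeeping on the measured amounts of the four bases, and can be written out as a short check in Python. This is a sketch only: the function name chargaff_check and the 5 per cent tolerance are our own assumptions, and the sample compositions are invented for illustration, not taken from Chargaff’s measurements.

```python
import math

def chargaff_check(amounts, rel_tol=0.05):
    """Test the Chargaff rules on measured amounts of A, T, G and C.

    `amounts` maps each base letter to its measured quantity (any units).
    `rel_tol` is an assumed allowance for experimental error.
    """
    a, t, g, c = (amounts[base] for base in "ATGC")
    return {
        "purines equal pyrimidines": math.isclose(g + a, c + t, rel_tol=rel_tol),
        "A equals T": math.isclose(a, t, rel_tol=rel_tol),
        "G equals C": math.isclose(g, c, rel_tol=rel_tol),
    }

# Invented compositions (per cent of bases) for two hypothetical species.
species_one = {"A": 30.0, "T": 30.0, "G": 20.0, "C": 20.0}
species_two = {"A": 21.0, "T": 21.0, "G": 29.0, "C": 29.0}

# An invented composition that breaks the rules, for comparison.
rule_breaker = {"A": 40.0, "T": 20.0, "G": 25.0, "C": 15.0}

for label, sample in [("species one", species_one),
                      ("species two", species_two),
                      ("rule breaker", rule_breaker)]:
    print(label, chargaff_check(sample))
```

The first two samples differ from one another, as the DNA of different species does, yet both satisfy the rules; the third satisfies none of them.
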
These rules are a key to understanding the famous double-helix structure of DNA. But in order to appreciate just how this structure is held together, we need to take stock of the developments in chemistry that followed the quantum revolution.

The chemistry of life

Starting out with the work of Niels Bohr, and culminating in the 1920s, quantum physics was able to explain the patterns found in the periodic table of the elements and give insight into why some atoms like to link up with other atoms to make molecules, while some do not. The details of the models depend on calculations of the way energy is distributed among the electrons in an atom, which is always in such a way as to minimize the overall energy of the atom, unless the atom has been excited by some energetic influence from outside. We do not need to go into the details here, but can jump straight to the conclusions, which were clear even on Bohr’s model of the atom, although they became more securely founded with the developments of the 1920s. The most important difference is that where Bohr originally thought of electrons as like tiny, hard particles, the full quantum theory sees them as spread-out entities, so that even a single electron can surround an atomic nucleus, like a wave.

The quantum properties of electrons only allow certain numbers of electrons to occupy each energy level in an atom, and although it is not strictly accurate, you can think of these as corresponding to different orbits around the nucleus. These energy states are sometimes known to chemists as ‘shells’, and although several electrons may occupy a single shell, you should envisage each individual electron as being spread out over the entire volume of the shell. It turns out that full shells, in the sense that they have the maximum number of electrons allowed, are energetically favoured over partly full shells. Whatever element we are dealing with, the lowest energy state for the individual atoms (the shell ‘nearest the nucleus’) has room for just two electrons in it. The next shell has room for eight electrons, and so has the third shell, although we then run into complications which are beyond the scope of the present book. A hydrogen atom has a single proton in its nucleus and, therefore, a single electron in its only occupied shell. Energetically, this is not so desirable a state as having two electrons in the shell, and hydrogen can achieve at least a kind of halfway house to this desirable state by linking up with other atoms in such a way that it gets at least a share of a second electron. In, for example, molecules of hydrogen (H2) one electron from each atom contributes to a pair shared by, and surrounding, both nuclei, giving an illusion of a full shell. But helium, with two electrons in its only occupied shell, is in a very favourable energetic state, atomic nirvana, and does not react with anything.

Moving up the ladder of complexity, lithium, the next element, has three protons in its nucleus (plus, usually, four neutrons) and therefore three electrons in its cloud. Two of these slot into the first shell, leaving one to occupy the next shell on its own. The most obvious feature of an atom to another atom, determining its chemical properties, is the outermost occupied shell – in this case, the single electron in the outermost occupied shell – which is why lithium, eager to give away a share in this lone electron in a way we describe shortly, is highly reactive and has similar chemical properties to hydrogen. The number of protons in a nucleus is the atomic number of that particular element. Adding protons to the nucleus and electrons to the second shell (and ignoring the neutrons, which play essentially no part in chemistry at this level) takes us up to neon, which has ten protons and ten electrons in all, two in the innermost shell and eight in the second shell. Like helium, neon is an inert gas – and by now you can see where the repeating pattern of chemical properties for elements eight units apart in the periodic table comes from. Just one more example will suffice. Adding yet another proton and electron takes us up from neon to sodium, which has two closed inner shells and a single electron out on its own; and sodium, with atomic number 11, has similar chemical properties to lithium, with atomic number 3.
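
The filling of the shells described above can be sketched in a few lines of Python. The sketch assumes only the simple picture used here, with room for two, eight and eight electrons in the first three shells, so it is not to be trusted beyond atomic number 18; the function name shell_occupancy is our own.

```python
SHELL_CAPACITIES = (2, 8, 8)  # first three shells only, as in the simple picture

def shell_occupancy(atomic_number):
    """Return the number of electrons in each occupied shell, innermost first."""
    if not 1 <= atomic_number <= sum(SHELL_CAPACITIES):
        raise ValueError("this simple picture only covers atomic numbers 1 to 18")
    remaining = atomic_number  # a neutral atom has one electron per proton
    shells = []
    for capacity in SHELL_CAPACITIES:
        if remaining == 0:
            break
        filled = min(capacity, remaining)
        shells.append(filled)
        remaining -= filled
    return shells

ELEMENTS = {"hydrogen": 1, "helium": 2, "lithium": 3, "neon": 10, "sodium": 11}

for name, atomic_number in ELEMENTS.items():
    shells = shell_occupancy(atomic_number)
    print(f"{name:8s} Z={atomic_number:2d} shells={shells} outermost={shells[-1]}")
```

Hydrogen, lithium and sodium all come out with a single electron in the outermost occupied shell, while helium and neon come out with every occupied shell full, which is the repeating pattern in miniature.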

Covalent bond model and carbon chemistry

The idea of bonds forming between atoms as they shared pairs of electrons to complete effectively closed shells was developed, initially on a qualitative basis, by the American Gilbert Lewis (1875–1946), in 1916. It is known as the covalent bond model, and is particularly important in describing the carbon chemistry that lies at the heart of life, as the simplest example shows. Carbon has six protons in its nucleus (and six neutrons, as it happens), plus six electrons in its cloud. Two of these electrons, as usual, sit in the innermost shell, leaving four to occupy the second shell – exactly half the number required to make a full shell. Each of these four electrons can pair up with the electron offered by an atom of hydrogen, so that a molecule of methane (CH4) is formed in which the carbon atom in the middle has the illusion of a full shell of eight electrons and each of the four hydrogen atoms on the outside has the illusion of a full shell of two electrons. If there were five electrons in the outer shell, the central atom would only need to make three bonds to complete its set; if it only had three electrons, it could only make three bonds, however much it might ‘want’ to make five. Four bonds is the maximum any atom can make,13 and bonds are stronger for shells closer to the central nucleus, which is why carbon is the compound maker par excellence. Replace one or more of the hydrogen atoms with something more exotic – including, perhaps, other carbon atoms or phosphate groups – and you begin to see why carbon chemistry has so much potential to produce a wide variety of complex molecules.
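
The counting rule in this paragraph can be put in one line: an atom can offer no more electrons than it has in its outer shell, and needs no more than it takes to fill that shell. The Python sketch below encodes just that rule of thumb; the function name covalent_bonds is our own, and the rule is the simplified one used here, not a general account of chemical valence.

```python
def covalent_bonds(outer_electrons, shell_capacity=8):
    """Number of shared-pair bonds in the simple shell-filling picture.

    The atom can share at most the electrons it has, and needs at most
    the number missing from a full shell; the smaller figure wins.
    """
    if not 0 <= outer_electrons <= shell_capacity:
        raise ValueError("outer_electrons must lie between 0 and shell_capacity")
    missing = shell_capacity - outer_electrons
    return min(outer_electrons, missing)

print("carbon (4 outer electrons):", covalent_bonds(4))
print("5 outer electrons:         ", covalent_bonds(5))
print("3 outer electrons:         ", covalent_bonds(3))
# Hydrogen and helium use the innermost shell, which holds only two electrons.
print("hydrogen:                  ", covalent_bonds(1, shell_capacity=2))
print("helium:                    ", covalent_bonds(2, shell_capacity=2))
```

Four outer electrons give four bonds, five or three give three, hydrogen makes one bond and helium none, matching the cases described in the text.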

The ionic bond

There is, though, another way in which atoms can form bonds, which brings us back to lithium and sodium. They can both form bonds in this way, but we shall use sodium as an example, because this kind of bond is found in a very common everyday substance – common salt, NaCl. The bond is known as an ionic bond, and the idea was developed by several people as the nineteenth century turned into the twentieth century, although most of the credit for the basis of the idea probably belongs to the Swede Svante Arrhenius (1859–1927), who received the Nobel prize for his work on ions in solution in 1903. Sodium, as we have seen, has two full inner shells and a single electron out on its own. If it could get rid of that lone electron, it would be left with an arrangement of electrons similar to that of neon (not quite identical to neon, because the extra proton in the sodium nucleus means that it holds on to the electrons a tiny bit more tightly), which is favoured energetically. Chlorine, on the other hand, has no fewer than 17 electrons in its cloud (and, of course, 17 protons in its nucleus), arranged in two full shells and a third shell of seven electrons, with that one ‘hole’ where another electron could fit. If a sodium atom gives up an electron completely to a chlorine atom, both of them achieve nirvana, but at the cost of being left with an overall electric charge – positive for the sodium, negative for the chlorine. The resulting ions of sodium and chlorine are held together by electric forces in a crystalline array, which is rather like a single huge molecule – molecules of NaCl do not exist as independent units in the way molecules of H2 or CH4 do.

In quantum physics, though, things are seldom as clear cut and straightforward as we would like them to be, and chemical bonds are best thought of as a mixture of these two processes at work, with some more covalent but with a mixture of ionic, some more ionic with a mixture of covalent and some more or less 50:50 (even in molecules of hydrogen, you can envisage one hydrogen atom giving up its electron completely to the other). But all of these images are no more (and no less) than crutches for our imagination. What matters is that the energies involved can be calculated, with great accuracy. Indeed, within a year of Schrödinger publishing his quantum mechanical wave equation, and just a year before Griffith’s key work with pneumococci, in 1927 two German physicists, Walter Heitler (1904–1981) and Fritz London (1900–1954), had used this mathematical approach to calculate the change in overall energy when two hydrogen atoms, each with its own single electron, combine to form one molecule of hydrogen with a pair of shared electrons. The change in energy that they calculated very closely matched the amount of energy which chemists already knew, from experiment, was required to break the bonding between the atoms in a hydrogen molecule. Later calculations, made as the quantum theory was improved, gave even better agreement with experiment. The calculations showed that there was no arbitrariness in the arrangement of electrons in atoms and atoms in molecules, but that the arrangements which are most stable in the atoms and molecules are always the arrangements with the least energy. This was crucially important in making chemistry a quantitative science right down at the molecular level; but the success of this approach was also one of the first, and most powerful, pieces of evidence that quantum physics applies in general, and in a very precise way, to the atomic world, not just to isolated special cases like the diffraction of electrons by crystals.

The person who put all of the pieces together and made chemistry a branch of physics was the American Linus Pauling (1901–1994). He was another of those scientists who was the right man, in the right place, at the right time. He gained his first degree, in chemical engineering, from Oregon State Agricultural College (a forerunner of Oregon State University) in 1922, then studied for a PhD in physical chemistry at Caltech; he was awarded this degree in 1925, the year that Louis de Broglie’s ideas about electron waves began to gain attention. For the next two years, exactly during the time that quantum mechanics was being established, Pauling visited Europe on a Guggenheim Fellowship. He worked for a few months in Munich, then in Copenhagen at the Institute headed by Niels Bohr, spent some time with Erwin Schrödinger in Zürich and visited William Bragg’s laboratory in London.

Bragg, and in particular his son Lawrence, are also key figures in the story of the discovery of the structure of DNA. The elder Bragg, William Henry, lived from 1862 to 1942, and is always known as William Bragg. He graduated from Cambridge University in 1884, and after a year working with J. J. Thomson moved to the University of Adelaide, in Australia, where his son William Lawrence (always known as Lawrence Bragg) was born. He worked on alpha rays and X-rays, and after returning to England in 1909, working at Leeds University until 1915, then moving to University College London, he developed the first X-ray spectrometer, to measure the wavelength of X-rays. In 1923 he was appointed Director of the Royal Institution, reviving it as a centre of research and establishing the laboratory which Pauling visited a few years later. It was William Bragg who first had the dream of using X-ray diffraction to determine the structure of complex organic molecules, although the technology available to him in the 1920s was not yet up to this task.

Lawrence Bragg (1890–1971) studied mathematics at the University of Adelaide (graduating in 1908) then moved to Cambridge, where he initially continued with mathematics but switched to physics in 1910 at his father’s suggestion, graduating in 1912. So Lawrence was just starting as a research student in Cambridge and William was a professor in Leeds when news came from Germany in 1912 that Max von Laue (1879–1960), working at the University of Munich, had observed the diffraction of X-rays by crystals.14 This is exactly equivalent to the way light is diffracted in the double-slit experiment, but because the wavelengths of X-rays are much shorter than those of light, the spacing between the ‘slits’ has to be much smaller; it turns out that the spacing between layers of atoms in a crystal is just right to do the job. This work established that X-rays are indeed a form of electromagnetic wave, like light but with shorter wavelengths; the importance of the breakthrough can be gauged by the fact that von Laue received the Nobel prize for the work just two years later, in 1914.

Bragg’s law

Chemistry as a branch of physics

Von Laue’s team had found what were certainly complicated diffraction patterns, but were not immediately able to work out details of how the patterns related to the structure of the crystals the X-rays were diffracting from. The Braggs discussed the new discoveries with one another, and each worked on different aspects of the problem. It was Lawrence Bragg who worked out the rules which made it possible to predict exactly where the bright spots in a diffraction pattern would be produced when a beam of X-rays with a particular wavelength struck a crystal lattice with a particular spacing between atoms at a particular angle. Almost as soon as X-ray diffraction had been discovered, it was established that it could be used to probe the structure of crystals, once the wavelengths involved had been measured (which is where William Bragg’s spectrometer, built in 1913, would come in). The relationship Lawrence came up with soon became known as Bragg’s law, and it made it possible to work in either direction – by measuring the spacing of the bright spots in the pattern you could determine the wavelength of the X-rays if you knew the spacing of atoms in the crystal, and once you knew the wavelength of the X-rays you could use the same technique to measure the spacing between atoms in a crystal, although interpreting the data soon became horribly complicated for complex organic structures. It was this work which showed that in substances such as sodium chloride there are no individual molecules (NaCl), but an array of sodium ions and chlorine ions arranged in a geometric pattern. The two Braggs worked together, and published together, over the next couple of years, producing the book X Rays and Crystal Structure in 1915 – just twenty years after the discovery of X-rays. 
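
In its usual modern form, which is not spelled out above, Bragg’s law says that bright spots appear when n × wavelength = 2 × spacing × sin(angle), where the angle is measured between the incoming beam and the layers of atoms and n is a whole number. The Python sketch below shows how the relationship can be worked in either direction; the function names are our own, and the wavelength, spacing and angle are illustrative values of the right general size, not measurements reported by the Braggs.

```python
import math

def bragg_wavelength(spacing, angle_deg, order=1):
    """Wavelength from a known layer spacing and the angle of a bright spot."""
    return 2 * spacing * math.sin(math.radians(angle_deg)) / order

def bragg_spacing(wavelength, angle_deg, order=1):
    """Layer spacing from a known wavelength and the angle of a bright spot."""
    return order * wavelength / (2 * math.sin(math.radians(angle_deg)))

# Illustrative values only (lengths in nanometres, angle in degrees).
known_spacing = 0.282   # assumed spacing between layers of atoms
observed_angle = 15.0   # assumed angle of a first-order bright spot

# Direction one: known crystal, unknown X-rays.
wavelength = bragg_wavelength(known_spacing, observed_angle)
print(f"wavelength: {wavelength:.4f} nm")

# Direction two: known X-rays, unknown crystal.
spacing = bragg_spacing(wavelength, observed_angle)
print(f"spacing:    {spacing:.4f} nm")
```

Feeding the calculated wavelength back in recovers the spacing we started with, which is all that ‘working in either direction’ means; the hard part, as the text says, is interpreting the pattern of spots from a complicated structure in the first place.
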
The previous year, Lawrence had become a Fellow of Trinity College, but his academic career was interrupted by war service as a technical adviser to the British Army in France; it was while he was there that he learned, in 1915, that he and his father had received the Nobel prize for their work. Lawrence was the youngest person (at 25) to receive the award, and the Braggs are the only father-and-son team to have shared the award for their joint work. In 1919, Lawrence Bragg became professor of physics at Manchester University, and in 1938 he succeeded Rutherford as head of the Cavendish Lab, where he will soon come back into the story of the double helix. When he left Cambridge in 1954, he too became Director of the RI, until he retired in 1966.
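
The two-way use of Bragg’s law described above can be sketched in a few lines of code. The law itself is not spelled out in the text; in its usual form it reads n × wavelength = 2 × spacing × sin(angle), where the angle is measured between the incoming beam and the layers of atoms. The function names and the numerical values below are illustrative assumptions, chosen only to be of roughly the right size for X-rays and a simple salt crystal.

```python
import math


def bragg_angle(wavelength_nm, spacing_nm, order=1):
    """Glancing angle (degrees) at which a bright spot appears.

    Uses the standard form of Bragg's law: n * wavelength = 2 * d * sin(theta).
    """
    s = order * wavelength_nm / (2.0 * spacing_nm)
    if not 0.0 < s <= 1.0:
        raise ValueError("no reflection is possible for these values")
    return math.degrees(math.asin(s))


def wavelength_from_angle(angle_deg, spacing_nm, order=1):
    """Known crystal spacing + measured angle -> X-ray wavelength."""
    return 2.0 * spacing_nm * math.sin(math.radians(angle_deg)) / order


def spacing_from_angle(angle_deg, wavelength_nm, order=1):
    """Known wavelength + measured angle -> spacing between layers of atoms."""
    return order * wavelength_nm / (2.0 * math.sin(math.radians(angle_deg)))


if __name__ == "__main__":
    # Illustrative (assumed) values: 0.154 nm X-rays, 0.282 nm layer spacing.
    theta = bragg_angle(0.154, 0.282)
    print(f"first-order bright spot at about {theta:.1f} degrees")
    # Work in either direction from the same measurement.
    print(f"wavelength recovered: {wavelength_from_angle(theta, 0.282):.3f} nm")
    print(f"spacing recovered:    {spacing_from_angle(theta, 0.154):.3f} nm")
```

The same measured angle serves both purposes: with a crystal of known spacing it calibrates the X-rays, and with calibrated X-rays it measures an unknown crystal.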

Linus Pauling

Pauling had learned about X-ray crystallography as a student, largely from the book written by William and Lawrence Bragg, and had carried out his own first determination of a crystal structure using the technique in 1922 (the crystal was molybdenite). When he returned to the United States, taking up a post at Caltech in 1927 and becoming a full professor there in 1931, he had all the up-to-date ideas about X-ray crystallography at his fingertips, and soon developed a set of rules for interpreting the X-ray diffraction patterns from more complicated crystals. Lawrence Bragg developed essentially the same set of rules at the same time, but Pauling published first, rather to Bragg’s chagrin, and to this day the expressions are known as Pauling’s rules. This established a rivalry between Pauling and Bragg that was to last into the 1950s, and would play a part in the discovery of the structure of DNA.

At this time, however, Pauling’s main interest was in the structure of the chemical bond, which he explained in quantum mechanical terms over the next seven years or so. As early as 1931, following another visit to Europe imbibing the new ideas in quantum physics, he produced a great paper, ‘The Nature of the Chemical Bond’, which was published in the Journal of the American Chemical Society; this laid all the groundwork. It was followed by six more papers elaborating on the theme over the next two years, then by a book rounding everything up. ‘By 1935,’ Pauling later commented, ‘I felt that I had an essentially complete understanding of the nature of the chemical bond.’15 The obvious thing to do was to move on to using this understanding to elucidate the structure of complex organic molecules, such as proteins (remember that DNA was still not regarded as a very complex molecule in the mid-1930s). These structures yielded to a two-pronged investigation – chemistry and an understanding of the chemical bond told people like Pauling how the subunits of the big molecules were allowed to fit together (in the case of proteins, the subunits are amino acids), while X-ray crystallography told them about the overall shapes of the molecules. Only certain arrangements of the subunits were allowed by chemistry, and only certain arrangements of the subunits could produce the observed diffraction patterns. Combining both pieces of information with model building (sometimes as simple as pieces of paper cut into the shapes of the molecular subunits and pushed around like pieces of a jigsaw puzzle, sometimes more complicated modelling in three dimensions) eliminated many impossible alternatives and eventually, after a lot of hard work, began to reveal the structures of the molecules important to life. 
An enormous amount of work by researchers such as Pauling himself, Desmond Bernal (1901–1971), Dorothy Hodgkin (1910–1994), William Astbury (1889–1961), John Kendrew (1917–1997), Max Perutz (1914–2002) and Lawrence Bragg enabled biochemists to determine, over the next four decades, the structure of many biomolecules, including haemoglobin, insulin and the muscle protein myoglobin. The importance of this work, both in terms of scientific knowledge and in terms of the implications for improved human healthcare, scarcely needs pointing out; but, like medicine itself, the full story is not one we can go into here. The thread we want to pick up, leading on to the determination of the structure of DNA, is the investigation of the structure of certain proteins by Pauling and by his British rivals; but before we do so there is one more piece of quantum chemistry that needs to be mentioned.

The nature of the hydrogen bond

The existence of so-called ‘hydrogen bonds’ highlights the importance of quantum physics to chemistry – in particular, the chemistry of life – and brings home the way in which the quantum world differs from our everyday world. Chemists already knew that under certain circumstances it is possible to form links between molecules that involve a hydrogen atom as a kind of bridge. Pauling wrote about this hydrogen bond, which is weaker than a normal covalent or ionic bond, as early as 1928, and returned to the theme in the 1930s, first in the context of ice (where hydrogen bonds form bridges between water molecules) and then, with his colleague Alfred Mirsky, applied the idea to proteins. The explanation of hydrogen bonding requires you to think of the single electron associated with the proton in a hydrogen atom as smeared out in a cloud of electric charge, not as a tiny billiard ball. When the hydrogen atom is involved in forming a conventional bond with an atom such as oxygen, which strongly attracts its electron, the cloud of charge is pulled towards the other atom, leaving only a thin covering of negative charge on the other side of the hydrogen atom. Unlike all other chemically reactive atoms (helium is not chemically reactive), hydrogen has no other electrons in inner shells to help conceal the positive charge on its proton, so some of the positive charge is ‘visible’ to any other nearby atoms or molecules. This will attract any nearby atom which has a preponderance of negative charge – such as an oxygen atom in a water molecule, which has gained extra negative charge from its two hydrogen atoms. In water molecules, the positive charge on each of the two hydrogen atoms can link in this way with the electron cloud on another water molecule (one for each hydrogen atom), which is what gives ice a very open crystalline structure, with such a low density that it floats on water. 
The value of Pauling’s work on ice is that, once again, he put numbers into all of this, calculating the energies involved16 and showing that they matched the values revealed in experiments. In his hands, the idea of the hydrogen bond became precise, quantitative science, not a vague, qualitative idea. In proteins, as Pauling and Mirsky demonstrated in the mid-1930s, when long-chain protein molecules fold up into compact shapes (not unlike the way the toy known as Rubik’s snake folds up into compact shapes), they are held in those shapes by hydrogen bonds which operate between different parts of the same protein chain. This was a key insight, since the shape of a protein molecule is vital to its activity in the machinery of the cell. And all this thanks to a phenomenon, the hydrogen bond, which simply cannot be explained properly except in terms of quantum physics. It is no coincidence that our understanding of the molecular basis of life came after our understanding of the rules of quantum mechanics, and once again we see science progressing by evolution, not revolution.

Studies of fibrous proteins

The first great triumph to result from the combination of a theoretical understanding of how the subunits of proteins can fit together and the X-ray diffraction patterns produced by the whole molecules (actually by many whole molecules side by side in a sample), came with the determination of the basic structure of a whole family of proteins, the fibrous kind found in hair, wool and fingernails, at the beginning of the 1950s. What was to be a long road to that triumph began when William Astbury was working in William Bragg’s group of crystallographers at the Royal Institution in London in the 1920s. It was here that Astbury started his work on biological macromolecules with X-ray diffraction studies of some of these fibres, providing the first X-ray diffraction pictures of fibrous protein and continuing this line of research after he moved to the University of Leeds in 1928. In the 1930s, he came up with a model for the structure of these proteins that was actually incorrect, but it was Astbury who showed that globular protein molecules (such as haemoglobin and myoglobin) are made up of long-chain proteins (polypeptide chains) that are folded up to make balls.

The alpha-helix structure

Pauling came into the story in the late 1930s, and later recalled how he ‘spent the summer of 1937 in an effort to find a way of coiling a polypeptide chain in three dimensions, compatible with the X-ray data reported by Astbury’.17 But it would take much longer than a single summer to solve the problem. Fibrous proteins looked more promising, but with the Second World War intervening, it was in the late 1940s that both Pauling and his colleagues at Caltech (notably Robert Corey) and Lawrence Bragg (by now head of the Cavendish) and his team in Cambridge closed in on the solution. Bragg’s group published first, in 1950 – but it soon turned out that their model was flawed, even though it contained a great deal of the truth. Pauling’s team came up with the correct solution in 1951, identifying the basic structure of fibrous protein as being made up of long polypeptide chains wound round one another in a helical fashion, like the strands of string that are wound together to make a rope, with hydrogen bonds playing an important part in holding the coils in shape. This was a spectacular triumph in itself, but the world of biochemistry was almost overwhelmed when the Caltech team published seven separate papers in the May 1951 issue of the Proceedings of the National Academy of Sciences, laying out in detail the chemical structure of hair, feathers, muscles, silk, horn and other proteins, as well as the alpha-helix structure, as it became known, of the fibres themselves. The fact that the structure was helical certainly set other people thinking about helices as possible structures for other biological macromolecules, but what was equally important was the overwhelming success of the whole approach used by Pauling, combining the X-ray data, model building and a theoretical understanding of quantum chemistry.
As Pauling has emphasized, the alpha-helix structure was determined ‘not by direct deduction from experimental observations on proteins but rather by theoretical considerations based on the study of simpler substances’.18 The example inspired the work of the two people who would very soon determine the structure of DNA itself, snatching the prize from under the noses not just of the Caltech team, but another group working on the problem in London.

It was obvious that Pauling would now turn his attention to DNA, which, as we have seen, had been identified as the genetic material by the 1940s.19 It is also easy to imagine how Lawrence Bragg, now twice pipped at the post by Pauling, might have longed for an opportunity for the structure of DNA to be determined at his own laboratory in Cambridge. In fact, this should not have been possible, not for scientific reasons but because of the way the limited funding available for scientific research in Britain, where the economy was still slowly recovering from the effects of the war, restricted the freedom of researchers. There were only two groups capable of tackling the problem of the structure of DNA, one under Max Perutz at the Cavendish, the other under John Randall (1905–1984) at King’s College in London, both funded by the same organization, the Medical Research Council (MRC); and there was every reason to avoid a duplication of effort which might result in a waste of limited resources. The result was an understanding (nothing formal, but a well-understood gentlemen’s agreement) that the King’s team had first crack at DNA. The snag, for anyone who cared about such things, was that the team at King’s, headed by Maurice Wilkins (1916–), did not seem to be in any great hurry to complete the work, and was also handicapped by the way in which Rosalind Franklin (1920–1958), a young researcher who produced superb X-ray diffraction photographs from DNA and should have been Wilkins’s partner, was largely frozen out by him in a personality clash which seems to have been at least partly based on prejudice against her as a woman.

Francis Crick and James Watson: the model of the DNA double helix

It was the disarray among the team (‘team’ in name only) at King’s that opened a window of opportunity for a brash young American, James Watson (1928–), who turned up in Cambridge in 1951 on a post-doctoral scholarship, fired with a determination to work out the structure of DNA and neither knowing nor caring anything about English gentlemen’s agreements. Watson was given space in the same room as a rather older English PhD student, Francis Crick (1916–), who turned out to have a complementary background and approach to Watson, and was soon recruited to the cause. Crick had started out as a physicist, and carried out war work on mines for the Admiralty. But, like many physicists of his generation, he became disillusioned with physics as a result of seeing its application to the war. He was also, again like many of his contemporaries, influenced by a little book called What is Life?, written by Erwin Schrödinger and published in 1944, in which the great physicist had looked at the problem of what is now called the genetic code from a physicist’s point of view. Although at the time he wrote the book Schrödinger did not know that chromosomes were made of DNA, he spelled out in general terms that ‘the most essential part of a living cell – the chromosome fibre – may suitably be called an aperiodic crystal’, drawing a distinction between an ordinary crystal such as one of common salt, with its endless repetition of a simple basic pattern, and the structure you might see in ‘say, a Raphael tapestry, which shows no dull repetition but an elaborate, coherent, meaningful design’, even though it is made up of a few colours arranged in different ways. Another way of looking at the storing of information is in terms of the letters of the alphabet, which spell out information in words, or a code such as the Morse code, with its dots and dashes arranged in patterns to represent letters of the alphabet. 
Among several examples of the way in which information could be stored and passed on in such an aperiodic crystal, Schrödinger noted that in a code similar to the Morse code but with three symbols, not just dot and dash, used in groups of not more than ten, ‘you could form 88,572 different “letters” ’. It was against this background that the physicist Crick joined the MRC unit at the Cavendish as a research student in 1949, at the late age of 33. For his thesis, he was working on X-ray studies of polypeptides and proteins (and duly received his degree in 1953); but he will always be remembered for the unofficial work he carried out on the side, when he should have been concentrating on his PhD, at the instigation of Watson.
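
Schrödinger’s figure is easy to check. The count of 88,572 comes out if the groups may contain anything from one to ten symbols, so that the total is 3 + 3² + … + 3¹⁰; groups of exactly ten symbols would give only 3¹⁰ = 59,049. The short sketch below does the sum; the function name is simply a label chosen for this illustration.

```python
def count_letters(symbols, max_length):
    """Number of distinct 'letters' that can be spelled with the given number
    of symbols, using groups of any length from 1 up to max_length."""
    return sum(symbols ** length for length in range(1, max_length + 1))


if __name__ == "__main__":
    print(count_letters(3, 10))  # three symbols, groups of up to ten: 88572
    print(3 ** 10)               # groups of exactly ten symbols: 59049
    print(count_letters(2, 4))   # dot and dash, groups of up to four: 30
```

The last line shows why a two-symbol code with groups of up to four is already enough for an alphabet, and the first shows how quickly the number of possible messages grows when a third symbol and longer groups are allowed.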

This work was entirely unofficial – indeed, Crick was twice told by Bragg to leave DNA to the King’s team, and twice ignored him, only gaining any kind of formal approval from the Cavendish professor in the later stages of the investigation when it seemed as if Pauling was about to crack the puzzle. Although the theoretical insight and practical modelling were important, everything depended on the X-ray diffraction photographs, and the first such images of DNA had been obtained by Astbury only in 1938. These were not improved upon (again, in no small measure because of the hiatus caused by the war) until the 1950s, when Wilkins’s group (in particular, Rosalind Franklin, assisted by a research student, Raymond Gosling) took up the subject; indeed, Pauling’s work on the structure of DNA was handicapped by only having Astbury’s old data to work with. Using data Watson had gleaned from a talk given by Franklin at King’s, and which he had not properly understood, the Cavendish pair soon came up with a model for DNA, involving the strands twining around one another with the nucleotide bases (A, C, G and T) sticking out from the sides, which was proudly presented to Wilkins, Franklin and two of their colleagues from London, who were specially invited to Cambridge for the presentation. The model was so embarrassingly bad, and the comments it elicited so acerbic, that even the ebullient Watson retreated into his shell for a time, while Crick went back to his proteins. But in the summer of 1952, in a conversation with the mathematician John Griffith (a nephew of Frederick Griffith, and himself very interested in, and knowledgeable about, biochemistry), Crick tossed out the idea that the nucleotide bases in the DNA molecule might fit together somehow, to hold the molecules together. 
Mildly interested, Griffith worked out from the shapes of the molecules that adenine and thymine could fit together, linking up through a pair of hydrogen bonds, while guanine and cytosine could also fit together, linking up through a set of three hydrogen bonds, but that the four bases could not pair up in any other way. Crick did not immediately appreciate the importance of this pairing, nor the relevance of the hydrogen bonds, and as a newcomer to biochemistry he was unaware of Chargaff’s rules. In a rare piece of serendipity, however, in July 1952 Chargaff himself visited the Cavendish, where he was introduced to Crick and, learning of his interest in DNA, mentioned the way in which samples of DNA always contain equal amounts of A and T, and equal amounts of C and G. This, combined with Griffith’s work, clearly suggested that the structure of DNA must involve pairs of long-chain molecules, linked together by AT and CG bridges. It even turns out that the length of a CG bridge formed in this way is the same as the length of an AT bridge formed in this way, so there would be an even spacing between the two molecular chains. But for months the Cavendish team tossed the idea around among themselves, without doing any serious work on it. They were only galvanized into another frantic burst of model building (Watson did most of the model building, Crick provided most of the bright ideas) right at the end of 1952. In December, Peter Pauling, a graduate student at the Cavendish and son of Linus Pauling, received a letter from his father saying that he had worked out the structure of DNA. The news spread gloom in the Watson–Crick camp, but there were no details of the model in the letter. In January 1953, though, Peter Pauling received an advance copy of his father’s paper, which he showed to Watson and Crick. The basic structure was a triple helix, with three strands of DNA chains wound round one another.
But to their amazement, Crick and Watson (by now a little wiser in the ways of X-ray diffraction patterns) realized that Pauling had made a blunder, and that his model could not possibly match the data being obtained by Franklin.
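
The link between Griffith’s pairing and Chargaff’s measurements can be made concrete with a small sketch. If adenine pairs only with thymine and guanine only with cytosine, then any two-stranded molecule built on that rule must contain equal amounts of A and T and equal amounts of G and C, whatever message is written along one strand. The random sequence and the helper names below are assumptions made purely for illustration, and the sketch ignores the direction in which each strand runs, since that does not affect the counts.

```python
import random
from collections import Counter

# Griffith's pairing: A fits with T, G fits with C, and nothing else fits.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand):
    """Build the strand that pairs, base by base, with the given strand."""
    return "".join(PAIR[base] for base in strand)


def base_counts(strand):
    """Count the bases in the whole two-stranded molecule."""
    return Counter(strand + partner_strand(strand))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    strand = "".join(rng.choice("ACGT") for _ in range(1000))
    counts = base_counts(strand)
    print(counts)
    print("A equals T:", counts["A"] == counts["T"])
    print("G equals C:", counts["G"] == counts["C"])
```

Nothing in the pairing rule forces the amount of A to equal the amount of G, which is why the equalities Chargaff found point so directly at this particular pairing.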

A few days later, Watson took the copy of Pauling’s paper to London to show it to Wilkins, who responded by showing Watson a print of one of Franklin’s best photographs, in a serious breach of etiquette, without her knowledge. It was this picture, which could only be interpreted in terms of a helical structure, plus the Chargaff rules and the relationships worked out by John Griffith, that enabled Crick and Watson to produce their famous model of the double helix, with the entwined molecules held together by hydrogen bonds linking the nucleotide bases in the middle, by the end of the first week of March 1953. As it happens, Pauling was not in the race at the time, since he had not yet realized that his triple-helix model was wrong – indeed, he never really thought of there being a race, since he never knew how close his rivals in England were to the goal. But Franklin, at King’s, was thinking along very similar lines to Crick and Watson (without the physical model building) and was almost ready to publish her own version of the double helix when the news came from Cambridge. She had actually prepared the first draft of a paper for Nature the day before. The burst of activity triggered by Pauling’s premature paper had resulted in Crick and Watson snatching the prize from under the nose not of Pauling, but of Franklin. The immediate result was that three papers appeared alongside one another in the issue of Nature dated 25 April 1953. The first, from Crick and Watson, gave details of their model, and stressed its relationship to the Chargaff rules, downplaying the X-ray evidence; the second, from Wilkins and his colleagues A. R. Stokes and H. R. Wilson, presented X-ray data which suggested in general terms a helical structure for the DNA molecule; the third, from Franklin and Gosling, gave the compelling X-ray data indicating the kind of double-helix structure for DNA proposed by Crick and Watson, and was (although nobody else knew it at the time) essentially the paper Franklin had been working on when the news came from Cambridge. What nobody knew at the time either, or could have guessed from the presentation of the three papers, is that rather than being just a confirmation of the work by Crick and Watson, the paper by Franklin and Gosling represented a completely independent discovery of the detailed structure of DNA, and that the Crick and Watson discovery was largely based on Franklin’s work. It was only much later that it emerged just how the crucial X-ray data got to Cambridge, what a vital role it had played in the model building, and just how badly Franklin had been treated both by her colleague at King’s and by Watson and Crick. Franklin herself, happy to be leaving King’s in 1953 for the more congenial environment at London’s Birkbeck College, never felt hard done by – but then she never knew the whole truth, since she died in 1958, from cancer, at the age of 37. Crick, Watson and Wilkins shared the Nobel prize for Physiology or Medicine just four years later, in 1962.

39. Watson, Crick and their model of a molecule of DNA, 1951.

The genetic code

There are two key features of the double-helix structure of DNA which are important for life, reproduction and evolution. The first is that any combination of bases – any message written in the letters A, C, G and T – can be spelled out along the length of a single strand of DNA. During the 1950s and into the early 1960s, the efforts of many researchers, including Crick (Watson never did anything else to compare with his work with Crick on the double helix) and a team at the Pasteur Institute in Paris, showed that the genetic code is actually written in triplets, with sets of three bases, such as CTA or GGC, representing each of the twenty or so individual amino acids used in the proteins that build and run the body. When proteins are being manufactured by the cell, the relevant part of the DNA helix containing the appropriate gene uncoils, and a string of three-letter ‘codons’ is copied into a strand of RNA (which raises interesting questions about whether RNA or DNA was the first molecule of life); this ‘messenger RNA’, whose only essential difference to DNA is that it has uracil everywhere DNA has thymine, is then used as a template to assemble a string of amino acids corresponding to the codons, which are linked together to make the required protein. It keeps doing this until no more of that particular protein is required. The DNA has long since coiled up again, and after enough protein has been manufactured the RNA is disassembled and its components reused. Just how the cell ‘knows’ when and where to do all this remains to be explained, but the principles of the process were clear by the mid-1960s.
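
The two steps just described, copying DNA into messenger RNA and then reading the RNA three letters at a time, can be sketched as follows. This is a deliberately simplified illustration, not a description of the cell’s machinery: the DNA sequence is invented, it is read along the strand that carries the same letters as the messenger RNA, and the table contains only six entries picked out of the standard genetic code (including the two triplets mentioned above, CTA and GGC, which appear in RNA as CUA and GGC).

```python
# A tiny excerpt of the standard genetic code, written in RNA letters.
CODON_TABLE = {
    "AUG": "Met",   # also the usual 'start' signal
    "CUA": "Leu",
    "GGC": "Gly",
    "UUU": "Phe",
    "AAA": "Lys",
    "UAA": "STOP",  # one of the 'stop' signals
}


def transcribe(dna):
    """Copy a DNA message into messenger RNA: uracil wherever DNA has thymine."""
    return dna.replace("T", "U")


def translate(mrna):
    """Read the messenger RNA in triplets and list the amino acids in order."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein


if __name__ == "__main__":
    gene = "ATGCTAGGCTTTAAATAA"  # an invented stretch of DNA
    mrna = transcribe(gene)
    print(mrna)                  # AUGCUAGGCUUUAAAUAA
    print(translate(mrna))       # ['Met', 'Leu', 'Gly', 'Phe', 'Lys']
```

With four letters taken three at a time there are 4 × 4 × 4 = 64 possible triplets, more than enough to specify twenty or so amino acids, which is why several different triplets can stand for the same amino acid in the full code.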

The other important feature of the DNA double helix is that the two strands are, in terms of their bases, mirror images of one another, with every A on either strand opposite T on the other, and every C opposite a G. So if the two strands are unwound, and a new partner is built for each of them from the chemical units available in the cell (as happens prior to cell division20), in the two new double helices, one of which from each pair goes into each daughter cell, there will be the same genetic message, with the letters of the code in the same order, and with A opposite T and C opposite G. Although the details of the mechanism are subtle and, again, not yet fully understood, it is immediately obvious that this also provides a mechanism for evolution. During all the copying of DNA that goes on when cells divide, there must occasionally be mistakes. Bits of DNA get copied twice, or bits get left out, or one base (one ‘letter’ in the genetic code) gets accidentally replaced by another. None of this matters much in the kind of cell division that produces growth, since all that happens is that a bit of DNA in a single cell (probably not even a bit of DNA that that particular cell uses) has been changed. But when reproductive cells are produced by the special process of division that halves the amount of DNA in the daughter cells, not only is there more scope for mistakes to occur (thanks to the extra processes involved in crossing over and recombination), but if the resulting sex cell successfully fuses with a partner and develops into a new individual, all of the DNA, including the mistakes, gets a chance to be expressed. Most of the resulting changes will be harmful, making the new individual less efficient, or at best neutral; but those rare cases when a DNA copying error produces a gene, or gene package, that makes its owner better fitted to its environment are all that Darwinian evolution needs for natural selection to operate.
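
The copying scheme can be sketched in the same spirit. Each original strand is kept and given a newly built partner, so that both daughter helices carry the original message; a single wrong letter during the building of a partner is enough to model the kind of copying mistake that evolution works on. The sequence, the function names and the way the mistake is introduced are all assumptions made for illustration, and, as before, the direction in which each strand runs is ignored.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def build_partner(strand):
    """Assemble a new partner strand: T opposite A, C opposite G, and so on."""
    return "".join(PAIR[base] for base in strand)


def replicate(helix):
    """Unwind a double helix and build a new partner for each old strand."""
    strand_a, strand_b = helix
    daughter_1 = (strand_a, build_partner(strand_a))
    daughter_2 = (build_partner(strand_b), strand_b)
    return daughter_1, daughter_2


def miscopy(strand, position, wrong_base):
    """Model a copying mistake: one 'letter' replaced by another."""
    return strand[:position] + wrong_base + strand[position + 1:]


if __name__ == "__main__":
    message = "ATGCCGTA"  # an invented message
    original = (message, build_partner(message))
    daughter_1, daughter_2 = replicate(original)
    print(daughter_1 == original and daughter_2 == original)  # True

    # A mistake in one new strand leaves a mismatched pair in that daughter.
    faulty_partner = miscopy(build_partner(message), 2, "A")
    print(build_partner(faulty_partner) == message)  # False
```

Because each daughter helix contains one old strand and one new one, a faithful copy needs no information beyond the pairing rule itself.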

The genetic age of humankind

From the perspective of our theme of how science has altered humankind’s perception of our own place in nature, this is as far as we need to take the story of DNA. A great deal of work has been carried out since the 1960s in determining the composition of genes at the level of DNA codons, and a great deal more has yet to be carried out before we will understand the processes by which some genes control the activity of other genes, and in particular the way genes are ‘switched on’ as required during the complicated process of development of an adult from a single fertilized egg cell. But to see where we fit into the tapestry of life, and to see just how accurate Charles Darwin’s assessment of man’s place in nature was, we can step back from these details and look at the broader picture. From the 1960s onward, as biochemists investigated the genetic material of human beings and other species in more and more detail, it gradually became clear just how closely related we are to the African apes, whom Darwin himself regarded as our closest living relatives. By the late 1990s, it had been established that human beings share 98.4 per cent of their genetic material with the chimpanzee and the gorilla, making us, in popular terminology, only ‘one per cent human’. From various lines of attack, comparing the genetic material of more or less closely related living species with fossil evidence of when those species split from a common stock, this amount of genetic difference can be used as a kind of molecular clock, and tells us that the human, chimp and gorilla lines split from a common stock just four million years ago.

The fact that such a small genetic difference can produce creatures as different as ourselves and chimps already suggested that the important differences must lie in those control genes that regulate the behaviour of other genes, and this interpretation of the evidence has been supported by the evidence from the human genome project, which completed its mapping of all the DNA in every chromosome of the human genome in 2001. The resulting map, as it is sometimes called, simply lists all the genes in terms of strings of codons, A, T, C and G; it is not yet known what most of the genes actually do in the body. But the immediate key feature of the map is that it shows that human beings have only about 30,000 genes, a much smaller number than anyone had predicted, although the 30,000 genes are capable of making at least 250,000 proteins. This is only twice as many genes as the fruit fly, and just 4000 more than a garden weed called thale cress, so it is clear that the number of genes alone does not determine the nature of the body they build. Human beings do not have many more genes than other species, so the number of genes on its own cannot explain the ways in which we are different from other species. Again, the implication is that a few key genes are different in us, compared with our closest relatives, and that these are affecting the way the other genes operate.

Humankind is nothing special

Underpinning all this, though, is the bedrock fact that none of these comparisons would be possible if all the species being investigated did not use the same genetic code. At the level of DNA and the mechanisms by which the cell operates, involving messenger RNA and the manufacture of proteins, as well as in reproduction itself, there is absolutely no difference between human beings and other forms of life on Earth. All creatures share the same genetic code, and we have all evolved in the same way from primordial forms (perhaps a single primordial form) of life on Earth. There is nothing special about the processes that have produced human beings, compared with the processes that have produced chimpanzees, sea urchins, cabbages or the humble wood louse. And our removal from centre stage is just as profound when we look at the place of the Earth itself in the Universe at large.

1.This is not the place to go into the details of why those critics were wrong, but if you want to know how evolution does work to, among other things, turn a deer into a giraffe, the best place to start is with Richard Dawkins’ book The Blind Watchmaker.

2.Quoted by David Young in The Discovery of Evolution.

3.All this applies, of course, to sexual reproduction. Asexual reproduction is, by and large, much simpler, with daughter cells being exact replicas of the parent cell (but see Gribbin and Cherfas, The Mating Game); as a sexually reproducing species ourselves, however, it is sexual reproduction that is central to our own story.

4.Translation from Iltis.

5.Tschermak was a 26-year-old graduate student at the time, who certainly discovered Mendel’s papers independently, but made only a minor contribution in his own right, compared with those of de Vries and Correns.

6.He chose peas because he knew that they had distinctive characteristics that bred true and would be susceptible to statistical analysis.

7.People have 23 pairs of chromosomes; but there is no simple relationship between the complexity of the phenotype and number of chromosomes, and some ferns have more than 300 chromosome pairs in every cell.

8.The pattern is reversed in a few species, and there are other oddities, but these are not important here.

9.Jenaische Zeitschrift für Medizin und Naturwissenschaft, volume 18, p. 276; translation from Lagerkvist.

10.The Cell in Development and Inheritance. Wilson was the professor of zoology at Columbia University, head of the department where Morgan would carry out his fruit fly experiments.

11.After all, it flew in the face of the tetranucleotide hypothesis, and Levene was a towering, influential figure at the Rockefeller until his death in 1940.

12.See Judson.

13.Under normal circumstances; there are always exceptions, but this is not the place to discuss them.

14.To be precise, von Laue devised the experiment, which was actually carried out by Walther Friedrich and Paul Knipping, at the Institute of Theoretical Physics in Munich; this echoes the way Ernest Rutherford devised the experiment carried out by Hans Geiger and Ernest Marsden which revealed the existence of the atomic nucleus.

15.See Judson. This was no idle boast, but a simple statement of fact; Pauling duly received the Nobel prize for this work in 1954. In 1962, he received the Nobel peace prize, for his work in the campaign for nuclear disarmament.

16.In this case, it was actually the entropy that Pauling was investigating, but the principle is the same.

17.See Judson.

18.See Chemistry.

19.Any lingering doubts were removed at around this time by a brilliant experiment in which the Americans Alfred Hershey and Martha Chase, working at the Cold Spring Harbor Laboratory on Long Island, proved that the genetic material of viruses is composed of DNA.

20.The strands do not unwind entirely before copying begins. Instead, as the double helix starts to untwist, new partners begin to build up for each strand, twining around them as the process continues, so that by the time the untwisting of the original helix has finished, the two daughter helices are essentially complete.