From what we have seen in the preceding chapter, the most likely answer to the above question is: by a large number of chemical steps that had a high probability of taking place under the prevailing conditions. Alternative explanations, such as instant creation or the intervention, at some stage, of a fantastic stroke of luck, cannot be excluded as long as the postulated steps have not been identified; but they are heuristically sterile and unsupported by what is known of the nature of life.
The details of the life-generating pathway still elude us and may do so for a long time. But they are not hidden in total darkness. First, we have a pretty good idea of what the starting and ending points were. The former consists almost certainly of the amino acids and other organic materials that arise spontaneously in various parts of the cosmos. To believe otherwise would stretch the boundaries of likelihood excessively, considering the close chemical kinships that exist between those substances and biological constituents and considering their apparent ubiquity. As to the ending point, it is represented by the common ancestor of the whole living world, most likely, as we have seen, a primitive bacterium already endowed with all the basic properties that characterize present-day life.
We know the beginning and the end. But that is not all. We actually know one way of getting from one to the other by natural means. It consists of the universal mechanisms whereby life makes more life on Earth today. A number of investigators engaged in origin-of-life research believe this information to be irrelevant. Prebiotic chemistry, they feel, must have been very different from biochemistry. This is most likely true for the cosmic chemistry to which synthesis of the starting building blocks is attributed. But at some stage, the initial chemistry must perforce have given place to biochemistry. My reasons for assuming, against a widely held opinion, that this transition took place early, rather than late, will become clear as we progress in our analysis of the problem. In the meantime, let us start with something on which virtually everyone agrees.
In the first chapters, attention was drawn several times to the central position of RNA in the blueprint of life. In all known living beings, genetic information flows from DNA to RNA to proteins. Striking in this sequence is the uncircumventable position of RNA, which is the obligatory intermediate in the expression of every bit of genetic information stored in DNA. This expression occurs invariably by transcription of the DNA text into the corresponding RNA, the DNA itself being essentially inert from the functional point of view. In a small number of instances, the transfer of information stops there. The RNA transcript plays a functional role by itself, as a ribozyme, or catalyst of a reaction. Most often, the RNA acts as a messenger. It instructs the synthesis of a protein, which, itself, by its structural qualities, or by its enzymatic properties, or by both, plays in the organism the role governed by the transcribed DNA segment.
It is striking and, no doubt, significant that the protein synthesis machinery actually contains RNA molecules as essential components. These are, in addition to messenger RNAs, the ribosomal RNAs, which are key catalytic constituents of the particles (ribosomes) on which proteins are assembled, and the transfer RNAs, those remarkable molecules that serve both to provide amino acids to ribosomes in a form suitable for the assembly of proteins and to read, by anticodon-codon interactions, the instructions borne by the messenger RNAs.
Compared with these crucially important functions, those fulfilled by DNA would seem to be rather minor, being restricted to the storing of information in a replicatable (and transcribable) form.1 In reality, this function could very well be carried out by RNA itself, which we have seen can be replicated, similarly to DNA and according to the same kind of complementarity relationships, by some viral enzymes. This does not mean that DNA is useless. Its dominant presence in all living beings is sufficient proof of its indispensability. But what is eloquently suggested by the facts is that RNA preceded DNA in the development of life and played for a while the role of replicatable repository of genetic information carried out today by DNA.
It seems likely that RNA preceded proteins as well, considering the importance of the functions accomplished by RNAs in protein synthesis. Here, however, a clarification is in order. Proteins, as we have seen, are made from 20 different kinds of amino acids, which are the same in the whole living world. Now, many other amino acids exist, even in the products of cosmic chemistry and in those of the simulation experiments that made Stanley Miller famous. Some of these amino acids are found in biological substances other than proteins, sometimes even linked by peptide bonds of the kind that serve to join amino acids in proteins. At some stage in the development of protein synthesis, there must therefore have been a sort of selection that retained certain amino acids as building blocks for the RNA-dependent machinery and excluded others. We shall see later how this selection could be explained. For the time being, let us just remember that a distinction must be made between peptides and proteins. Peptides comprise all the substances, including proteins, consisting of amino acids joined together by peptide bonds. Proteins represent a subset of peptides, containing molecules of large size constructed exclusively with the 20 so-called proteinogenic amino acids, for which there are codons in the genetic dictionary. When proteins are said to have been preceded by RNA, it is that subset that is referred to, not the complete set of peptides. Indeed, it is very possible (I tend to say probable) that certain peptides may have preceded RNA, as will be seen later.
These considerations have led to the notion of an “RNA world,” a term coined in 1986 by the American chemist Walter Gilbert, inventor of one of the first methods for sequencing DNA. According to Gilbert’s definition, the RNA world represents a hypothetical stage in the development of life in which neither DNA nor proteins existed and RNA molecules alone carried out the functions of these two substances. They served as replicatable support for genetic information and accomplished by their catalytic (ribozymatic) properties “all the chemical reactions necessary for the first cellular structures.” This notion has met with enormous success and goes on inspiring numerous experimental attempts aimed at extending by engineering the catalytic capacities of RNAs, which, in nature, are largely restricted to protein synthesis and RNA processing.
We shall see that there are some difficulties with the RNA world as defined by Gilbert. But the foundations of this notion seem indubitable. There is every reason to believe that the emergence of RNA was a crucial step in the development of life, which preceded and most probably determined the appearance of DNA and of proteins. But, before RNA, there must have been something else that prepared and caused the advent of this key substance.
Incipient life, unless guided by a directing principle of the “intelligent design” kind, excluded a priori from our working hypothesis, did not have available the information we possess. It did not “know” it was going to invent RNA and, with it, a new language that would affect the whole history of our planet, perhaps even of the universe. It did no more than blindly follow a pathway imposed by the physical and chemical conditions that prevailed locally. It is not objectionable for us to call on our knowledge of the outcome of those events in our attempt to retrace their course, provided we keep clearly in mind that only efficient causes, not final causes, can have determined them. The problem, it must be acknowledged, is of daunting complexity. Without going into details of chemical structure, let it simply be said that the spontaneous genesis in some “primeval soup” of a molecular arrangement like RNA defies chemical common sense. Indeed, it has so far defied the ingenuity of chemists.
For several decades, some of the best chemists in the world have vigorously addressed the problem of the prebiotic synthesis of RNA. Until now, their efforts, however determined and imaginative their approaches, have not been encouraging. Experts are beginning to lose confidence in an undertaking aimed directly at RNA. They now toy with the idea that RNA may have been preceded in its primordial functions by structurally analogous compounds likely to have arisen more easily.2
Some unconditional supporters of the original version of the RNA world take refuge in the notion of a “flick of chance.” They imagine a few RNA molecules arising somewhere by an almost miraculous combination of circumstances. Such an event would, in their eyes, have been enough for the whole process to be launched, thanks to the ability of RNA to self-replicate and display catalysis. Such a view does not hold water. First, the very hypothesis of RNA arising by some chance event is chemically implausible. Moreover, having a little RNA obviously does not suffice for making more. The term “self-replication” is misleading in this respect, as it confounds two entities: information and synthesis. RNA provides only the former. For the latter, complex building blocks, energy, and strong chemical support are required. These conditions must have been satisfied already at the time RNA first appeared, since this substance could not have been replicated otherwise. They manifestly continued to prevail during all the time—at least centuries, if not millennia or more, as we shall see—when RNA dominated the scene. We are far from the fortuitously stabilized and amplified product of some random fluctuation.
If we follow this reasoning, we arrive at the conclusion that RNA arose in a chemical environment that was already of considerable complexity and included all the elements needed for this event and its perpetuation. It is interesting to recall in this connection the remarkable relationship, already mentioned previously,3 that exists in today’s living world between information and energy. At the heart of both we find ATP and its analogues, GTP, CTP, and UTP.
Indeed, in the synthesis of RNA, those four molecules provide the nucleotide units—AMP, GMP, CMP, and UMP—that make up the building blocks of any RNA molecule. In this reaction, triphosphates (NTPs) become monophosphates (NMPs), the two supernumerary phosphates being released as inorganic pyrophosphate (PPi), while enough energy is made available to support the linking of the nucleotides to each other in the RNA chain.
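The chain-lengthening step just described can be written schematically as follows. This is standard biochemical shorthand rather than notation used elsewhere in this book; the index n, an assumption of the sketch, simply counts the nucleotide units already present in the chain.

```latex
\[
(\mathrm{NMP})_{n} + \mathrm{NTP} \;\longrightarrow\; (\mathrm{NMP})_{n+1} + \mathrm{PP_i}
\]
```

Each turn of the reaction adds one NMP unit to the chain and releases one pyrophosphate.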
On the other hand, we have seen that ATP is the universal conveyer of biological energy. What has been mentioned only in passing is that ATP is sometimes replaced in this function by one of its analogues. Thus, GTP fuels the mechanism whereby the messenger RNA tapes are moved through the ribosomes. CTP provides energy for the formation of phospholipids, the main constituents of biological membranes (see Chapter 6), while UTP serves a similar function in the synthesis of a number of complex substances formed from sugar molecules (polysaccharides). And, as just mentioned, the four NTPs also provide the energy for the assembly of RNA (analogous reactions are involved in DNA synthesis).
There can be no doubt: biological energy and information are intimately linked in today’s living world. In all likelihood, this relationship goes back to the very origin of the processes we are attempting to explain. Such being the case, two possibilities may be considered, depending on whether information is taken to have arisen from energy, or the opposite. We shall ignore, for simplicity’s sake, the third possibility attributing the origin of both energy and information to a phenomenon without equivalent in present-day life. This question is rarely discussed. But it seems to me that if one defends the notion of a primitive RNA, fruit of an extraordinary combination of circumstances or of some unknown chemistry that remains to be discovered, the logical implication is to assume that ATP and its analogues arose from RNA and, therefore, that information preceded energy (in its present form). Personally, I find this possibility highly unlikely. Given the need, underlined earlier, for a solid chemical underpinning to support the RNA world during the whole of its long evolution, it seems to me much more plausible to suppose that ATP and its analogues belonged to this underpinning and, perhaps, already served in it as energy vehicles. Consequently, to resolve the RNA enigma, we must go back to the primitive chemistry that functioned, presumably with the help of ATP and its analogues, before RNA existed. What must be searched for first is how some sort of primitive metabolism, a protometabolism, could have arisen spontaneously under prebiotic conditions.
A detailed examination of the chemical reactions that may have composed protometabolism is out of the question. Solid knowledge on this subject is virtually nonexistent, anyway, and the speculations that stand in lieu of it are almost as numerous and varied as the investigators interested in the problem. I shall content myself with a general remark. It expresses a personal and far from widely accepted opinion, which, however, I will try to justify later: protometabolic pathways prefigured the pathways of present-day metabolism. In other words, the signposts mentioned in the beginning of this chapter must be heeded right from the start.
This affirmation, which I have called the congruence principle, implies as an important corollary that present-day metabolism holds traces of the primitive chemistry and could serve as a valuable source of inspiration in the elaboration of theories and, especially, in the design of experiments. Being, unfortunately, past my time for the latter, I must content myself with the former.
The main lesson of metabolism was underlined in Chapter 1 (p. 19): “virtually none [of the reactions of metabolism] would take place if the participating substances were merely mixed together.” It is for this reason that most experts are skeptical of the congruence principle. In their opinion, prebiotic chemistry, not having available the catalysts of biochemistry, could not possibly reproduce the reactions of biochemistry. But one may, instead, wonder whether appropriate catalysts could not have been present in the cradle of life.
Needless to say, the search for possible prebiotic catalysts has always been an important preoccupation of origin-of-life investigators. But their search has, for obvious reasons, been largely restricted to the mineral world; and it has not been entirely fruitless. Clays, in particular, have proved capable of catalyzing the linkage of activated nucleotides into small RNA-like associations, whereas certain iron-sulfur combinations have been found to promote some reactions involving electron transfers. However, nothing comparable to even a very primitive protometabolism has ever been reproduced.
In nature, as we have seen, metabolic reactions are catalyzed mostly by protein enzymes, often acting in conjunction with metals and with organic coenzymes. Catalytic RNAs (ribozymes) are involved to a small extent. In the original RNA-world view of Gilbert, ribozymes are taken to do the entire job. It is, however, obvious that RNAs could not have served as catalysts in a pre-RNA protometabolism. Furthermore, the catalytic properties so far observed with ribozymes are rather limited; they do not show the diversity one would be entitled to expect for a metabolism-like system.4 These facts have not, however, damped the ardor of the more enthusiastic supporters of the original version of the RNA world. The possibility that a wider gamut of catalytic RNAs may have existed in prebiotic days has prompted a number of highly ingenious efforts at extending the catalytic potentialities of RNA molecules by bioengineering techniques. These experiments have yielded fascinating results, but their relevance to the origin of life is questionable.
Strangely enough, proteins—or rather peptides, since true proteins must have come later (see p. 59)—have not, by far, enjoyed the same popularity as RNAs as potential prebiotic catalysts. This is surprising, considering the fact that amino acids may have been abundantly present in the prebiotic world, where they could have associated into peptides by relatively simple mechanisms.5 In addition, peptides, being closely related to proteins, are most likely to include molecules with catalytic properties similar to those of protein enzymes.
On the basis of these considerations, I proposed, a number of years ago,6 that the catalysts of protometabolism may have been peptides, or, rather, multimers, as I have called them to indicate that they could have contained substances other than amino acids but chemically close to them, for example, hydroxy acids. An objection to this hypothesis is that the postulated molecules would probably have been too small to display the required catalytic properties. But this objection is not necessarily valid since, as will be seen in the next chapter, the first protein enzymes were probably quite short, little more than about 20 amino acids long. This indicates that peptides of such short length, perhaps even shorter, may be endowed with catalytic activities, rudimentary to be sure, but sufficient to serve as primitive enzymes. Another objection is that a mixture containing all the required catalysts, assuming it had arisen by some chance circumstance, is not likely to have been faithfully reproduced for a long enough time without some replication mechanism. This objection, however, applies to any model of pre-RNA protometabolism, which would be subject to the same constraints. Environmental stability is a common condition of all models postulating a natural development of life.
The fact remains that the multimer hypothesis is no more than a conjecture and will remain so as long as it has not been subjected to experimental testing. This has become possible. Techniques now exist for the preparation of mixtures containing a large number of peptides of different structure. It would be possible to look for enzyme-like activities in such mixtures. This is what I would do if I were 20 years younger.
Leaving aside the question of mechanisms, let us return to the central notion, based on the congruence principle, of a metabolism-like protometabolism. The assumption is that ATP and other NTPs somehow arose (the details of possible reactions are beyond the scope of this book) as products of this protometabolism and became integral parts of it, possibly participating in reactions that prefigured their future bioenergetic role. It would not be surprising in such a context if some of the NTPs reacted together to make RNA-like associations.7 This, it should be noted, would be a purely chemical reaction, explainable simply by the presence of a suitable catalyst. For the associations to be authentic RNAs, there would have to be intervention of a template molecule interacting with the catalyst so as to dictate, by base pairing, the choice of the reacting NTPs. UTP would be selected in front of A in the template, CTP in front of G, GTP in front of C, and ATP in front of U (see Chapter 2). Easy to imagine, you might say. But watch out! Here is where hindsight can be dangerously misleading.
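For readers who like to see such rules spelled out, the pairing scheme above can be sketched in a few lines of Python. This is only an illustration: the names PAIRING, select_ntps, and complement are invented for the purpose, and the sketch ignores strand orientation and everything about the chemistry except the choice of NTP opposite each template base.

```python
# Pairing rules taken from the text: template base -> NTP selected opposite it.
PAIRING = {"A": "UTP", "G": "CTP", "C": "GTP", "U": "ATP"}


def select_ntps(template: str) -> list[str]:
    """Return the NTP chosen opposite each template base, in template order."""
    return [PAIRING[base] for base in template]


def complement(template: str) -> str:
    """Base sequence of the strand assembled on the template.

    Each incorporated unit is named by the first letter of its NTP
    (UTP contributes U, and so on). Strand orientation is ignored here.
    """
    return "".join(ntp[0] for ntp in select_ntps(template))


if __name__ == "__main__":
    template = "AUGC"
    print(select_ntps(template))
    print(complement(template))
```

Applying complement twice gives back the starting sequence, which is the property that makes alternating rounds of copying possible.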
Why just A and U, G and C? The possibility that chemical determinism happened to be such as to single out those two pairs of complementary bases smacks perilously of pre-determinism. Do we have to assume that “intelligent design” prepared the way to information transfer by guiding the atoms to combine in just the kind of molecules that allow pairing? Not necessarily. It seems much more likely, if, as would be expected, relatively unspecific chemistry was involved, that a whole array of kindred molecules8 were produced besides the four canonical bases. Molecules of this kind exist today in living organisms.9 Rather than endowing prebiotic chemistry with prophetic insight, it seems more probable that it indiscriminately made a variety of compounds of the same kind, including their NTP derivatives, and that, in turn, the RNA-like products of NTP combination included a “gemisch” of many different assemblages. If this is what happened, all we need is a couple of trivial assumptions, and the RNA “miracle” is explained.
Just imagine—surely a plausible possibility on a purely statistical basis—that a few molecules in the gemisch happened to contain, like authentic RNA, no other bases than A, G, C, and U. If such molecules could interact with the catalyst responsible for the assembly reaction in the manner postulated above, then complementary molecules likewise containing only the four canonical bases would be formed. These molecules, in turn, could induce the reproduction of the original molecules, and so on. Continuation of this phenomenon would progressively lead to the formation of an increasing number of complementary molecules of both kinds. What we have is selective replication and amplification of the rare true RNAs present in the mixture.
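The amplification argument can be made concrete with a toy calculation, offered as a sketch under loudly stated assumptions rather than as a model of real chemistry. The pool composition, the letters X and I standing for noncanonical bases, and the rule that every true RNA is copied exactly once per round are all invented for illustration.

```python
from collections import Counter

CANONICAL = set("AGCU")
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_true_rna(seq: str) -> bool:
    """True if the molecule contains only the four canonical bases."""
    return set(seq) <= CANONICAL


def replicate_round(pool: Counter) -> Counter:
    """One round of copying: each true RNA yields one complementary molecule.

    Molecules containing other bases cannot act as templates in this toy
    model, so they persist but are never copied (an assumption).
    """
    new_pool = Counter(pool)
    for seq, count in pool.items():
        if is_true_rna(seq):
            copy = "".join(COMPLEMENT[base] for base in seq)
            new_pool[copy] += count
    return new_pool


def true_rna_fraction(pool: Counter) -> float:
    """Share of the pool made up of molecules with canonical bases only."""
    total = sum(pool.values())
    return sum(n for seq, n in pool.items() if is_true_rna(seq)) / total


if __name__ == "__main__":
    # Hypothetical gemisch: 99 mixed molecules and a single true RNA.
    pool = Counter({"AXGI": 40, "IIXC": 30, "GXXA": 29, "AUGC": 1})
    for _ in range(10):
        pool = replicate_round(pool)
    print(round(true_rna_fraction(pool), 3))
```

Starting from one true RNA among a hundred molecules, the two complementary strands double at every round while the rest of the mixture stays put, so that they soon make up most of the pool.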
This mechanism thus accounts in one shot and without calling on any special intervention, whether of chance or of the deity, for the birth of RNA and for its first replication. As proposed, RNA no longer arises as the product of an almost miraculous event. It is formed by chemistry, as required. But it becomes dominant thanks to a new process, molecular selection, based itself on replicatability. This was a decisive turning point in the development of life. Until then, chemistry was solely in charge. To be sure, continuity was guaranteed by the strict determinism to which chemistry is subjected; but it was, for the same reason, exposed to the vagaries of the environment. With the advent of replication, the faithful reproduction of molecules became possible even under changing environmental conditions. The first seed of genetic continuity was planted.
But there is more. Primitive replications were no doubt very imprecise, continually producing imperfect replicas of the models. Among these faulty copies, there must have been some that, for various reasons, were more resistant to degradation than the originals or were replicated faster than them by the catalyst responsible for the synthesis of the first RNAs. In both cases, the molecules concerned tended to become more abundant than the others. As a consequence, the initial RNA mixture arising from the first products of prebiotic chemistry was to become progressively dominated by RNA molecules that combined stability and replicability in optimal fashion.
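The outcome of such a competition can be illustrated with a second toy calculation. Everything in it is hypothetical: the variant names, the replication and degradation rates, and the simple rule that each molecule type grows by a factor of 1 + replication rate - degradation rate per generation. Copying errors, which in reality keep generating new variants, are left out; the variants are simply assumed to be present in trace amounts from the start.

```python
# Hypothetical variants: name -> (replication rate, degradation rate) per generation.
VARIANTS = {
    "original": (0.50, 0.20),
    "sturdier": (0.50, 0.10),          # same copying speed, degraded more slowly
    "faster": (0.70, 0.20),            # copied faster, same stability
    "fast_but_fragile": (0.80, 0.45),  # copied fastest, but degraded quickly
}


def next_generation(shares: dict[str, float]) -> dict[str, float]:
    """Grow each variant by (1 + replication - degradation), then renormalize."""
    grown = {
        name: share * (1.0 + VARIANTS[name][0] - VARIANTS[name][1])
        for name, share in shares.items()
    }
    total = sum(grown.values())
    return {name: amount / total for name, amount in grown.items()}


def run(shares: dict[str, float], generations: int) -> dict[str, float]:
    """Apply next_generation repeatedly and return the final shares."""
    for _ in range(generations):
        shares = next_generation(shares)
    return shares


if __name__ == "__main__":
    # The original sequence starts a thousand times more abundant than each variant.
    start = {"original": 1000.0, "sturdier": 1.0, "faster": 1.0, "fast_but_fragile": 1.0}
    final = run(start, 200)
    print(max(final, key=final.get))
```

With these made-up numbers the variant with the best balance of copying speed and stability ends up dominating, even though another variant is copied faster and the original starts far ahead.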
This is not just a theoretical vision. The molecular selection of RNA can actually be reproduced in the laboratory.10 This feat was accomplished for the first time in the 1960s by an American biochemist, Sol Spiegelman, and has since been repeated under various conditions by a number of investigators, among them the German chemist Manfred Eigen, who has made a particularly detailed study of the phenomenon. These investigations have clearly established that the mechanism involved does, indeed, consist of a molecular selection entirely ruled by the combined criterion of stability-replicability of the molecules.
This mechanism, it should be emphasized, represents at the molecular level exactly that imagined by Darwin to account for biological evolution: diversification by modifications of the material responsible for hereditary continuity, natural selection of the modified forms most apt to survive and multiply under prevailing conditions, and amplification of those forms. But molecules, not organisms, are selected in this way, with RNA as first fruit of this fundamental mechanism.
The molecular selection of RNA taking place under the conditions of the prebiotic era must have led in the end to a dominant sequence that henceforth remained unchanged—the one combining stability and replicability optimally for those conditions—accompanied by a continually shifting cohort of sequences modified by replication accidents. Eigen has called such a mixture a “quasi-species.” He has arrived, by investigations too specialized to be described here, at the conclusion that the dominant molecule in the quasi-species formed by the first RNAs, the “UrGen,” or original gene, probably corresponded to the ancestor, as identified by molecular phylogeny analyses (see Chapter 7), of the whole family of transfer RNAs. It will be seen that this identity could be highly significant.
The hypothetical scenario just sketched out—or any other obeying the same criteria—shows how incipient life could have entered a phase that could rightly be called “RNA world,” though not—at least, not yet—an RNA world supported by RNA catalysts, as proposed by Gilbert, which it obviously could not be at birth. RNA could not have served originally to make RNA. Whether it ever did cannot be excluded but is so far entirely unsupported by evidence. What seems highly probable, on the other hand, is that RNA served to make proteins. This will be the subject of the next chapter.