Many four-year-olds delight in asking “Why?” and when one offers an explanation, respond with another “Why?” requesting explanations of explanations until explanatory exhaustion. Aristotle and Thomas Aquinas used the threatened infinite regress of causes of causes to demonstrate the existence of an unmoved mover, but a child recognizes the game could go on forever.
The central conceit of The Life and Opinions of Tristram Shandy, Gentleman (Sterne 1767) is that the novel sets out to recount the life of its hero but, to place him in context and to explain the causes of his character, the text wanders with digressions, and digressions on digressions, with a regress of causes and their causes so that, in the end, we learn little about the eponymous Tristram. A seminal event occurs at the moment of Tristram’s conception, when his mother asks his father, “Pray, my Dear, have you not forgot to wind up the clock?” This question “scattered and dispersed the animal spirits, whose business it was to have escorted and gone hand in hand with the HOMUNCULUS, and conducted him safe to the place destined for his reception.” Many oddities of Tristram’s character stemmed from this minor, but far from inconsequential, perturbation.
It is not implausible, indeed it is probable, that whatever my father and mother were thinking during the consummatory act before my conception had an influence on their posture and on which of the myriad sperm in my father’s ejaculate won the race to the ovum in my mother’s oviduct. “Replaying the tape of life” retells a story in every detail because the sequence of causes remains unchanged. But the first time the tape was played, there was no way of knowing, until it happened, which of my father’s sperm would fertilize my mother’s egg. There is a single causal narrative out of the past but a beyond astronomical proliferation of possibilities into the future. One can explain with much greater confidence than one can predict.
Taking a step further back from my conception, a complex convergence of molecular events determined the location of chiasmata in the spermatocyte that gave rise to my haploid paternal progenitor. If any one of thirty-odd chiasmata had occurred a mere megabase to either side, then the child conceived would not have inherited my particular set of genes, and the same will have been true of the conception of every one of my ancestors. But a molecular explanation of the location of untold chiasmata would comprise only an infinitesimal part of what a complete causal account of my ancestry would entail. My father’s father was an ambulance driver at the second battle of Villers-Bretonneux. So, an account of his survival, where so many others died, would need to explain the trajectories of innumerable projectiles and their fragments, and so on to the endlessly disputed causes of the First World War.
The point of this reductio ad absurdum is that, while all evolutionary processes are, in principle, reducible to physical causes, no feasible account can be causally complete. Every story needs a place to begin which leaves many things unsaid. So too, all scientific explanations include items that, for present purposes, are accepted without explanation.
It were infinite for the law to judge the cause of causes, and their impulsions one of another; therefore it contenteth itself with the immediate cause, and judgeth of the acts by that, without looking to any further degree.
—Francis Bacon (1596)
In pre-Classical Greek, aition and aitia had connotations of responsibility, guilt, blame, and accusation (Frede 1980; Pearson 1952). Aristotle’s aitia was translated as classical Latin causa, a word that could refer to a lawsuit as in nemo iudex in causa sua (“No man should be a judge in his own cause”). English cause was adopted from medieval Latin around 1300 and retains legal uses as in probable cause. A similar association of cause and culpability occurs in Germanic languages. German Ursache (cause, reason, or motive) is related to Anglo-Saxon sake as in for the sake of. “Sake” could refer to a lawsuit, complaint, accusation, or guilt. Thus, concepts of cause appear to have evolved from proto-legal notions of blameworthiness. A cause was something that could be held responsible.
Aristotle recognized four kinds of aitia, traditionally translated as material, efficient, formal, and final causes. Bacon (1605) embraced material and efficient causes as the proper domain of physics but banished formal and final causes to the realm of metaphysics. Aristotelian pluralism was supplanted by a monistic concept of causation of which efficient cause was the dynamical aspect and material cause the physical substrate. In the new mechanical philosophy, form lacked independent potency but was “confined and determined by matter.” Final causes were disparaged as an encumbrance to the advancement of learning, as “remoraes and hindrances to stay and slug the ship from further sailing.”
The fundamental incompleteness of all causal stories has coexisted with faith in explanatory reduction because of scientists’ confidence that a physical explanation could, in principle, be given of the things that are left unexplained in each particular causal account. For logical consistency, it should be scientifically and philosophically legitimate to invoke things that look like formal or final causes if these could, in principle, be explained by physical and material causes. The original intent of the paper that became this chapter was to defend the use of formal causes (information) and final causes (functions) in evolutionary explanation, but my purposes evolved in the writing. Formal causes will be presented as abstractions of material causes and final causes as efficient ways of talking about efficient causes. Form can be grounded in material cause because the matter of evolved beings possesses intricate fine structure that embodies experience of what has worked in the past. Purpose can be grounded in efficient cause because current means are explained by past ends via the recursive physical process we call natural selection.
Let’s think of eggs.
They have no legs.
Chickens come from eggs
But they have legs. The plot thickens:
Eggs come from chickens,
But have no legs under ’em.
What a conundrum!
—Ogden Nash (1936)
Consider a causal chain: A causes B causes C causes D causes E. Prior things cause posterior things. C is an effect of A and B but a cause of D and E. So much is simple. But what happens when things recur? . . . $A_{i-1}$ causes $B_{i-1}$ causes $C_{i-1}$ causes $D_{i-1}$ causes $E_{i-1}$ causes $A_i$ causes $B_i$ causes $C_i$ causes $D_i$ causes $E_i$ causes $A_{i+1}$ causes $B_{i+1}$ causes $C_{i+1}$ causes $D_{i+1}$ causes $E_{i+1}$, with the recursion continuing into the indefinite past and indefinite future. Tokens of each type occur both before and after tokens of each other type. A token is either cause or effect of another token—it cannot be both—but cause and effect are inextricably entangled once one attempts to generalize and describe lawful relations among types. Types are both causes and effects of each other (and of themselves). A linear causal chain was chosen for simplicity of exposition but similar arguments could be developed for multidimensional causal webs.
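The contrast between tokens and types in a recurring chain can be made concrete with a small sketch. It assumes, for illustration only, that every earlier token counts as a cause of every later token (as when C is called an effect of both A and B); the function names and the choice of three cycles are hypothetical.

```python
from itertools import product

TYPES = ["A", "B", "C", "D", "E"]

def token_chain(cycles):
    """Tokens (type, cycle index) listed in causal order for a recurring chain."""
    return [(t, i) for i in range(cycles) for t in TYPES]

def token_causes(chain):
    """Pairs (cause, effect), assuming each earlier token is a cause of each later one."""
    return {(chain[a], chain[b])
            for a in range(len(chain))
            for b in range(a + 1, len(chain))}

def type_causes(pairs):
    """Project token-level pairs onto types by dropping the cycle index."""
    return {(cause[0], effect[0]) for cause, effect in pairs}

token_pairs = token_causes(token_chain(cycles=3))
type_pairs = type_causes(token_pairs)

# Among tokens, no pair is both cause and effect of each other.
token_symmetric = any((e, c) in token_pairs for c, e in token_pairs)
# Among types, every type is cause and effect of every type, itself included.
type_complete = type_pairs == set(product(TYPES, repeat=2))
```

Under these assumptions `token_symmetric` is false while `type_complete` is true: the asymmetry of cause and effect holds among tokens and dissolves once the tokens are generalized into types.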
“Self-evident” distinctions between cause and effect are far from obvious in recursive processes. As one moves back along a chain of physical causation, one encounters things that resemble things to be explained. Eggs produce chickens and chickens produce eggs. Genes are causes of phenotypes and phenotypes causes of which genes replicate. What sound is input and what sound is output when an amplifier feeds back?
A phenotypic effect (P) may be viewed as both a cause and consequence of a genotypic difference (G) when both are considered as types. A complete causal account of $P_i$ (subscripts indicate tokens) would include many prior occurrences of P plus many prior occurrences of G and would resemble a complete causal account of $G_i$. If $P_{i-1}$ causes $G_i$ causes $P_i$ causes $G_{i+1}$, then it is a matter of preference whether P is considered the cause and G the effect or the other way round. A molecular biologist argues from G to P when explaining how gene expression determines phenotype whereas an evolutionary biologist argues from P to G when explaining why a gene has its particular effects. The former mode of explanation is commonly accepted as unproblematic whereas the latter is rejected as teleological and unscientific. But this is no more than a convention of scientific storytelling. Phenotypes are among the efficient causes of genotypes (the central dogma of molecular biology notwithstanding).
Two other points are worth making briefly. First, a recursive non-equilibrium system must be thermodynamically open because a closed system cannot return to an earlier state (the entropy of the closed system increases until thermal equilibrium). Second, evolution requires heritable imperfections of recursion or nothing can change.
Information can exist only as a material pattern, but the same information can be recorded by a variety of patterns in many different kinds of material. A message is always coded in some medium, but the medium is really not the message.
—George C. Williams (1992)
Most eukaryotic genomes harbor retroelements that replicate their DNA via RNA intermediates or, what amounts to the same thing, replicate their RNA via DNA intermediates. Nothing structural persists in this process. DNA is “copied” into RNA and then RNA is “copied” into DNA at a new location in the genome (Finnegan 2012).
An LTR retrotransposon can serve as a paradigm. In its guise as double-stranded genomic DNA, the retrotransposon is transcribed by host-encoded RNA polymerase from an antisense-strand of DNA into a sense-strand of RNA. The resulting RNA can have two functional fates: it can be processed into a messenger RNA (mRNA) that is translated by ribosomes into gag and pol proteins; or it can be used as genomic RNA that is packaged with pol and gag proteins as an infective particle. Pol is a remarkable gadget: acting as a reverse transcriptase, pol synthesizes an antisense-strand of DNA complementary to the genomic RNA; acting as an RNAse, pol degrades the RNA template; acting as a DNA polymerase, pol synthesizes a sense-strand of DNA from the antisense-strand; and acting as an integrase, pol inserts the double-stranded DNA into a new site in “host” DNA (Finnegan 2012). A sense-strand of RNA can be used as a template to make proteins (translation) or antisense DNA (transmission), but the same copy cannot perform both functions.
Retrotransposons trace their origins back before the beginning of cellular life, but an active retrotransposon cannot reside long at any one place in the genome. At each location its DNA is inserted, natural selection favors mutations that inactivate and degrade retroelement functions because retrotransposition is costly to organismal fitness. Nevertheless, retrotransposition persists, because reverse-transcribed DNA inserts at new sites faster than mutations degrade source DNA. Mutations that enhance transposition disperse to new sites while mutations that reduce transposition accumulate at old sites. An active element must stay one jump ahead of inactivating mutations. It is a restless wanderer, leaving crumbling genomic footprints at each step along the way (Haig 2012, 2013).
Retrotransposition involves changes in substance and material form. Consider a nine-nucleotide segment of gag. 5′–CGCACCCAT–3′ (antisense DNA) is transcribed into 5′–AUGGGUGCG–3′ (RNA), which can be translated as methionine–glycine–alanine (peptide) or reverse transcribed as 5′–CGCACCCAT–3′ (antisense DNA). The latter is then used to synthesize 5′–ATGGGTGCG–3′ (sense DNA). Sense and antisense DNA differ, not only in the use of complementary bases, but also because complementary bases occur in reverse order relative to the sugar-phosphate backbone because of antiparallel pairing. Sense DNA and RNA differ in the substitution of thymine (T) for uracil (U) and in the use of deoxyribose rather than ribose in the backbone. RNA and peptide are chemically chalk and cheese.
The above paragraph was written to disconcert readers familiar with the conventions of representing Watson–Crick base pairing because all nucleotide sequences were written in the direction 5′ to 3′ (the direction of synthesis of the individual strand). Sense and antisense sequences are synthesized in opposite directions with the 5′ end of one complementary to the 3′ end of the other (antiparallel pairing). For this reason, one sequence is usually represented in the 5′ to 3′ direction and the other in the 3′ to 5′ direction so that the complementary bases (A with T and G with C) occur at the same relative position in the represented sequences. My aim in violating the representational convention was to emphasize the chemical differences between a sense sequence and its interpretation as an antisense reverse complement.
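The two ways of writing a complementary strand can be checked with a few lines of Python. This is a minimal sketch that treats strands as plain strings written 5′ to 3′ unless stated otherwise; the function names are my own and the enzymology is ignored entirely.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(dna):
    """The complementary strand, itself written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def transcribe(template):
    """RNA (5'->3') copied from a template strand that is written 5'->3'."""
    return reverse_complement(template).replace("T", "U")

def conventional_pairing(strand):
    """The complementary strand written 3'->5', so paired bases share a position."""
    return "".join(COMPLEMENT[base] for base in strand)

antisense = "CGCACCCAT"                 # 5'->3', as in the text
sense = reverse_complement(antisense)   # "ATGGGTGCG"
rna = transcribe(antisense)             # "AUGGGUGCG"
aligned = conventional_pairing(sense)   # "TACCCACGC", to be read 3'->5'
```

The string `aligned` is simply `antisense` read backward: the usual convention hides the reversal that the 5′ to 3′ notation makes visible.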
Many things within cells are made of DNA, RNA, or protein. Many RNAs are transcribed and many proteins translated. What allows us to pick out a retrotransposon as a nameable entity from these other components and activities? What thing can be held responsible? The retrotransposon is distinguished from other cellular components because it possesses distinct criteria for evolutionary success. Sense DNA, antisense DNA, sense RNA, and peptide are linked by complex causal dependence but are structurally unrelated. Each can be considered to represent the others as material avatars of an immaterial gene. The “information” that is the retrotransposon must repeatedly change substance and location to persist in an unbroken chain of recursive representation. Representation presents again in different form, but prior forms are present again when forms recur. There is, in principle, a complete causal account that invokes nothing but efficient and material causes, and in which there is recurrence without continuity of any material thing, but one cannot give a meaningful account of a retrotransposon without reference to its telos and eidos. The forms are shadows of shadows.1
If it be true that the essence of life is the accumulation of experience through the generations, then one may perhaps suspect that the key problem of biology, from the physicist’s point of view, is how living matter manages to record and perpetuate its experiences.
—Max Delbrück (1949)
Medieval Latin informatio referred to molding or giving form to matter (Capurro and Hjørland 2003): a potter informed the clay. Anglo-Norman informacione (13th cent.), however, was a criminal investigation by legal officers. Metaphors of information abound in modern biology. Not everyone who uses them is a fool. There must be meaning behind the metaphors, but precisely what that meaning is has been difficult to pin down. Max Delbrück wrote that “unmoved mover” “perfectly describes DNA; it acts, creates form and development, and is not changed in the process” (1971, 55). Biological information, whatever that may be, performs an explanatory role similar to Aristotle’s eidos (Grene 1972).
An evolutionary distinction between information and objects in which information resides has often been made. It appears in contrasts between replicators and vehicles (Dawkins 1976), information and its avatars (Gliddon and Gouyon 1989), codical and material domains (Williams 1992), and my distinction between informational and material genes. In my formulation, material genes are physical objects, but informational genes are the abstract sequences of which material genes are temporary vehicles. I have previously identified material genes with gene tokens and informational genes with gene types, but the latter is not quite right if “type” is interpreted as a material kind. Sense DNA, antisense DNA, RNA, and protein all represent an informational gene but are not molecules of one kind. Continuity resides in the recursive representation of immortal pattern by ephemeral avatars.
Shannon information quantifies the reduction of uncertainty for a receiver observing a message relative to other messages it could have been. The larger the set of possible messages, the greater the reduction in uncertainty. Perhaps a better formulation would be to say that information measures the reduction of uncertainty of an interpreter observing one thing rather than other things it could have been. The interpreter uses the observation to select an interpretation from a set that matches possible interpretations to possible observations. In this formulation, a message (or text) corresponds to the special case of information sent with intent, but an interpreter can also observe the environment or things intended to be hidden.
A human genome contains 3.2 gigabases (Gb) with up to two bits of information per base (a choice from four alternatives). Therefore, a human genome contains 6.4 gigabits of information relative to the set of all possible 3.2 Gb strings. This is the reduction in uncertainty provided by a particular sequence for an interpreter who had no prior knowledge other than the length of the sequence. Every 3.2 Gb string contains the same information but most strings cannot be meaningfully interpreted (Moffatt 2011; Winnie 2000). Only an infinitesimal subset of the library of all 3.2 Gb sequences contains genomes that have ever existed (Dennett 1995). Other measures of Shannon information might compare the sequence to the set of all extant human genomes or to the set of all past genomes. The amount of Shannon information depends on the background knowledge of the receiver.
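The arithmetic, and its dependence on the receiver's reference set, can be sketched as follows. The calculation assumes equally likely alternatives, which is why two bits per base is an upper bound; the candidate-set size of 2^33 is a purely hypothetical figure chosen for round numbers, not an estimate of any real population of genomes.

```python
import math

def max_bits(length, alphabet_size=4):
    """Upper bound on information: log2 of the number of possible strings."""
    return length * math.log2(alphabet_size)

def bits_given_candidates(n_candidates):
    """Information gained by picking out one of n equally likely candidates."""
    return math.log2(n_candidates)

genome_length = 3_200_000_000            # 3.2 Gb
total_bits = max_bits(genome_length)     # 6.4 gigabits against all 3.2 Gb strings

# Hypothetical smaller reference set: the receiver already knows that the
# sequence is one of 2**33 candidate genomes.
narrowed_bits = bits_given_candidates(2**33)
```

Against the library of all possible strings the sequence carries 6.4 gigabits; against the hypothetical set of known candidates the same sequence carries only about 33 bits. The sequence has not changed, only the background knowledge of the receiver.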
Information and meaning are distinct. A DNA sequence contains information that acquires meaning when the sequence is interrogated for answers to particular questions. One might use it to determine the amino acid sequence of an otherwise unknown protein or to search for the cause of genetic disease in a patient. Genomes contain clues about evolutionary history if we can only read the hints. If an individual carries the Benin sickle-cell S haplotype, then we can infer that he or she had recent ancestors who lived in West Africa and survived malaria. Other inferences can be made by comparing sequences. We compare DNA documents to reconstruct phylogenetic trees, to date times of divergence, to infer ancestral population size, or to locate regions of positive selection.
Information has meaning for an interpreter when it is used to achieve an end. The proximate end of the interpretative process is an interpretation of the information. Interpretation of one thing as another differs from simple change of one thing into another, because interpretation has an intended end. An interpretation is intended for use, but an uninterpreted change simply occurs. This account of meaning can be viewed as parallel to C. S. Peirce’s (1877) account of belief. His trinity of belief, desire, and action—“our beliefs guide our desires and shape our actions” (5)—can be loosely translated as my triad of meaning, end, and interpretation. For Peirce, beliefs were habits of mind that guided action: “Belief does not make us act at once, but puts us into such a condition that we shall behave in some certain way, when the occasion arises” (6). In other words, beliefs were latent information whose meaning was expressed in conditional action to achieve a motivated end.
Meaning resides in the interpretation, not in the information, because the same information can mean different things to different interpreters. A sender may intend a particular interpretation and have constructed a message accordingly, but the recipient determines how the message is interpreted. Subsequent interpreters of a message may obtain more, or less, information than was intended by the sender.
Meaning is extracted from a DNA sequence, represented in the output of an automatic sequencer, when a technician reads T rather than A and infers that a fetus will express hemoglobin S. The technician’s end is clinical diagnosis. Meaning is extracted from the same DNA sequence, represented as an RNA message, when a ribosome incorporates valine rather than glutamate into a β-globin chain. The ribosome’s end is protein synthesis. Selectively neutral single-nucleotide polymorphisms have meaning for a geneticist who uses them to isolate a disease-causing gene but no meaning for the organisms from which they come. No meaning is extracted when DNA is eaten by a bacterium. The use of something as an object (throwing a stone), rather than as a representation (reading a stone tablet), does not count as use of information.
A pause is in order. A thing contains information when it differs from something else it could have been. Two things contain mutual information if an observer can learn something about one by observing the other. This is a symmetric relation. An effect represents its cause when observation of the effect allows inference of the cause. This is an asymmetric relation: $X_i$ represents $Y_i$ to the extent that $Y_i$ is causally responsible for their mutual information. A thing has meaning for an interpreter when its “difference from something else” is used by the interpreter to achieve an end. An interpretation is a representation of the information used by the interpreter.
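The symmetry of mutual information can be verified numerically. The sketch below computes mutual information in bits from a joint distribution; the probabilities and the labels c0, c1, e0, e1 are invented for illustration, and nothing in the calculation records which variable is the cause.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits from a dict mapping (x, y) to a joint probability."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Hypothetical joint distribution of a cause (c0/c1) and an effect (e0/e1).
cause_effect = {("c0", "e0"): 0.5, ("c0", "e1"): 0.1,
                ("c1", "e0"): 0.1, ("c1", "e1"): 0.3}
effect_cause = {(e, c): p for (c, e), p in cause_effect.items()}

forward = mutual_information(cause_effect)
backward = mutual_information(effect_cause)

# Independent variables share no information.
independent = {(x, y): 0.25 for x in ("c0", "c1") for y in ("e0", "e1")}
none_shared = mutual_information(independent)
```

Swapping the roles of the two variables leaves the quantity unchanged, so the asymmetry of representation has to come from causal knowledge that the joint distribution does not contain.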
An interpretation can be a text interpreted by another interpreter. Interpretation is recursive when interpretations return to prior forms. X and Y, considered as types, reciprocally represent each other if the token $X_i$ represents $Y_i$ represents $X_{i-1}$ represents $Y_{i-1}$. Replication is reliable, high-fidelity recursion of interpretation. (The game of “Telephone” shows what happens when representation is unreliable.) The text of a replicator is an interpretation of itself.
Living things are replete with reliable reciprocal representation. Each strand of the double helix represents the other. A messenger RNA (mRNA) represents the DNA from which it is transcribed, and the DNA represents the mRNA. A protein represents the mRNA from which it is translated, and the mRNA represents the protein. DNA represents protein, and protein represents DNA. Extended phenotypes represent genotypes, and genotypes represent extended phenotypes (Dawkins 1982; Laland et al. 2013a). All represent what has worked in past environments. Natural selection creates complex causal dependence between past environments and processes within cells.
Life is made meaningful by a multitude of mindless interpreters reinterpreting the molecular metaphors of other mindless interpreters. RNA polymerases transcribe DNA as RNA. tRNAs interpret codons as places to deposit amino acids. Ribosomes translate RNA prose into protein poetry. Higher-level interpreters depend on the activity of myriads of lower-level interpreters. Islet cells integrate blood glucose and other inputs to regulate insulin. Fat cells, muscle cells, and liver cells interpret insulin for diverse ends. Neurons respond to signals from muscles and muscles to signals from neurons. Brains comprehend social relations. You read this sentence. Organisms are self-constructed interpreters of genetic texts in environmental context.
The environment chooses phenotypes and thereby chooses genes that represent its choices and embody information about the environment’s criteria of choice. Observation of these choices would reduce the uncertainty of an omniscient observer about which genes will be transmitted to future generations. The choices of the environment are unintended, but actions that are repeated because of their effects are thereby intended. The choices of the environment are not themselves messages, but genes that represent these choices are copied and passed on as messages from one generation to the next (Bergstrom and Rosvall 2011). Organisms and their lower-level parts are senders and interpreters of these texts.
A difference is a very peculiar and obscure concept. It is certainly not a thing or an event.
—Gregory Bateson (1972)
A soldier fires at Marius but Éponine blocks the shot with her body, saving Marius’s life. The soldier’s choice, the difference between firing or not firing, makes no difference as to whether Marius survives but does make a difference as to whether Éponine survives. Éponine’s choice, the difference between lunging forward or holding back, makes the difference between Marius’s death or survival. The soldier’s shot is responsible for Éponine’s death and Éponine’s death is responsible for Marius’s survival, but the soldier’s shot is not responsible for Marius’s survival. Responsibility is not transitive.
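The failure of transitivity can be checked by flipping one choice at a time in a toy model of the scene. The model is my own simplification, assuming only two binary choices: Marius dies only if the soldier fires and Éponine holds back, and Éponine dies only if the soldier fires and she lunges.

```python
def outcome(soldier_fires, eponine_lunges):
    """Toy model of the scene: who survives under each pair of choices."""
    marius_survives = (not soldier_fires) or eponine_lunges
    eponine_survives = not (soldier_fires and eponine_lunges)
    return {"Marius": marius_survives, "Éponine": eponine_survives}

def makes_difference(choice, person, actual):
    """Does flipping one choice, with the other held fixed, change this person's fate?"""
    flipped = dict(actual, **{choice: not actual[choice]})
    return outcome(**actual)[person] != outcome(**flipped)[person]

actual = {"soldier_fires": True, "eponine_lunges": True}

shot_matters_to_marius = makes_difference("soldier_fires", "Marius", actual)
shot_matters_to_eponine = makes_difference("soldier_fires", "Éponine", actual)
lunge_matters_to_marius = makes_difference("eponine_lunges", "Marius", actual)
```

In this model the shot makes a difference to Éponine and the lunge makes a difference to Marius, but the shot makes no difference to Marius. Each answer depends on the alternative with which the actual choice is compared.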
Things or events do not make a difference; differences between things or events make a difference. One cannot decide whether something is responsible for an outcome without answering the question, compared to what? A choice is an act that could have been otherwise and may make a difference.
A physician gives morphine to a patient dying of cancer. The difference between a fatal and nonfatal dose does not make a difference between the patient dying or not dying, but does make a difference between the patient dying a painful or nonpainful death. If I tell you the dose of morphine I do not provide any information about whether the patient lives or dies but provide information about the nature of the death. If the patient does not die from an overdose, then the patient dies from cancer. In the philosophical literature, this is known as causal preemption (Hitchcock 2007).
There is a close connection between concepts of information and causation. Gregory Bateson (1972) defined the unit of information as a “difference which makes a difference,” but his phrase could also be used as a definition of causation with the first difference as cause and the second as effect. (William Bateson, who coined the word “genetics,” named his third son after Gregor Mendel.) In the words of Ronald Fisher, “To the common sense of mankind it is the property of a cause, qua cause, that it might have been different and have had different effects” (1934, 106). Observation of either difference contains information about the other. This information is potentially about the relation between cause and effect, but use of the information requires an interpreter that has either been designed or evolved for that end.
Consider again the nine-nucleotide segment of gag antisense DNA. When 5′–CGCACCCAT–3′ is transcribed as 5′–AUGGGUGCG–3′ by an RNA polymerase, every DNA nucleotide makes a difference in the resulting RNA. RNA polymerases are instructed by DNA sequences in which every nucleotide conveys actionable information: A means “choose U,” C means “choose G,” G means “choose C,” and T means “choose A.” Once transcription is initiated, and until it terminates, RNA polymerase interprets every A, C, G, or T as U, G, C, or A regardless of the context of surrounding nucleotides. Each and every change in the DNA nucleotide sequence would cause a change in the RNA sequence (given a properly functioning RNA polymerase).
Ribosomes translate 5′–AUGGGUGCG–3′ as methionine–glycine–alanine. They are more sophisticated interpreters than RNA polymerases because the meaning of bases for ribosomes is determined by context. The AUG triplet communicates crucial information. It is the symbol “start here with methionine” that initiates most polypeptides and sets the reading frame for translation of the rest of the message in triplets. AUG in the body of an mRNA (when in the correct reading frame) simply means “choose methionine.” The two meanings are distinguished by context.
G appears five times in the sequence of nine bases. The G in AUG is essential for meaning “choose methionine” because any other base in that position would result in a different amino acid added to the polypeptide. The two Gs in GGU taken together mean “choose glycine” because GGC, GGA and GGG are also interpreted as glycine by the ribosome. The first G in GCG means “choose alanine” in the context of C in the second position, because any other base in the first position would be interpreted as a different amino acid; the G in the third position does not make a difference and could be replaced by any other base and still be interpreted as alanine; but a deletion of the third base (a difference between nobase and somebase) would shift the reading frame and change the interpretation of the rest of the message.
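Which bases make a difference for the ribosome can be enumerated mechanically. The sketch below assumes the standard genetic code, written as the usual 64-letter string in UCAG order, and tries all 27 single-base substitutions of the nine-base message. It does not model deletions or frameshifts, and the function names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order (one-letter amino acids, * = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Read complete triplets from the first base; ignore trailing bases."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))

def substitutions(rna):
    """Yield (position, new_base, changes_peptide) for each single-base change."""
    reference = translate(rna)
    for pos, old in enumerate(rna):
        for new in BASES:
            if new != old:
                mutant = rna[:pos] + new + rna[pos + 1:]
                yield pos, new, translate(mutant) != reference

message = "AUGGGUGCG"
peptide = translate(message)             # "MGA": methionine, glycine, alanine
results = list(substitutions(message))
silent_positions = sorted({pos for pos, _, changed in results if not changed})
```

With positions counted from zero, the only silent substitutions fall at positions 5 and 8, the third bases of GGU and GCG. Every substitution in AUG, and every substitution in the first two bases of the other codons, changes the peptide.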
RNA polymerases and ribosomes choose from ensembles. When an RNA polymerase transcribes G, it picks out a C from a cytoplasmic mixture of U, C, A, and G. When a ribosome translates AUG, it selects a tRNA charged with methionine from a mixture of tRNAs charged with all twenty amino acids. Methionine is the bon mot the ribosome seeks to capture the meaning of AUG. AUG is present in this position in the RNA message because it has competed, and will compete, with alternatives such as ACG or UUG that have different denotations for the ribosome and different connotations for the organism. Natural selection among variant texts chooses those that are useful and discards the dross. By this means, the macrolevel of ecology and social interactions informs the microlevel of molecules.
Some changes to an RNA message change the amino acid added to the growing polypeptide—these are differences that make a difference in the translated protein—whereas other changes are synonymous and make no difference in translation. The choice of a particular amino acid at a particular location in a protein may have no effect on protein function, in which case different codons are meaningful for the ribosome but meaningless for the organism. For such a “neutral” substitution, the difference in the mRNA, and in the DNA from which the message was transcribed, causes a difference in the protein but does not cause a difference in fitness. The choice of amino acid by the ribosome is purposive, but the choice of nature is random.
A choice is a difference that makes a difference. It is a branch point at which a traveler could have taken another path but, once a path is chosen, the chosen path informs an observer of the traveler’s choice. Information about what befalls on a path would be useful in making a choice if the traveler ever came that way again. If travelers copy their choices for later reference, and death awaits on one path but safety on another, then choices of the wrong path never return to the fork in the road, but choices of the right path return to make the same “wise” choice again. In a perilous maze, the records of surviving travelers provide a safe guide for finding a way.
Choices are degrees of freedom. The meanings of information are the choices it guides. Information is useful if, and only if, it changes the future for the better. By tortuous paths, we have come to view choice as synonymous with cause and information as a potential guide of choice. Given a textual record of recurring choices, Darwin’s demon (Pittendrigh 1961) culls the bad choices and retains the good. Well-informed choice is purposive difference-making.
It follows that there are several causes of the same thing. . . . And things can be causes of one another, e.g. exercise of good condition, and the latter of exercise; not, however, in the same way, but the one as end and the other as source of movement.
—Aristotle, Metaphysics
Teleological language in biology appears in a heterogeneous class of explanations united by the loose property that a thing’s existence is explained by an effect that the thing makes possible. A beaver grows sharp incisors to cut down trees to build a lodge to provide shelter from the storm. Dental development has the goal of sharp incisors with the function of cutting down trees for the sake of building a lodge for the purpose of shelter, all for the good of the beaver. “In order to gain access to buried stretches of DNA inside nucleosomes, a chromatin remodeling ATPase is required to unwrap the nucleosomal DNA” (Mellor 2005, 147) is no less teleological than “the hairs about the eye-lids are for the safeguard of the sight” (Bacon 1605/1885, 120).
A final cause explains something by its effects. The thing exists for the sake of an end. In the absence of conscious intent, such explanations have been rejected because explanandum precedes explanans. However, this argument loses force for products of natural selection because endsᵢ can be causes of meansᵢ₊₁ without backward causation. A thing exists today because similar things in the past had effects that enhanced survival and reproduction. The thing expresses similar effects in the present because its effects are heritable. Therefore the thing considered as a type exists because of its effects.
Ends can be means to other ends. Ayala (1970) distinguished proximate ends, the functions or end-states a feature serves, from the ultimate goal of reproductive success. Most biological research addresses the end-directedness of adaptations to achieve proximate ends without explicit reference to ultimate goals. The proximate ends of the mindless interpreters described in previous sections are interpretations of information from the environment or sent as genetic texts. The purposeful behavior of these interpreters can be explained as the outcome of selective processes that incorporated information about what worked in past environments into the fine structure of information-carrying molecules.
Selection means choosing from a set of alternatives. If there is no alternative, there can be no choice. In Darwin’s metaphor of natural selection, the environment “chooses” via differential survival and reproduction. In my formalism of this process for genetic replicators, the environment chooses among effects of genes and thereby chooses among genes. An effect is a difference a gene makes relative to some alternative. It is not a property of an individual gene but rather is a relation between alternatives. The selected gene is a difference that made a difference. In this formalism, phenotype (synonymous with a gene’s effects) is defined as all things that differ between the alternatives, whereas environment is defined as all things shared by the alternatives. By these definitions, what is a phenotype in one comparison may be environment in a different comparison. Natural selection will tend to convert phenotype into environment because environment is that for which there is no reasonable alternative. Deleterious mutations are unreasonable choices that are eliminated by “negative selection.” They are difference-making alternatives that are eliminated soon after they occur.
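The relational definitions of phenotype and environment can be made concrete with a minimal sketch. The attribute names and values below are hypothetical, invented only for illustration, and the function assumes both alternatives are described by the same attributes. It shows that an attribute counted as phenotype in one comparison can count as environment in another.

```python
def partition(first, second):
    """Split two alternatives into what differs (phenotype) and what is shared (environment)."""
    phenotype = {k: (first[k], second[k]) for k in first if first[k] != second[k]}
    environment = {k: first[k] for k in first if first[k] == second[k]}
    return phenotype, environment


# Hypothetical alternatives, each described by the same made-up attributes.
thick = {"coat": "thick", "diet": "bark", "climate": "cold"}
thin = {"coat": "thin", "diet": "bark", "climate": "cold"}
roots = {"coat": "thick", "diet": "roots", "climate": "cold"}

# Comparison 1: the alternatives differ in coat, so coat is phenotype.
phenotype_1, environment_1 = partition(thick, thin)

# Comparison 2: the alternatives share a coat, so coat is now environment.
phenotype_2, environment_2 = partition(thick, roots)

print(phenotype_1, environment_1)
print(phenotype_2, environment_2)
```

Nothing in the sketch belongs to phenotype or environment in itself; the assignment depends entirely on which pair of alternatives is compared.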
Choices of the environment reduce uncertainty about which genes will leave descendants. The selected genes thereby convey information about these choices to ribosomes and other mindless interpreters in subsequent generations. If the choices of the environment are nonrandom, then the genes embody usable information about the environment’s criteria of choice and guide effective choices of organisms.
A gene is “responsible” for its effects. Changes of allele frequency extract average additive effects on fitness from a matrix of nonadditive interactions (Fisher 1941). Whatever effects of an allele contribute to a positive average effect on fitness can be considered the final causes of the allele’s persistence. A gene’s function can be defined as those of its effects that have contributed positively to its spread and present frequency. All other effects, negative or neutral, are side effects without function. If an effect contributes to a gene’s success—by any route, no matter how devious—then the gene exists for the sake of that end and the end exists for the good of the gene (Haig 2012; Haig and Trivers 1995).
In the struggle for existence in a world of finite resources, one variant’s success comes at the expense of alternatives. The causes of death of individuals without an allele contribute to an allele’s success, just as much as the causes of survival of individuals with the allele. The less-appealing traits of the suitors rejected by my mother in favor of my father comprise part of a complete causal account of how I happen to be writing this essay.
An allele must make a difference in many lives if it is to spread by natural selection, from a single copy arising by mutation in a germ cell to fixation in a population of many individuals. No one event can be singled out as the cause of adaptation, but patterns of events, distributed through space and time, result in adaptive change. Natural selection is not an efficient cause but a statistical summary of many efficient causes.
One must consider not only allelic substitutions (positive selection) but also failures of substitution (negative selection). All adaptations will degrade over time unless mutations that impair the evolved function are weeded out. Each new mutation creates an allelic difference that is subject to selection on the basis of its average effect on fitness. If the mutation is eliminated by a choice of nature, then the difference of phenotypic effect exists for the good of the allele chosen. Many phenotypically interchangeable but genetically distinct loss-of-function mutations can be grouped together into a single allelic difference. In this way, a genetic function, determined by interactions between multiple sites within a coding sequence, can be considered for the good of the evolutionary gene.
Consider the substitution of thymine for adenine in the middle base of the sixth codon of the human β-globin gene. This difference causes a replacement of glutamate by valine at the sixth amino acid position of the β-globin polypeptide. The resulting protein, hemoglobin S, is responsible for sickle-cell disease when homozygous and resistance to malaria when heterozygous. The alternative allele with glutamate at position 6 is known as hemoglobin A. With respect to the allelic difference between A and S, the function of S is containment of malarial infection in a genotypic environment that includes an A allele. A deleterious side effect of S is life-threatening anemia in a genotypic environment that includes another S allele (Haig 2012).
The prior paragraph deliberately confuses gene and protein. Proteins and genes often share the same name (mutual metonymy). Sometimes a gene is named for its protein and sometimes a protein for its gene. In speech, a gene name often collectively denotes gene, mRNA, and protein as avatars of recursive form.
The sickle-cell mutation has been presented as an exemplar of a “selfish nucleotide” and used to dispute the identification of “evolutionary genes” with DNA (Griffiths and Neumann-Held 1999). The reductio ad absurdum fails because evolutionary genes have been defined as stretches of DNA rarely disrupted by recombination (Dawkins 1976; Williams 1966) and sufficiently short to maintain linkage disequilibrium (Haig 2012). Nonrandom associations of variable nucleotides, some of which may be functional, extend for hundreds of kilobases to either side of the “selfish thymine” (Hanchard et al. 2007). As recombination between sites lessens, and as the strength of epistatic selection increases, a point is reached at which different sites can no longer be considered as belonging to different evolutionary genes (Neher, Kessinger, and Shraiman 2013). For sites sufficiently close together, nonadditive interactions on the axis of expression contribute to an additive effect on the axis of transmission (Haig 2011a; Neher and Shraiman 2009).
Any complex organismal adaptation will involve many allelic substitutions at multiple loci. For ancient adaptations, most substitutions will have occurred in the deep past, in organisms and environments very different from those of the present. In the process, some genes may have been transformed beyond recognition. While each substitution could be considered for the good of that gene at that time, the adaptation serves proximate ends today. For what entity are these ends a good? A standard answer is that complex adaptations are for the good of the organism. A gene-selectionist could counter that a complex adaptation is for the good of each and every gene whose loss of function by mutation results in loss of the adaptation (Haig 2012).
The literature written by [Darwin’s] Demon is no more deducible from a complete command of the nucleotide language, let alone physical law, than the works of Shakespeare or Alfred North Whitehead are deducible from a complete command of the English language.
—Colin Pittendrigh (1993)
Chickens can unscramble eggs by eating them and laying another (Gregory 1981, 137). James Clerk Maxwell (1831–1879) imagined a demon that performed work by choosing which molecules to allow through a partition, thereby selecting ordered subsets from a disordered ensemble. Selection can extract work from randomness.
A rocket is a rigid tube, open at one end, that converts the disordered molecular motion of combustion into coherent motion of the tube. Roughly speaking, the closed end of the tube selects molecular momentum orthogonal to its surface and imparts that momentum to the rocket while the open end discards momentum in the opposite direction. The rocket engine is the selective environment that chooses an ordered subset of moving particles from a disordered set as the entropy of the working material increases. A piston selects molecular momentum orthogonal to the one moveable wall of a cylinder and thereby does work while discarding unworkable energy into a heat sink (Atkins 1994, 83). Organisms are elaborate self-assembling engines that acquire or synthesize their own fuel and dump entropic excrement. They are the selective environment by which food is converted to work.
Subset selection is a semantic engine. Consider a set subject to a procedure by which some are “chosen” and others “rejected.” Choice is random if membership of the selected subset is determined by criteria independent of intrinsic properties of things chosen (for example, if no attribute has a periodicity of five but every fifth entity is selected). The disjunction of selected and discarded subsets contains no information about the criteria of choice when choice is random, but the disjunction contains information about the criteria of choice when choice discriminates among members of a set on the basis of one or more of their intrinsic properties (a reasoned choice). The selected and discarded subsets are biased samples of the whole. One might say that one is adapted, and the other maladapted, to the selective environment.
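The contrast between random and reasoned choice can be illustrated with a small sketch. The set of "weights" is a made-up construction, chosen so that no attribute has a periodicity of five, and the difference between the mean weights of the selected and discarded subsets is my own crude stand-in for the information the disjunction carries about the criteria of choice.

```python
from statistics import mean


def bias(items, chosen):
    """Mean of the selected subset minus mean of the discarded subset."""
    selected = [x for i, x in enumerate(items) if chosen(i, x)]
    discarded = [x for i, x in enumerate(items) if not chosen(i, x)]
    return mean(selected) - mean(discarded)


# Hypothetical intrinsic property: 55 weights that cycle with period 11, not 5.
weights = [(7 * i) % 11 for i in range(55)]

# Random choice: every fifth entity, a criterion independent of weight.
random_bias = bias(weights, lambda i, w: i % 5 == 0)

# Reasoned choice: a criterion that discriminates on the intrinsic property.
reasoned_bias = bias(weights, lambda i, w: w >= 6)

print(random_bias)    # 0: the subsets say nothing about weight
print(reasoned_bias)  # 5.5: the subsets are biased samples of the whole
```

Under these assumptions the subsets produced by random choice are indistinguishable with respect to weight, whereas the subsets produced by reasoned choice reveal the criterion by which they were separated.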
Wind winnows wheat from chaff by the criterion of weight to cross-sectional area. A bird picks berries from a bush on the basis of palatability, and the bird’s criteria of choice are reflected in differences between eaten and uneaten berries. A man chooses a wife and we can infer something about his preferences by comparing his spouse to others who were available but passed over. His choice is restricted to members of a comparison set constrained by the comparison sets and preferences of possible partners. You can’t always get what you want.
Natural selection, it has been said, differs from subset selection because “offspring are not subsets of parents but new entities” (Price 1995, 390). But the genes of the next generation are a subset of the genes of the last. Therefore, natural selection can also be inscribed under the rubric of subset selection if focus shifts from vehicles to replicators, from interpretations to texts. Natural subset selection is indirect. The environment selects a subset of phenotypes to be parents and thereby selects a subset of genes to be transmitted.
Selection from a selected subset retains information from past choices, imperfectly. Retention is imperfect because information is dissipated by random culling, by random mutation of past reasoned choices, and by changes in criteria of choice. In the absence of replication, recursive selection reduces the size of the comparison set at each round of choice. Replication creates redundancy and thus increases the probability that information from past choices will be retained despite dissipative forces.
Mutations are random guesses in the neighborhood of previous choices. Mutation degrades semantic information about past choices but adds entropy for future reasoned choice. For the right balance of mutation and selection, recursive selection of mutable replicators results in accretion of semantic information and refinement of fit to criteria of choice.
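Recursive selection of mutable replicators can be sketched as a toy simulation. Everything here is an assumption made for illustration: genomes are bit strings, the environment's criterion of choice is agreement with a fixed target, half the population is culled in each round, and each survivor is copied twice with a small chance of mutation at each site. Reasoned culling is compared with random culling from the same starting population.

```python
import random


def matches(genome, target):
    """Number of sites at which a genome agrees with the criterion of choice."""
    return sum(g == t for g, t in zip(genome, target))


def replicate(genome, rate, rng):
    """Copy a genome, flipping each bit with probability `rate` (mutation)."""
    return [1 - g if rng.random() < rate else g for g in genome]


def evolve(reasoned, generations=40, size=50, length=20, rate=0.02, seed=1):
    """Return mean fit to the target before and after recursive selection."""
    rng = random.Random(seed)
    target = [1] * length
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(size)]
    start = sum(matches(g, target) for g in population) / size
    for _ in range(generations):
        if reasoned:
            # Reasoned choice: keep the half that best fits the criterion.
            population.sort(key=lambda g: matches(g, target), reverse=True)
        else:
            # Random culling: keep an arbitrary half.
            rng.shuffle(population)
        survivors = population[: size // 2]
        # Replication restores population size and creates redundancy.
        population = [replicate(g, rate, rng) for g in survivors for _ in range(2)]
    end = sum(matches(g, target) for g in population) / size
    return start, end


start_reasoned, end_reasoned = evolve(reasoned=True)
start_random, end_random = evolve(reasoned=False)
print(start_reasoned, end_reasoned, end_random)
```

With these parameters one would expect fit to the target to accrete under reasoned choice and merely to drift under random culling. The mutation rate and population size are arbitrary, and a much higher mutation rate would be expected to dissipate what selection retains.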
Why all this silly rigmarole of sex? Why this gavotte of chromosomes? Why all these useless males, this striving and wasteful bloodshed?
—William D. Hamilton (1975)
Clonal reproduction replicates entire genotypes that are judged repeatedly in the court of environmental opinion. Each asexual genotype is a single “evolutionary gene” responsible for its own average effects after repeated retesting. The difference between genotypes that differ at a single site can be attributed to that site, but responsibility cannot be attributed to individual sites when genotypes differ at multiple sites. Segments of particular value must share credit with segments that do not pull their weight and are hidden from blame. All must share in communal praise and collective guilt.
Sexual genotypes, by contrast, are ephemeral. Judgment of each individual genotype is unique and unrepeated, but smaller segments are tested repeatedly against different backgrounds and can be held responsible for their average effects. Sexual genotypes are pastiche, cobbled together from parts of two parental genomes, four grandparental genomes, eight great-grandparental genomes (you get the idea), in a process that mindlessly breaks up effective combinations for the chance of something better. Every one of these genomes has been tested by the environment and passed. The sexual disassembly and reassembly of genotypes allows attribution of responsibility to parts.
Mendel’s demon (Ridley 2000) is a randomizing agent that shuffles the genetic deck and deals out fresh hands in each round. It can be a mischievous imp that impedes the work of Darwin’s demon by breaking up favorable combinations, or a helpful sprite that rescues parts of promise from bad company. As the genome is diced into smaller pieces, the range of effects for which each nonrecombining segment can be held responsible diminishes (Godfrey-Smith 2009, 145; Okasha 2012), but each segment is more readily held responsible for its causal effects. Darwin’s and Mendel’s demons, working together, create teams of champions rather than champion teams.
Experiment . . . is an uncommunicative informant. It never expatiates: it only answers “yes” or “no.” . . . It is the student of natural history to whom nature opens the treasury of her confidence, while she treats the cross examining experimentalist with the reserve he merits.
—C. S. Peirce (1905)
C. S. Peirce (1905) compared an experimental scientist with men whose education had largely been learned from books: “He and they are as oil and water, and though they be shaken up together, it is remarkable how quickly they will go their several mental ways, without having gained more than a faint flavor from the association” (161). His vivid use of metaphor belied his admonition “that no study can become scientific . . . until it provides itself with a suitable technical nomenclature, whose every term has a single definite meaning universally accepted among students of the subject, and whose vocables have no such sweetness or charms as might tempt loose writers to abuse them” (163–164). He contrasted the poverty of the experimentalist’s “meagre jews-harp of experiment” to the richness of the naturalist’s “glorious organ of observation” (175). Despite such a seemingly invidious comparison, the rational purport of belief was to be found solely in answers to repeated experiments and their consequences for future conduct: “If one can define accurately all the conceivable experimental phenomena which the affirmation or denial of a concept could imply, one will have therein a complete definition of the concept, and there is absolutely nothing more in it” (162). Right conduct is choice guided by experience.
Experiments are choices offered to nature for the resolution of doubt. They provide terse, inarticulate answers to narrowly defined questions. These answers are informative when they reduce the experimentalist’s uncertainty about the state of the world. The beliefs they engender have meaning when used to guide conduct. By this means, “thought, controlled by a rational experimental logic, tends to the fixation of certain opinions” that are not arbitrary but predetermined by nature (Peirce 1905, 177).
The experimental method (Peirce’s demon) and natural selection (Darwin’s demon) are resolvers of difference in which choices of nature inform adaptive behavior via the accumulation of useful information. Practice perfects performance by trial and choice. A controlled experiment varies one thing while holding other things constant (ceteris paribus) to determine the differences for which that thing can be held responsible. But experiments must be replicated to average out residual, uncontrolled variation. Sexual recombination achieves a similar statistical control by repeated retesting of allelic differences on different genetic backgrounds. The average effects of allelic differences reduce the complexity of biological interactions to simple binary choices. The success of the experimental method and of sexual organisms suggests that short-sighted choice among recombinable units often outperforms reasoned judgment of integrated wholes.
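The statistical control achieved by retesting a difference on many backgrounds can be shown with a short worked example. The fitness function is hypothetical, with interaction terms chosen arbitrarily, and the backgrounds are assumed to be equally frequent, which simplifies Fisher's average effect to an unweighted mean.

```python
from itertools import product
from statistics import mean


def fitness(a, b, c):
    """Hypothetical fitness with nonadditive interactions among three sites."""
    return 10 + 2 * a + b + 4 * a * b - 4 * a * c


# Effect of the allelic difference at the first site on each genetic background.
effects = {
    (b, c): fitness(1, b, c) - fitness(0, b, c)
    for b, c in product((0, 1), repeat=2)
}

# Average effect, assuming every background is met equally often.
average_effect = mean(effects.values())

print(effects)         # {(0, 0): 2, (0, 1): -2, (1, 0): 6, (1, 1): 2}
print(average_effect)  # 2
```

A single test on one background could report an effect anywhere from -2 to 6, but repeated testing across backgrounds yields an average of 2 for which the difference can be held responsible.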
The histories of causal and legal concepts are closely intertwined. The function of a trial is to determine whether a defendant is responsible for a crime. Many circumstances and opinions are weighed in the balance but the judgment is binary, guilty or not guilty. The earliest known meanings of try are to sift or pick out, to separate one thing from another, especially the good from the bad, and to choose or select. A trial was the determination of a difference, between guilt and innocence, by tribunal, battle, or ordeal. Natural selection is a recursive process of trial and judgment by which good causes are rewarded and relative truths learnt.
A thing exists as a natural end if it is cause and effect of itself.
—Immanuel Kant (1790/2000)
Phenotype interprets genotype in environmental context. Why should genes be singled out as possessors of purposes and as self-interested beneficiaries of adaptation? Genes belong among the material causes of development, and gene expression among the efficient causes of development, but ontogeny proceeds via complex interactions between genes and environment. From the perspective of developmental systems theory, the causal matrix recreates itself, recursively, without a privileged role for genes (Oyama 2000).
Genes interact with each other and the environment to create phenotypes that causally influence which individuals leave descendants. But, when the environment chooses one allele rather than another, the choice is based on the average effect of a difference (Fisher 1941). In Lewontin’s (2000) terminology, the allelic effects are causes of difference but the interactions are causes of state. The prosaic selection of differences is the unwitting author of poetic changes of state.
Gene-selectionism is concerned with how information gets into the genome via natural selection and what can be held responsible for the appearance of purpose in nature. By contrast, developmental systems theory is concerned with understanding ontogenetic mechanisms. One might say that gene-selectionism addresses the writing, and developmental systems theory the reading, of a text. From this perspective, the two frameworks are complementary. Any text of lasting value is read, and judged, repeatedly as it is revised.
Two domains of explanation are in play that have been characterized as a vertical axis of transmission and a horizontal axis of development (Bergstrom and Rosvall 2011). One concerns the inheritance of genetic information between generations and the other the expression of genetic material within generations. Teleological concepts appear in both domains. On the axis of transmission, final causes appear as adaptations that serve the ultimate end of fitness. On the axis of expression, final causes appear as end-states of developmental processes and as the proximate ends of goal-directed behaviors. Explanations in the two domains have different flavors because mapping from gene copy to gene copy in the course of transmission is straightforward whereas mapping from genotype to phenotype in the course of development is devilishly difficult.
The conceptual separation of axes of transmission and development is related to Shea’s (2007) separation of phylogenetic and ontogenetic explanations; to Ayala’s (1970) distinction between ultimate goals and proximate ends; to Weismann’s (1890) separation of germ plasm and cytoplasm; to the difference between DNA replication and RNA transcription; to the divide between text and interpretation; and the contrast between mention and use of a lexical item. Kant (1790/2000, 243) can be interpreted as making a related distinction when he describes the twofold sense in which a tree is both cause and effect of itself. A tree generates itself both as a species/genus (transmission) and as an individual (development).
Whether conceptual separation of developmental from evolutionary questions is productive or counterproductive is a subject of present polemics. Some maintain the distinction is indispensable (Griffiths 2013), whereas others see it as an impediment to understanding (Laland et al. 2013a). Most of those who support the distinction are comfortable with invoking functions as causes, whereas many of those who want to do away with it are explicit that “functions are not causes . . . the outcome of a behavior cannot determine its occurrence” (Laland et al. 2013b).
Our penchant for dichotomies, distinctions, and oppositions reflects the power of reducing complex questions to binary choices. Many arguments within the philosophy of biology, and between the sciences and humanities, reflect a tension between the reductive simplicity of average effects and the richness of interaction; between the meager trump of attributing credit to parts and the glorious Wurlitzer of integration of wholes. But we have more than two options; one can play a duet. (Perhaps I should explain one of my more obscure word choices. A trump is an old name for a jew’s harp. My use of trump alluded to Peirce’s [1905] juxtaposition of the “meagre jews-harp of experiment” and the “glorious organ of observation.” I had no foresight of the outcome of the 2016 presidential election.)
Are God and Nature then at strife,
That Nature lends such evil dreams?
So careful of the type she seems,
So careless of the single life.
—Alfred, Lord Tennyson (1849)
Genomes resemble historical documents (Pittendrigh 1993; Williams 1992, 6). Thymine rather than adenine, or valine rather than glutamate, has no meaning out of context; but a nucleotide sequence of β-globin with thymine at position 17, or an amino acid sequence of β-globin with valine at position 6, both have meaning in context, although neither says anything explicit about malaria. Genomes are allusive archives of choice, with unstated meanings without explicit expression or discrete location. They are palimpsests on which new text is written over partially erased older text (Haig and Henikoff 2004). Not all of the text is readable. It contains gobbledygook and epigenetic annotations that proscribe what should not be read. Genomic censors strive to shut down the clandestine presses of retrotransposons.
Where does meaning reside in a text? This chapter evolved via incremental rewording and extensive rewriting. There was a struggle for existence among ideas for space on the page. There is a lot more I could have said. My meaning resides in the difference between what is said and unsaid. Often a change in one part necessitated changes in other parts to maintain consistency. This chapter self-consciously reflects back upon itself with repetition, recurrence, reciprocal reference, and allusive alliteration. Part of its meta-meaning is that many meanings are distributed throughout the text, never fully explicit, to reflect and suggest the organization of meanings within the genome. There is no meaning in a letter, a little in a word, a bit more in a sentence, but much of the intended meaning is implicit, to be understood from the synergistic whole rather than the additive parts. And yet, the text was written letter by letter and word by word by additive increments. On the axis of reading, new meanings can be found, but on the axis of transmission it is only that which is written that counts.
Meaning resides in the interpretation. There are meanings I intend you to find and meanings you find. I wrote to persuade. But you may use my prose to persuade others that I am mistaken. You interpret my text as you will. Imprecision of language allows charity of interpretation and slaying of straw men. Falsehood can arise from misinformation by an author or from misinterpretation by a reader.
The question of what genes mean, if what genes do depends on interactions with other genes in environmental context, resembles the question of what words mean when all definitions are expressed in other words in semantic context. Modern philosophers confront the “indeterminacy of translation” when attempting to understand what aition meant to Aristotle and “indeterminacy of interpretation” when attempting to understand, or deliberately misunderstand, each other’s arguments. Modern biologists confront similar indeterminacy in the semantic content of genetic material. Critics of “information talk” in biology often demand a more rigorous justification of meaning in DNA than they could provide for meaning in language.
An idea is the semantic equivalent of a nonrecombining segment of DNA. It is a chunk of meaningful stuff that is transmitted as a parcel. It is a semantic difference that makes a difference. Ideas and “pithy quotations” are readily reusable because they are meaningful when taken out of context. Science proceeds via recombination of ideas, whereas great works of literature are clonally replicated and interpreted as wholes. In the scientific literature, “smallest publishable units” have replaced magisterial tomes in part because shorter texts are more likely to be used and cited. Working biologists mostly read On the Origin of Species for virtue or pleasure, because the good bits have been reused again and again, in new associations, in a sesquicentury of scientific endeavor.
There are parallels between the ascription of effects to genes and the assignment of credit to authors. Scientists cite each other more than philosophers, and novelists hardly at all. Citations not only provide pointers to additional information but also ascribe credit. All new insights originate in the context of many acknowledged and unacknowledged precursors, but credit is easier to attribute, or harder to deny, for portable ideas than for rearrangements in the tangled web of meanings. Tristram Shandy contains philosophical insights but is rarely cited by philosophers because discrete ideas are difficult to disentangle from its interwoven fabric.
Scientists care about citation because they want their name to ride the coattails of successful ideas to feed back for their good. But to be worthy of credit one must be unambiguous. Otherwise one could claim credit for interpretations that prove prescient but shift blame for interpretations that fail. A scientist is expected to commit to one interpretation, but a novelist often leaves a choice for the reader. Indeterminacy of interpretation is a designed feature of novels but a flaw in experimental notebooks and scientific papers.
In an indeterministic world natural causation has a creative element, and science is interested in locating the original causes of effects of special interest, and not merely in pushing a chain of causation backwards ad infinitum.
—Ronald Fisher (1934)
Consider the fates of zygotes, scions of countless spermatic races to ova. Their lives unfold via interactions among genes, and between genes and environment. Many fall by the wayside, by chance or necessity, and those that reach maturity produce progeny, some a hundredfold, some sixtyfold, some thirtyfold. Sometimes an allelic difference causes one to leave more issue than another. And, lo and behold, the genes of the progeny, and of the progeny’s progeny, even unto the third and fourth generation, are a biased sample of the genes of their progenitors. The tale is repeated, with minor variations and mutations, time without end, and verily there is something new under the sun.
This evolutionary parable could be elaborated endlessly with causal explanations of ever-finer detail and ever-deeper regression into the past. There is a causal story behind each and every mutation, each and every chiasma, each and every choice of a mating partner, each and every union of gametes, each and every catastrophe that did not happen. But this story is untellable because of incomplete information, chaotic dynamics, and computational complexity. And if it could be told, the story would be incomprehensible. One must simplify to tell a tale, giving greater salience to some items and leaving loose ends.
A pedant could argue that pressure is not an efficient cause and should be expunged from physical explanations—only individual molecular impacts are truly causal—but his argument would be dismissed as obfuscation. For questions at the appropriate scale, pressure provides a perfectly adequate explanation, indeed one that is superior to the unattainable account that describes each and every molecular collision. Darwinian final causes are similarly grounded in efficient causes and are perfectly adequate, indeed indispensable, for certain kinds of biological explanation. A “selection pressure” summarizes many reproductive outcomes just as the pressure of a gas summarizes many molecular motions. Darwinism, like thermodynamics, is a statistical theory that does not keep track of every detail (Fisher 1934; Peirce 1877).
Much recent semantic work has been undertaken on concepts of Darwinian information (Adami 2002; Adami, Ofria, and Collier 2000; Colgate and Ziock 2011; Frank 2009, 2012). The various expositions exhibit phenotypic resemblance, both from shared ancestry and convergence in a common selective environment, although conceptual differences remain. Rather than choose among the differences, I will synthesize a subset of select conclusions. Semantic information comes from the environment via subset selection and refers to that environment. It is functional, looking backward to what has worked in the past and forward as a prediction of what will work in the future. Replication is essential for the indefinite persistence of information in the face of dissipative entropic forces.
The word “cause” is so inextricably bound up with misleading associations as to make its complete extrusion from the philosophical vocabulary desirable.
—Bertrand Russell (1913)
My intent in partial rehabilitation of formal and final causes is not to argue that the four causes provide the best causal taxonomy for current ends, but to recognize that Aristotle’s classification was found useful for more than a millennium and must surely have approximated significant categories of understanding. Moreover, if formal and final causes do not exist in their “bad” metaphysical senses, then the terms and the concepts are available for use in their “good” post-Darwinian senses of inherited information and adaptive function.
This chapter concerns the seduction of narrative, the magic of metaphor, and the rhythm of recursion (Hofstadter 1979). Meaning is expressed through metaphor by representing one thing by another. Recursive representation allows eidos and telos to be grounded in hyle and kinesis. Choice captures information. The environment, personified as natural selection, chooses ends and thereby chooses means with meanings, because the ends of the past are the means of the present. Meaning requires an interpreter and an end. Darwin’s demon supplies both. My text returns repeatedly to etymologies and histories of ideas because logos and eidos evolve by paths parallel to genes, providing fruitful metaphors and philosophical perspective.
Natural selection is both a metaphor and a metaphorical process of recursive representation. It is a meaningless, purposeless, physical algorithm that produces things for which meaning and purpose are useful explanatory concepts (Dennett 1995). Among the products of natural selection are rational agents, with beliefs and desires, pursuing conscious goals, exchanging truthful and deceptive information, who can delight in a meaningful life.
L—d! said my mother, what is all this story about?—A COCK and a BULL said Yorick—And one of the best of its kind I ever heard. (Sterne 1767, finis)
1 This is my footnote to Plato.