Evolution and Computing
Robert T. Pennock
OUTLINE
1. Unexpected links and shared principles
2. How evolutionary biology joined forces with computer science
3. How evolutionary computation is helping evolutionary biology
4. Evolutionary computation takes off
5. The future of evolution and computing
Shared principles between evolution and computing are opening up fruitful areas for research. This chapter discusses some unexpected connections between evolutionary biology and computer science, such as the core ideas of code, information, and function, and how these are leading to theoretical and practical ways in which each is benefiting the other. The chapter highlights the emerging field of evolutionary computation, giving a brief history and some examples of its utility not only in helping solve basic research problems in biology and computer science but also for generating novel designs in engineering.
GLOSSARY
Digital Evolution. The evolution of digital organisms in a system that instantiates the causal processes of the evolutionary mechanism through random variation, inheritance, and natural selection.
Digital Organism. A model organism, typically with a genome composed of simple instructions, in a computer environment.
Evolutionary Computation. The general term for research and procedures in computer science that take inspiration and utilize insights from evolutionary biology.
Evolutionary Engineering. Use of evolutionary computation approaches for solving design problems in applied engineering contexts, including robotics.
Experimental Evolution. Investigation of evolutionary processes by direct experimental methods, including replications and controls, rather than by indirect comparative methods.
Genetic Algorithm. One form of evolutionary computation; pioneered by John Holland.
What does evolution have to do with computing? What does computing have to do with evolution? At first glance, these fields almost seem to be opposites. On the one hand, evolutionary biology deals with the lush and tangled extravagance that is the living world. Living organisms grow, reproduce, and proliferate in abundant variety and complexity. It was Charles Darwin’s genius that began to unravel this complexity and discovered some of the fundamental principles that produce new species and their astounding adaptations. In the century and a half since the publication of On the Origin of Species, evolutionary science has become a powerful explanatory framework that illuminates the entire organic world.
Computer science, on the other hand, deals not with organisms but with machines. Machines may get bigger, and computing machines have gotten more powerful, but they don’t grow—they are built. The artificiality of computers stands in stark contrast to the naturalness of organisms. Computers are complex, but in quite a different way than living things. One would never confuse the specific patterns of complexity that characterize computing machines designed and built by human beings with the patterns that we find in evolved, living organisms. It is differences of this sort between the natural biological world and the technological world of objects designed and built by human beings that initially made it questionable whether it was even sensible to think that there could be, in Herbert Simon’s term, a “science of the artificial” (Simon 1969).
This chapter discusses some of the ways in which evolutionary biologists and computer scientists are discovering, to their mutual benefit, that their fields actually have many concepts in common and that there are significant ways in which they may be united through deep, shared principles. After reviewing some of these principles, we will briefly look at how computer science came to recognize the applicability of evolution to computer science and began to figure out how to incorporate Darwin’s findings into its own algorithmic way of thinking. We will see how the mechanism of evolution by natural selection that Darwin discovered can now be not just simulated but actually causally instantiated in a computer, and how this opens the door to surprising new ways for biologists to experimentally investigate evolutionary processes. And finally, we will look at how this new field of evolutionary computation can be applied in practical ways, such as in solving difficult design problems in engineering. As we shall see, evolutionary design has reached the point at which it can equal and sometimes surpass our own problem-solving abilities.
1. UNEXPECTED LINKS AND SHARED PRINCIPLES
The obvious contrasts between living organisms and computing machines hide significant points of commonality between the two fields of study because they focus on products rather than on processes. Once we begin to compare biology and computing in terms of processes, we find significant and fundamental linkages that were previously overlooked.
One conceptual commonality is the idea that both fields deal at the deepest level with the idea of coded functions. On the computational side, even laypersons understand that computer code runs everything from their notebook or tablet computer to the largest mainframe. It is the coded instructions of the software loaded in one’s machine that make it function. On the biological side, everyone knows that organisms similarly depend on their genetic code for their functions. A mistake in the coded program of life can make an organism unable to perform some function as surely as a mistake in a software program can cause a function error. With little exaggeration, one may say that natural organisms are biological machines that run on genetic software that codes for the myriad, complex features that make them work in their native environments. Or that make them fail to work. A severe mistake in the genetic code of an organism can cause it to die, just as a serious coding mistake in some application you are running can bring up the dreaded blue screen of death.
A second, closely related, conceptual commonality between both fields is the idea of information. Here, too, even the language of the two disciplines resonates with deeply shared notions about the significance of information and its flow. For instance, rather than speaking narrowly of computer science, it is becoming more common for many computer scientists to identify their work as information science. Again, the term computer science makes it seem as though their subject matter is computers, whereas they take their real subject matter to be computing, which they see as the most basic form of information processing. An information-theoretical approach has not yet been developed nearly as far on the biological side, but here, too, the language of the discipline rings with this idea. An organism’s genome is said to code for “biological information,” while RNA and DNA are spoken of as “informational molecules.”
These and other commonalities have long hovered in the background of scientific investigations in both the biological and computing communities, but as the deep conceptual connections are becoming more appreciated, they are coming to the foreground and being recognized as providing an opportunity for cutting-edge research.
To give just one example, in 2011 the National Science Foundation (NSF) published a letter to researchers calling attention to what it called Biological and Computing Shared Principles (BCSP). Issued jointly by NSF’s Biological Sciences (BIO) and Computer & Information Science & Engineering (CISE) directorates, the BCSP letter highlighted a revolutionary transition occurring in the relationship between the fields. There have always been points of mutual influence between biological and computing research, but these are no longer limited to applications of one discipline to the other; the letter highlights “the convergence of central ideas and problems requiring the theoretical, experimental, and methodological competencies of both biology and computing” (US NSF 2011). The reason for the excitement is that shared principles between biology and computing may contribute to conceptual advances for both fields.
The BCSP letter identifies a variety of novel areas that are ripe for the identification and investigation of shared principles. Many of these involve specific properties of common interest such as adaptation to unanticipated novel conditions; self-repair and maintenance; coevolution and defense against adaptive adversaries; and general robustness and reliability. Other topics are more abstract, such as knowledge extraction; information flow, processing, and analysis; representations and coding; pattern recognition and pattern generation; network structure, function and dynamics; functions of stochasticity; and theory of biological computation. Although the BCSP letter speaks broadly about computing and “biology,” in fact, many of these properties are connected conceptually to evolutionary biology.
2. HOW EVOLUTIONARY BIOLOGY JOINED FORCES WITH COMPUTER SCIENCE
Computers are as much a key instrument in biology today as the microscope was in the nineteenth century. They no longer serve just as fancy calculators that make statistical analysis of lab and field data go faster; their flexibility and power as universal machines now allow them to also serve a fundamental role in the production of data. One important new role is to allow sophisticated simulations of complex biological entities and processes. To give just one example, recent work by Donahue and Ascoli used computers to model the morphology of neurons and the developmental processes that lead to the elongation, branching, and taper of dendrites. Starting with parameters measured from real cells, modelers can create statistical distributions that can be resampled to form virtual trees and even to simulate somatic repulsive forces thought to be responsible for shaping cells. Such simulations can reveal patterns that may point to important developmental principles. For studying evolution in particular, an even more important advance is that computers now allow scientists to model evolutionary processes directly.
The insight that led to this revolutionary approach was made independently by several researchers (most in the 1960s), who recognized that the mechanism of evolution that Darwin discovered could be instantiated not just in biological systems but in other physical systems as well, including in computers. Probably the most influential of these was University of Michigan computer scientist John Holland, who coined the term genetic algorithm for the idea and who implemented it at the level of binary strings of 0s and 1s that could recombine, mutate in a computer, and be subject to selection. In Germany, aeronautical engineer Ingo Rechenberg had a similar idea and developed it with Hans-Paul Schwefel under the name evolutionary strategies. A third line of research, dubbed evolutionary programming, was begun by electrical engineer Lawrence Fogel. For their pioneering work, Holland, Rechenberg, and Fogel are credited as founders of what now goes by the general term evolutionary computation. This is not the place to recount the history of these and other early pioneers, but it is worth mentioning that these initial research streams proceeded separately for over a decade and a half before they discovered one another and began to interact. Today it is recognized that these and other evolutionary computation approaches share the same underlying core principles (De Jong 2006), and a community of researchers has formed around these ideas, spawning a variety of professional societies, conferences (many of which eventually joined together as GECCO, the Genetic and Evolutionary Computation Conference), and journals. Evolutionary Computation, the main journal in the field, was introduced in 1993.
This short history returns us to the idea of shared principles between evolution and computing. When evolution is seen as a special sort of algorithmic process, then it becomes possible for a computer to become the evolutionary biologist’s lab bench. Properly understood, digital evolution can do more than simulate evolutionary processes; it can instantiate them (Pennock 2007). To see this, we need only review the basic elements of the causal principle that Darwin discovered.
Descent with modification, as Darwin defined evolution, occurs whenever three conditions hold. The first is the random production of variations—a diversity of structure, constitution, habits, and so on. The second is that these variations be heritable, meaning that they can be passed on in the process of reproduction to the next generation. Darwin called this the “principle of inheritance.” The method of inheritance is not so important as the basic causal principle of heritability itself—the key is that the genetic information be copied to the offspring, not the specific mechanism by which that is done. The third condition is that these heritable variations be naturally selected by the environment. If the genome of an individual happens to provide it with some slight variation that gives it any advantage over its competitors in their environment, that individual becomes more likely to survive to the point that it can reproduce, which causes the next generation to have a greater proportion of individuals with its heritable variations than those of its competitors. It is the environment, understood broadly, that naturally selects from among the extant variations, generation after generation. All these causal processes—random variation, heritability, and natural selection—can be instantiated in a computer. (See plate 7.)
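The three conditions just described can be sketched in a few lines of Python. This is a minimal illustration in the spirit of a genetic algorithm on binary strings, not a reconstruction of any particular system discussed in this chapter. The names `fitness`, `reproduce`, `select_parent`, and `evolve` are hypothetical, the "environment" that simply favors genomes with more 1s is an assumption chosen for brevity, and tournament selection is one of several possible selection schemes.

```python
import random

def fitness(genome):
    """Assumed toy environment: genomes with more 1s do better."""
    return sum(genome)

def reproduce(parent, mutation_rate, rng):
    """Heritability with random variation: copy the parent bit by bit,
    flipping each bit with a small probability."""
    return [bit ^ 1 if rng.random() < mutation_rate else bit for bit in parent]

def select_parent(population, rng, tournament_size=3):
    """Selection: the fittest of a few randomly drawn competitors reproduces."""
    competitors = rng.sample(population, tournament_size)
    return max(competitors, key=fitness)

def evolve(pop_size=50, genome_len=20, generations=100,
           mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    # Start from random genomes so the initial population is already varied.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    history = []  # mean fitness after each generation
    for _ in range(generations):
        population = [reproduce(select_parent(population, rng), mutation_rate, rng)
                      for _ in range(pop_size)]
        history.append(sum(fitness(g) for g in population) / pop_size)
    return population, history

if __name__ == "__main__":
    _, history = evolve()
    print("mean fitness, first and last generation:", history[0], history[-1])
```

Nothing in the loop tells the population where to go; the rise in mean fitness comes only from copying with occasional errors and differential reproduction. Recombination, which Holland's genetic algorithms also use, is omitted here to keep the sketch short.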
Moreover, once one comes to see evolution as a universal causal law—one whose action is not limited to the familiar realm of DNA and flesh—then it becomes easier to see the possibility of sharing other biological concepts with computer science. To give just one example, the concept of a genome is not limited to the chromosomes of an organism; it also refers more generally to the complete information-transmitting material of any replicating entity, whether a biological organism or a self-replicating computer program. Indeed, chromosomes are best conceived of as but one possible instance of a genome. Other structures could have been found, and may yet be found, that carry heritable genetic information. Historically, it was not until many decades after the Origin that chromosomes were identified as the location of the genetic material. Darwin had simply spoken of “factors” in an abstract causal sense and it was not relevant to his law in which material they turned out to be instantiated. In this sense, it is not a metaphor or an analogy to speak of a coded sequence of instructions of a digital organism as its genome, for it is the genetic material of that individual in just the same sense as it is for a biological organism.
There is not the space here to lay out the full argument for this claim, but the rationale for understanding these concepts at this level of abstraction should be clear enough to see the value of bringing them and other shared concepts and principles to the foreground. Recognizing that the evolutionary causal processes can be instantiated in other physical systems, including in a computer, means that digital evolution goes beyond even the utility that a simulation can provide and provides a truly experimental system.
3. HOW EVOLUTIONARY COMPUTATION IS HELPING EVOLUTIONARY BIOLOGY
The late John Maynard Smith, a distinguished evolutionary biologist, was quick to recognize the scientific potential of marrying evolution and computing. In particular, he called attention to how digital evolution provides a way for biologists to escape from the inconvenient limits of our single planet. “So far,” he wrote in a 1992 article in Nature, “we have been able to study only one evolving system, and we cannot wait for interstellar flight to provide us with a second. If we want to discover generalizations about evolving systems, we will have to look at artificial ones.” Since then, others have opened this digital wormhole further.
To illustrate some of the advantages of digital evolution as a model system, we describe a study that used digital evolution to investigate the evolution of complex features (Lenski et al. 2003). Darwin advanced a number of hypotheses about how evolution could produce what he called “organs of extreme perfection and complication.” Recognizing that features such as the eye were too complex to have arisen in a single leap, he proposed that their evolution would have involved incremental changes through intermediate forms, including changes of structures from one function to another. Darwin provided indirect evidence for his hypotheses by comparisons across different species but wrote that it would have been ideal if it were possible to precisely trace the details of a single line of descent. With an evolving digital system this is now a reality.
This study used the Avida platform, which is a well-developed model system for digital evolution research. In Avida, the genome of a digital organism (an “Avidian”) is composed of simple computer instructions that do little of interest by themselves but when ordered in specific complex sequences can in principle perform any computable function. Such computational functions are the digital organism’s phenotype. Among other properties, the genome of an Avidian has the potential ability to self-replicate. In its digital environment, however, the copying process is imperfect, so descendant organisms may have random mutations in their code. They also have to compete for the energy needed to execute their genetic programs. In this system, the digital organisms get energy by performing logical operations that also require specific sequences of instructions to function. Simple functions provide the organism with a small energy boost; more complex functions provide more energy, allowing them to run faster. This process provides an analogue to biological metabolism, but if a digital organism is to perform more complex metabolic functions, it must evolve them, because the ancestor could replicate but not perform any logic functions. As in nature, the digital environment naturally selects those variations that give organisms a competitive advantage. The mutations that arise in the genome are usually deleterious (and some may destroy the Avidian’s ability to replicate) or neutral, but a few may improve the organism, resulting in faster replication and thus more offspring with those variations. With all the conditions of the Darwinian mechanism in place, a population of Avidians naturally evolves on its own without any outside assistance.
To investigate how complex features evolve, Lenski and collaborators ran 50 replicate populations, all under identical conditions. Ensuring identical replicates is simpler and more precise in a digital evolution experiment than in most natural systems, which makes statistical replication straightforward. The digital system provided other advantages that no natural system could match, in that it allowed the experimenters to track complete lines of descent (from the ancestor across thousands of generations and millions of descendants), record every mutation along the way, and measure whether each was deleterious, neutral, or beneficial. The researchers could thus observe directly as beneficial mutations accumulated to produce one or another logic function, which themselves were later lost or modified for some other, more complex function. And although a given complex function first emerged in a line of descent as the result of just one or two mutations, systematic knockout experiments (removing individual instructions of the genome one at a time) showed that the function always depended on some specific sequence of many instructions that had previously evolved as parts of other functions and that their removal would eliminate the new feature. In an evolving digital system one has the ability to monitor such changes and to analyze them with an extraordinary degree of precision.
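The logic of a knockout experiment can be sketched on a toy genome. The following Python example is only an illustration of the idea of removing instructions one at a time and checking whether a function survives; it does not use Avida's instruction set or tools. The three-instruction language (`inc`, `dbl`, `nop`), the target value of 6, and the function names are all assumptions made for the example.

```python
def run(genome):
    """Execute a toy genome on a single register that starts at 0."""
    value = 0
    for instruction in genome:
        if instruction == "inc":
            value += 1
        elif instruction == "dbl":
            value *= 2
        # Any other instruction (for example "nop") does nothing.
    return value

def performs_function(genome, target=6):
    """The toy 'phenotype': does the genome compute the target value?"""
    return run(genome) == target

def knockout_scan(genome, target=6):
    """Remove each instruction in turn and report the positions whose
    removal eliminates the function."""
    essential = []
    for i in range(len(genome)):
        knocked_out = genome[:i] + genome[i + 1:]
        if not performs_function(knocked_out, target):
            essential.append(i)
    return essential

genome = ["inc", "nop", "dbl", "inc", "dbl"]
print(knockout_scan(genome))  # -> [0, 2, 3, 4]
```

In this toy case only the `nop` at position 1 can be removed without losing the function; every other instruction is part of the sequence on which the function depends.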
Nor are the results of such experiments always predictable. Because this is a real evolving system rather than a numerical simulation, the dynamics of the evolutionary mechanism can yield surprising results, just as in a biological evolving system. One unexpected finding in this particular study was that occasional deleterious mutations were in the line of descent leading to the most complex function in some populations. Some were only slightly deleterious, but a couple reduced fitness by more than 50 percent. Further tests ruled out the possibility that these mutations were just accidental hitchhikers; although the mutations were highly deleterious when they occurred, they became highly beneficial in combination with subsequent mutations.
This study is but one of many that used digital organisms to examine basic questions about evolutionary processes that would have been exceedingly difficult or impossible to perform with biological organisms. In a digital evolution system, one may “replay the tape of life,” as Stephen Jay Gould put it, and directly observe, for example, the role that historically contingent events such as mass extinctions can play in the course of evolution. One can devise appropriate controls to test how altruistic behaviors evolve under different selective pressures. One also may investigate the effect of natural selection on different methods of phylogeny reconstruction. Digital evolution researchers have done all this and more.
This is not to say that digital evolution works as a model system for all the kinds of questions evolutionary biologists want to ask. While digital evolution instantiates the core causal processes of the evolutionary mechanism, it does not model, for instance, the defining features of any particular species or the unique properties of particular molecular structures. Thus it will be of little use to someone investigating questions for which such physical structures are salient. But for the biologist interested in questions about the cause-effect relationships of Darwin’s law or seeking generalizations that will apply to any evolving system, experimental evolution with digital organisms is a revelation.
Evolution, broadly understood, requires neither DNA nor even living organisms. Evolutionary computation can help biologists understand shared properties such as robustness and nonexpressed code (Foster 2001). The power and flexibility of digital evolution gives researchers unprecedented opportunities to test evolutionary hypotheses, especially those requiring manipulations that are impractical in biological systems or numbers of generations that cannot directly be observed. Equally exciting are the practical applications of these shared principles embodied in evolutionary computation to such fields as engineering.
4. EVOLUTIONARY COMPUTATION TAKES OFF
Asked in 2010 for his judgment about the future course of his field, the president of the National Academy of Engineering, Charles M. Vest, wrote in the New York Times that “we’re going to see in surprisingly short order that biological inspiration and biological processes will become central to engineering real systems. It’s going to lead to a new era in engineering.” Vest was no doubt thinking of a range of ways that engineering is beginning to make what we might call “the biological turn,” including biomimicry and other uses of the products of evolution, but the use of evolutionary computation is certainly one of the most compelling ways that biological processes are being applied in engineering. As before, let us look at just one example in a bit of detail to show just how far evolutionary engineering has gone—in this case literally into outer space.
In 2004 NASA was preparing for the launch of its Space Technology 5 mission, whose aim was to test technology for measuring the effect of solar activity on the earth’s magnetosphere. One of NASA’s needs for the mission was a specialized antenna that had to meet a variety of precise specifications. Given certain transmit and receive frequencies, it had to operate within specified ranges for important functional properties. Moreover, the antenna had to fit within a 6 in. cylinder. Members of the Evolvable Systems Group at the NASA Ames Research Center decided to see whether an evolutionary approach could solve the problem.
The research team began by setting up a virtual world with a genetic encoding scheme that could represent the construction of three-dimensional wire forms—the space of possible antenna shapes. They used a tree-structured encoding in which, for example, a branch in the genotype would represent a branch in the wire form. The genotype was allowed to vary at random so as to produce diverse forms with different numbers, lengths, and angles of branches. In an initial population, 200 individuals were evaluated for their fitness for the task that NASA had set—think of this as a competition in which the virtual antennas were tested against each other. The best individuals in a given generation were automatically selected, and most of these were again randomly mutated or recombined to form new variations for the next generation. As this process was iterated over many generations, the shapes of the individuals in the population evolved little by little to better match the functional requirements that NASA had set. The antenna shapes that evolved in these runs were unlike anything that antenna engineers would have come up with themselves. They were very small—not much bigger around than a quarter—and they looked rather like a bunch of randomly twisted paper clips. But at the end of the run, the team built the device that had evolved in the virtual world and found that this misshapen bunch of wires met the required specifications.
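The general shape of such a run can be sketched in Python. This is a simplified illustration under stated assumptions, not the NASA team's method: the `Segment` class, the single bend angle per segment (the real wire forms were three-dimensional), the mutation probabilities, and above all the placeholder `fitness` function are hypothetical. A real run would score each candidate on its antenna performance; here the score merely rewards a total wire length close to an arbitrary target. Recombination is omitted.

```python
import copy
import random
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One wire segment; children branch off from its end."""
    length: float
    angle: float  # bend in degrees relative to the parent segment
    children: list = field(default_factory=list)

def all_segments(root):
    """Flatten the genotype tree into a list of its segments."""
    found = [root]
    for child in root.children:
        found.extend(all_segments(child))
    return found

def total_wire(root):
    return sum(segment.length for segment in all_segments(root))

def fitness(root, target=10.0):
    """Placeholder score (higher is better), standing in for a real
    evaluation of antenna performance."""
    return -abs(total_wire(root) - target)

def mutate(root, rng):
    """Return a mutated copy: grow a new branch or perturb one segment."""
    child = copy.deepcopy(root)
    segment = rng.choice(all_segments(child))
    if rng.random() < 0.2:
        segment.children.append(
            Segment(rng.uniform(0.1, 1.0), rng.uniform(-90.0, 90.0)))
    else:
        segment.length = max(0.1, segment.length + rng.gauss(0.0, 0.2))
        segment.angle += rng.gauss(0.0, 10.0)
    return child

def evolve(generations=50, pop_size=200, keep=20, seed=0):
    rng = random.Random(seed)
    population = [Segment(rng.uniform(0.1, 1.0), 0.0) for _ in range(pop_size)]
    for _ in range(generations):
        # Keep the best individuals and refill the population with their mutants.
        population.sort(key=fitness, reverse=True)
        best = population[:keep]
        population = best + [mutate(rng.choice(best), rng)
                             for _ in range(pop_size - keep)]
    return max(population, key=fitness)

if __name__ == "__main__":
    winner = evolve()
    print(len(all_segments(winner)), "segments, score", fitness(winner))
```

Because a branch in the genotype corresponds to a branch in the wire form, a mutation that adds a child node adds a physical branch, which is what lets the population explore shapes with different numbers, lengths, and angles of branches.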
The evolved antenna had additional technical benefits, including requiring less power, having more uniform coverage, and not requiring a matching network or a phasing circuit, thus simplifying the design and fabrication. Moreover, all these benefits were achieved with a shorter design cycle: the prototype for the antenna took three person-months to design and fabricate, compared with five person-months for a conventionally designed antenna (Hornby, Lohn, and Linden 2011).
What is especially impressive about this evolved design is that it succeeded where human engineers had failed. Prior to the Evolvable Systems Group’s tackling the problem, NASA had contracted with an antenna engineering group that had produced a prototype design using conventional techniques, after a bidding process among several competing groups. However, the conventional design they produced did not meet the exacting mission requirements, while the evolved design did.
On March 22, 2006, the evolved antenna was launched into space. This was the first time that evolved hardware had reached such heights, both metaphorically and physically, but it was not an isolated instance of the power of evolutionary engineering. Indeed, evolutionary approaches have advanced to the degree that they now routinely equal or surpass human engineers in a variety of design tasks (Koza 2003).
This might seem to be a bold claim. By what measures can one say that evolutionary designs can equal or surpass those of human beings? Since 2004, GECCO has held a contest for human-competitive results, and it judges entries using a variety of criteria. To qualify for the competition, an evolved solution must meet at least one of several standards, such as producing a patentable design or a result that is publishable in its own right in a peer-reviewed journal, independent of the fact that it was mechanically created. The evolved antenna shared the Gold Award in 2004. Since then winners have been recognized for human-competitive results in areas as disparate as photonic crystal design, automated software repair, and protein structure prediction. These awards for human-competitive evolved design are appropriately called the “Humies.”
5. THE FUTURE OF EVOLUTION AND COMPUTING
One promise of evolutionary approaches to computation and engineering is that they will solve real-world problems. This promise is already being fulfilled: evolutionary computation harnesses the power of evolution, allowing evolutionary processes to work in a digital world just as they work in nature. A further promise of evolutionary computation is that it will help reveal the deeply shared principles between what initially appeared to be quite distinct fields of research. Think of what it means to recognize that it is not a mere metaphor to say that living organisms are running a genetic program. Think of what it means to understand that the functional properties of life, those astounding adaptations that are coded in the genome, were programmed by evolution. If this lesson can be learned, the marriage of evolution and computing will have been profound indeed.
See also chapter VIII.7 and chapter VIII.15.
FURTHER READING
Avida-ED Project: Technology for teaching evolution and the nature of science using digital organisms. http://avida-ed.msu.edu/. With free downloadable digital evolution software as well as background materials and model exercises for undergraduate and AP biology courses, this award-winning project gives students everything they need to perform evolutionary experiments on their own computers.
Clune, J., H. Goldsby, C. Ofria, and R. T. Pennock. 2010. Selective pressures for accurate altruism targeting: Evidence from digital evolution for difficult-to-test aspects of inclusive fitness theory. Proceedings of the Royal Society B 278: 666–674. One example of the kind of basic science research that experimental evolution with digital organisms makes possible, this study tested hypotheses about inclusive fitness and the evolution of altruistic behavior.
De Jong, K. A. 2006. Evolutionary Computation: A Unified Approach. Cambridge, MA: MIT Press. An authoritative account of the common theoretical underpinnings of different varieties of evolutionary computation.
Foster, J. A. 2001. Evolutionary computation. Nature Reviews Genetics 2: 428–436. An excellent review article giving an overview of the field of evolutionary computation for biologists.
Holland, John H. 1992. Adaptation in Natural and Artificial Systems. Cambridge, MA: MIT Press. Holland’s pioneering book, originally published in 1975, gave a detailed account of how what he called genetic algorithms could model the process of evolutionary adaptation in a computer system.
Hornby, G. S., J. D. Lohn, and D. S. Linden. 2011. Computer-automated evolution of an X-band antenna for NASA’s Space Technology 5 Mission. Evolutionary Computation 19 (1): 1–23. Scientific account of how a NASA space antenna was produced by evolutionary engineering techniques.
Koza, J. R., M. A. Keane, M. J. Streeter, and W. Mydlowec. 2003. Genetic Programming IV: Routine Human-Competitive Machine Intelligence. Norwell, MA: Kluwer Academic. The fourth in a series by John Koza and colleagues about genetic programming, this book lays out the case for how evolutionary techniques have advanced to be able to routinely match human intelligence for a wide variety of problems.
Lenski, R., C. Ofria, R. T. Pennock, and C. Adami. 2003. The evolutionary origin of complex features. Nature 423: 139–144. This pioneering study used digital evolution to perform a direct experimental test of some of Darwin’s hypotheses about the evolutionary mechanisms that produce complex features.
Pennock, R. T. 2007. Models, simulations, instantiations and evidence: The case of digital evolution. Journal of Experimental and Theoretical Artificial Intelligence 19 (1): 29–42. This paper sorts out common confusions about model-based reasoning using digital evolution, explaining why it is a mistake to think of these models as just simulations of evolution and why they are real instances of the causal mechanism that Darwin discovered.
Simon, H. A. 1969. The Sciences of the Artificial. Cambridge, MA: MIT Press. Herbert Simon, who was later to win a Nobel Prize, wrote this prescient book about why artificial systems, including computers, could properly be treated as objects for scientific study.