2

Enter the Demon

‘Could life’s machines be Maxwell demons, creating order out of chaos …?’

– Peter Hoffmann1

In December 1867 the Scottish physicist James Clerk Maxwell penned a letter to his friend Peter Guthrie Tait. Though little more than speculative musing, Maxwell’s missive contained a bombshell that still reverberates a century and a half later. The source of the disruption was an imaginary being – ‘a being whose faculties are so sharpened that he can follow every molecule in its course’. Using a simple argument, Maxwell concluded that this Lilliputian entity, soon to be dubbed a demon, ‘would be able to do what is impossible to us’. On the face of it, the demon could perform magic, conjuring order out of chaos and offering the first hint of a link between the abstract world of information and the physical world of molecules.

Maxwell, it should be stressed, was an intellectual giant, in stature comparable to Newton and Einstein. In the 1860s he unified the laws of electromagnetism and predicted the existence of radio waves. He was also a pioneer of colour photography and explained Saturn’s rings. More relevantly, he made seminal contributions to the theory of heat, calculating how, in a gas at a given temperature, the heat energy was shared out among the countless chaotically moving molecules.

Maxwell’s demon was a paradox, an enigma, an affront to the lawfulness of the universe. It opened a Pandora’s box of puzzles about the nature of order and chaos, growth and decay, meaning and purpose. And although Maxwell was a physicist, it turned out that the most powerful application of the demon idea lay not in physics but in biology. Maxwell’s demonic magic can, we now know, help explain the magic of life. But that application lay far in the future. At the outset, the demon wasn’t intended to clarify the question ‘What is life?’ but a much simpler and more practical one: namely, what is heat?

MOLECULAR MAGIC

Maxwell wrote to Tait at the height of the Industrial Revolution. Unlike the agricultural revolution of the Neolithic period, which pre-dated it by several thousand years, the Industrial Revolution did not proceed by trial and error. Machines such as the steam engine and the diesel engine were carefully designed by scientists and engineers familiar with the principles of mechanics first enunciated by Isaac Newton in the seventeenth century. Newton had discovered the laws of motion, which relate the force acting on a material body to the nature of its movement, all encapsulated in a simple mathematical formula. By the nineteenth century it was commonplace to use Newton’s laws to design tunnels and bridges or to predict the behaviour of pistons and wheels, the traction they would deliver and the energy they would need.

By the middle of the nineteenth century physics was a mature science, and the welter of engineering problems thrown up by the new industries provided fascinating challenges for physicists to analyse. The key to industrial growth lay, then as now, with energy. Coal provided the readiest source to power heavy machinery, and steam engines were the preferred means of turning the chemical energy of coal into mechanical traction. Optimizing the trade-off between energy, heat, work and waste was more than just an academic exercise. Vast profits could hinge on a modest improvement in efficiency.

Although the laws of mechanics were well understood at the time, the nature of heat remained confusing. Engineers knew it was a type of energy that could be converted into other forms, for example, into the energy of motion – the principle behind the steam locomotive. But harnessing heat to perform useful work turned out to involve more than a simple transfer between different forms of energy. If we had unrestricted access to all heat energy, the world would be a very different place, because heat is a highly abundant source of energy in the universe.fn1 The unrestricted exploitation of heat energy would, for instance, enable a spacecraft to be propelled entirely from the thermal afterglow of the Big Bang. Or, coming closer to home, we could power all our industries on water alone: there is enough heat energy in a bottle of water to illuminate my living room for an hour. Imagine sailing a ship with no fuel other than the heat of the ocean.

Sadly, it can’t be done. Pesky physicists discovered in the 1860s a strict limit on the amount of heat that can be converted into useful mechanical activity. The constraint stems from the fact that it is the flow of heat, not heat energy per se, that can perform work. To harness heat energy there has to be a temperature difference somewhere. Simple example: if a tank of hot water is placed near a tank of cold water, then a heat engine connected to both can run off the temperature gradient and perform a physical task like turning a flywheel or lifting a weight. The engine will take heat from the hot water and deliver it to the cold water, extracting some useful energy on the way. But as heat is transferred from the hot tank to the cold tank, the hot water will get cooler and the cold water will get warmer, until the temperature difference between the two dwindles and the motor grinds to a halt. What is the best-case scenario? The answer depends on the temperatures of the tanks, but if one tank is maintained (by some external equipment) at boiling point (100°C) and the other at freezing point (0°C), then it turns out that the best one can hope for – even if no heat is wasted by leaking into the surroundings – is to extract about 27 per cent of the heat energy in the form of useful work. No engineer in the universe could better that; it is a fundamental law of nature.
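For readers who like to check the arithmetic, here is a short sketch in Python of where the 27 per cent figure comes from. It uses the standard Carnot expression for maximum efficiency, 1 − (cold temperature)/(hot temperature), with both temperatures measured in kelvin; the formula itself is not derived in the text, so treat this as an illustrative check rather than anything the chapter relies on.

```python
# Carnot efficiency between boiling and freezing water: a check of the
# "about 27 per cent" figure quoted above.
T_hot = 373.15    # boiling point of water, in kelvin
T_cold = 273.15   # freezing point of water, in kelvin

max_efficiency = 1 - T_cold / T_hot
print(f"Best possible efficiency: {max_efficiency:.1%}")   # ~26.8%, i.e. about 27 per cent
```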

Once physicists had figured this out, the science known as thermodynamics was born. The law that says you can’t convert all the heat energy into work is the second law of thermodynamics.fn2 This same law explains the familiar fact that heat flows from hot to cold (for example, steam to ice) and not the other way around. That being said, heat can pass from cold to hot if some energy is expended. Running a heat engine backwards – spending energy to pump heat from cold to hot – is the basis of the refrigerator, one of the more lucrative inventions of the Industrial Revolution because it allowed meat to be frozen and transported over thousands of miles.

To understand how Maxwell’s demon comes into this, imagine a rigid box containing a gas that is hotter at one end than the other. At the micro-level, heat energy is none other than the energy of motion – the ceaseless agitation of molecules. The hotter the system, the faster the molecules move: at the hot end of the box, the gas molecules move faster, on average, than they do at the cooler end. When the faster-moving molecules collide with the slower-moving ones they will (again, on average) transfer a net amount of this kinetic energy to the cooler gas molecules, raising the temperature of the gas at the cooler end. After a while the system will reach thermal equilibrium, settling down at a uniform temperature partway between the original high and low temperature extremes of the gas. The second law of thermodynamics forbids the reverse process: the gas spontaneously rearranging its molecules so the fast-moving ones congregate at one end of the box and the slow-moving ones at the other. If we saw such a thing, we would think it miraculous.

Although the second law of thermodynamics is easy to understand in the context of boxes of gas, it applies to all physical systems, and indeed to the entire cosmos. It is the second law of thermodynamics that imprints on the universe an arrow of time (see Box 3). In its most general form, the second law is best understood using a quantity called entropy. I shall be coming back to entropy in its various conceptions again and again in this story, but for now think of it as a measure of the disorder in a system. Heat, for example, represents entropy because it describes the chaotic agitation of molecules; when heat is generated, entropy rises. If the entropy of a system seems to decrease, just look at the bigger picture and you will find it going up somewhere else. For example, the entropy inside a refrigerator goes down but heat comes out of the back and raises the entropy of the kitchen. Added to that, there is a price to be paid in electricity bills. That electricity has to be generated, and the generation process itself makes heat and raises the entropy of the power station. When the books are examined carefully, entropy always wins. On a cosmic scale, the second law implies that the entropy of the universe never goes down.fn3

Box 3: Entropy and the arrow of time

Imagine taking a movie of an everyday scene. Now play it backwards; people laugh, because what they see is so preposterous. To describe this pervasive arrow of time, physicists appeal to a concept called entropy. The word has many uses and definitions, which can lead to confusion, but the most convenient for our purposes is as a measure of disorder in a system with many components. To take an everyday example, imagine opening a new pack of cards, arranged by suit and in numerical order. Now shuffle the cards; they become less ordered. Entropy quantifies that transformation by counting the number of ways systems of many parts can be disordered. There is only one way a given suit of cards can be in numerical order (Ace, 2, 3 … Jack, Queen, King), but many different ways it can be disordered. This simple fact implies that randomly shuffling the cards is overwhelmingly likely to increase the disorder – or entropy – because there are so many more ways to be untidy than to be tidy. Note, however, that this is a statistical argument only: there is an exceedingly tiny but non-zero probability that shuffling a suit of jumbled-up cards will accidentally end up placing them in numerical order. Same thing with the box of gas. The molecules are rushing around randomly so there is a finite probability – an exceedingly small probability, to be sure – that the fast molecules will congregate in one end of the box and the slow ones in the other. So the accurate statement is that in a closed system the entropy (or degree of disorder) is overwhelmingly likely, but not absolutely certain, to go up, or stay the same. The maximum entropy of a gas – the macroscopic state that can be achieved by the largest number of indistinguishable arrangements – corresponds to thermodynamic equilibrium, with the gas at a uniform temperature and density.
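To put a rough number on the card example, here is a small sketch in Python. The log-base-2 measure of disorder in the last line anticipates the Shannon measure introduced later in the chapter; that connection is my addition, not something the box claims.

```python
# How disordered is a shuffled 13-card suit? Count the possible orderings,
# the chance of a random shuffle landing in perfect numerical order, and the
# disorder expressed as a logarithm of that count.
import math

arrangements = math.factorial(13)        # distinct orderings of one suit
p_back_in_order = 1 / arrangements       # probability a random shuffle is perfectly ordered
disorder_in_bits = math.log2(arrangements)

print(f"Orderings of one suit: {arrangements:,}")                 # 6,227,020,800
print(f"Chance a shuffle restores order: {p_back_in_order:.1e}")  # ~1.6e-10: tiny, but not zero
print(f"Disorder of a shuffled suit: {disorder_in_bits:.1f} bits")  # ~32.5
```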

By the middle of the nineteenth century the basic principles of heat, work and entropy and the laws of thermodynamics were well established. There was great confidence that, at last, heat was understood, its properties interfacing comfortably with the rest of physics. But then along came the demon. In a simple conjecture, Maxwell subverted this new-found understanding by striking at the very basis of the second law.

Here is the gist of what was proposed in the letter to Tait. I mentioned that gas molecules rush around, and the hotter the gas, the faster they go. But not all molecules move with the same speed. In a gas at a fixed temperature energy is shared out randomly, not uniformly, meaning that some molecules move more quickly than others. Maxwell himself worked out precisely how the energy was distributed among the molecules – what fraction have half the average speed, twice the average, and so on. Realizing that even in thermodynamic equilibrium gas molecules have a variety of speeds (and therefore energies), Maxwell was struck by a curious thought. Suppose it were possible, using some clever device, to separate out the fast molecules from the slow molecules without expending any energy? This sorting procedure would in effect create a temperature difference (fast molecules over here, slow ones over there), and a heat engine could be run off the temperature gradient to perform work. Using this procedure, one would be able to start with a gas at a uniform temperature and convert some of its heat energy into work without any external change, in flagrant violation of the second law. It would in effect reverse the arrow of time and open the way to a type of perpetual motion.

So far, so shocking. But before throwing the rule book of nature into the waste bin, we need to confront the very obvious question of how the separation of fast and slow molecules might actually be attained. Maxwell’s letter outlined what he had in mind to accomplish this goal. The basic idea is to divide the box of gas into two halves with a rigid screen in which there is a very small hole (see Fig. 3). Among the teeming hordes of molecules bombarding the screen there will be a handful that arrive just where the hole is located. These molecules will pass through into the other half of the box; if the hole is small enough, only one molecule at a time will traverse it. Left to itself, the traffic in both directions will average out and the temperature will remain stable. But now imagine that the hole could be blocked with a moveable shutter. Furthermore, suppose there were a tiny being – a demon – stationed near the hole and capable of operating the shutter. If it is nimble enough, the demon could allow only slow-moving molecules to pass through the hole in one direction and only fast-moving molecules to go in the other. By continuing this sorting process for a long time, the demon would be able to raise the temperature on one side of the screen and lower it on the other, thus creating a temperature difference without appearing to expend any energy:fn4 order out of molecular chaos, for free.

To Maxwell and his contemporaries the very idea of a manipulative demon violating what was supposed to be a law of nature – that entropy never decreases – seemed preposterous. Clearly, something had been left out of the argument, but what? Well, how about the fact that there are no demons in the real world? That’s not a problem. Maxwell’s argument falls into the category of what are known as ‘thought experiments’ – imaginary scenarios that point to some important scientific principles. They don’t have to be practical suggestions. There is a long history of such thought experiments in physics and they have frequently led to great advances in understanding, and eventually to practical devices. In any case, Maxwell didn’t need an actual sentient being to operate the shutter, just a molecular-scale device able to perform the sorting task. At the time he wrote to Tait, Maxwell’s proposal was a flight of fancy; he can have had no inkling that demonic-type entities really exist. In fact, they existed inside his own body! But the realization of a link between molecular demons and life lay a century in the future.

Fig. 3. A box of gas is divided into two chambers by a screen with a small aperture through which molecules may pass one by one. The aperture can be blocked with a shutter. A tiny demon observes the randomly moving molecules and operates the shutter to allow fast molecules to travel from the left-hand chamber to the right-hand one, and slow molecules to go the other way. After a while, the average speed of the molecules on the right will become significantly greater than that on the left, implying that a temperature difference has been established between the two chambers. Useful work may then be performed by running a motor off the heat gradient. The demon thus converts disorganized molecular motion into controlled mechanical motion, creating order out of chaos and opening the way to a type of perpetual motion machine.

Meanwhile, apart from the objection ‘show me a demon!’, nothing else seemed terribly wrong with Maxwell’s argument, and for many decades it lay like an inconvenient truth at the core of physics, an ugly paradox that most scientists chose to ignore. With the benefit of hindsight, we can now see that the resolution of the paradox had been lying in plain sight. To operate effectively in sorting the molecules into fast and slow categories, the demon must gather information about their speed and direction. And as it turns out, bringing information into physics cracks open a door to a scientific revolution that is only today starting to unfold.

MEASURING INFORMATION

Before I get to the big picture I need to drill down a bit into the concept of information. We use the word a lot in daily life, in all sorts of contexts, ranging from bus timetables to military intelligence. Millions of people work for information technology companies. The growing field of bioinformatics attracts billions of dollars of funding. The economy of the United States is based in large part on information-based industries, and the field is now so much a part of everyday affairs that we usually simply refer to it as ‘IT’. But this casual familiarity glosses over some deep conceptual issues. For a start, what exactly is information? You can’t see it, touch it or smell it, yet it affects everyone: after all, it bankrolls California!

As I remarked, the idea of information derives originally from the realm of human discourse; I may ‘inform’ students about their exam results, for example, or you may give me the information I need to find the nearest restaurant. Used in that sense, it is a purely abstract concept, like patriotism or political expediency or love. On the other hand, information clearly plays a physical role in the world, not least in biology; a change in the information stored in an organism’s DNA may produce a mutant offspring and alter the course of evolution. Information makes a difference in the world. We might say it has ‘causal power’. The challenge to science is to figure out how to couple abstract information to the concrete world of physical objects.

To make progress on these profound issues it is first necessary to come up with a precise definition of information in its raw, unembellished sense. According to the computer on which I am typing this book, the C drive can store 237GB of information. The machine claims to process information at the rate of 3GHz. If I wanted more storage and faster processing, I would have to pay more. Numbers like this are bandied about all the time. But what are GB and GHz anyway? (Warning: this section of the book contains some elementary mathematics. It’s the only part that does.)

Quantifying information began in earnest with the work of the engineer Claude Shannon in the mid-1940s. An eccentric and somewhat reclusive figure, Shannon worked at Bell Labs in the US, where his primary concern was how to transmit coded messages accurately. The project began as war work: if you are cursed with a hissing radio or a crackly telephone line, what is the best strategy you can adopt to get word through with the least chance of error? Shannon set out to study how information can be encoded so as to minimize the risk of garbling a message. The project culminated in 1949 with the publication of The Mathematical Theory of Communication.2 The book was released without fanfare but history will judge that it represented a pivotal event in science, one that goes right to the heart of Schrödinger’s question ‘What is life?’

Shannon’s starting point was to adopt a mathematically rigorous definition of information. The one he chose turned on the notion of uncertainty. Expressed simply, in acquiring information you learn something you didn’t know before, and so you become less uncertain about that thing. Think of tossing a fair coin; there is a 50–50 chance as to whether it will land heads or tails. As long as you don’t look when it lands, you are completely uncertain of the outcome. When you look, that uncertainty is reduced (to zero, in this example). Binary choices like heads or tails are the simplest to consider and are directly relevant to computing because computer code is formulated in terms of binary arithmetic, consisting only of 1s and 0s. The physical implementation of these symbols requires only a two-state system, such as a switch that may be either on or off. Following Shannon, a ‘binary digit’, or ‘bit’ for short, became the standard way of quantifying information. The byte, incidentally, is 8 bits (2³) and is the B used in GB (gigabyte, or 1 billion bytes). The speed of information processing is expressed in GHz, standing for ‘gigahertz’, or 1 billion clock cycles per second. When you look at the outcome of a fair-coin toss you gain one bit of information by collapsing two equally probable states into one certain state.

What about tossing two coins at once? Inspecting the outcome yields two units of information (bits). Note, however, that when you have two coins there are now four possible states: heads-heads, heads-tails, tails-heads and tails-tails. With three coins there are eight possible states and three bits are gained by inspection; four coins have sixteen states and four bits gained; five have thirty-two states …; and so on. Notice how this goes: 4 = 2², 8 = 2³, 16 = 2⁴, 32 = 2⁵ … The number of states is 2 raised to the power of the number of coins. Conversely, if you want the number of bits gained by observing the outcome of the coin tosses, this formula must be inverted using logarithms to base 2. Thus 2 = log₂4, 3 = log₂8, 4 = log₂16, 5 = log₂32 … Those readers familiar with logarithms will notice that this formula makes bits of information additive. For example, 2 bits + 3 bits = 5 bits because log₂4 + log₂8 = log₂32, and there are indeed thirty-two equally probable states of five fair coins.
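The counting rule is easy to verify directly. Here is a minimal Python check, using nothing beyond the formulas just stated:

```python
# n fair coins have 2**n equally likely outcomes; inspecting them yields
# log2(2**n) = n bits, which is why bits of information simply add.
import math

for n_coins in range(1, 6):
    states = 2 ** n_coins
    bits = math.log2(states)
    print(f"{n_coins} coin(s): {states} states, {bits:.0f} bits")

# Additivity: 2 bits + 3 bits = 5 bits, because log2(4) + log2(8) equals log2(32)
print(math.log2(4) + math.log2(8) == math.log2(32))   # True
```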

Suppose now that the states are not equally probable – for example, if the coin is loaded. In that case, the information gained by inspecting the outcome will be less. If the outcome is completely predictable (probability 1), then no additional information is gained by looking – you get zero bits. In most real-world communications, probabilities are indeed not uniform. For example, in the English language the letter a occurs with much higher probability than the letter x, which is why the board game Scrabble weights letters differently in the scoring. Another example: in English, the letter q is always followed by a u, which therefore makes the u redundant; you get no more bits of information by receiving a u following a q, so it wouldn’t be worth wasting resources to send it in a coded message.

Shannon worked out how to quantify information in the non-uniform-probability cases by taking a weighted average. To illustrate how, let me give a very simple example. Suppose you flip a loaded coin in which heads occurs on average twice as often as tails, which is to say the probability of heads is 2⁄3 and the probability of tails is 1⁄3 (probabilities have to add up to 1). According to Shannon’s proposal, the number of bits corresponding to heads or tails is simply weighted by their relative probabilities. Thus, the average number of bits of information obtained from inspecting the outcome of tossing this particular loaded coin is −2⁄3 log₂(2⁄3) − 1⁄3 log₂(1⁄3) = 0.92 bits, which is somewhat less than the one bit it would have been for equally probable outcomes. This makes sense: if you know heads is twice as likely to come up as tails, there is less uncertainty about the outcome than there would be with a fair coin, and so less reduction in uncertainty by making the observation. To take a more extreme example, suppose heads is seven times as probable as tails. The average number of bits of information per coin toss is now only −7⁄8 log₂(7⁄8) − 1⁄8 log₂(1⁄8) = 0.54 bits. One way of expressing the information content of an answer to a question is as the average degree of surprise in learning the answer. With a coin so heavily loaded for heads, there generally isn’t much surprise.fn5
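Shannon’s recipe is short enough to state as a few lines of code. The sketch below (the helper name is mine) simply applies the weighted average −Σ p log₂p to the two loaded coins discussed above:

```python
# Average information, in bits, gained per observation of an outcome with the
# given probabilities: Shannon's weighted average -sum(p * log2(p)).
import math

def shannon_bits(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

print(f"{shannon_bits([1/2, 1/2]):.2f}")   # fair coin: 1.00 bit
print(f"{shannon_bits([2/3, 1/3]):.2f}")   # heads twice as likely as tails: 0.92 bits
print(f"{shannon_bits([7/8, 1/8]):.2f}")   # heads seven times as likely: 0.54 bits
```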

A moment’s reflection shows that Shannon’s analysis has an immediate application to biology. Information is stored in DNA using a universal genetic code. The information content of a gene is transmitted to the ribosome via mRNA, whereupon it is decoded and used to construct proteins from sequences of amino acids. However, the mRNA information channel is intrinsically noisy, that is, error prone (see here). Reading out the instruction manual of life is therefore logically the same problem that Shannon analysed: sending coded information through a noisy communication channel.

What does the surprise factor tell us about how much information an organism contains? Well, life is an exceedingly surprising phenomenonfn6 so we might expect it to possess lots of Shannon information. And it does. Every cell in your body contains about a billion DNA bases arranged in a particular sequence of the four-letter biological alphabet. The number of possible combinations is 4 raised to the power of 1 billion, which is one followed by about six hundred million zeros. Compare that to the paltry number of atoms in the universe – one followed by about eighty zeros. Shannon’s formula for the information contained in this strand of DNA is to take the logarithm, which gives about 2 billion bits – more than the information contained in all the books in the Library of Congress. All this information is packed into a trillionth of the volume of a match head. And the information contained in DNA is only a fraction of the total information in a cell. All of which goes to show just how deeply life is invested in information.fn7
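The arithmetic behind these big numbers is straightforward to reproduce. Here is a rough sketch, taking the chapter’s figure of a billion bases at face value and assuming all four letters are equally likely (an assumption revisited below):

```python
# Each DNA base is one of four letters, so N bases have 4**N possible sequences,
# and the Shannon information (with equal probabilities) is log2(4**N) = 2N bits.
import math

n_bases = 1_000_000_000                          # "about a billion DNA bases"
zeros_in_combinations = n_bases * math.log10(4)  # number of digits in 4**N
bits = 2 * n_bases                               # log2(4**N) = 2N

print(f"4^N is a 1 followed by about {zeros_in_combinations/1e6:.0f} million zeros")  # ~602 million
print(f"Information content: about {bits/1e9:.0f} billion bits")                      # 2 billion
```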

Shannon spotted that his mathematical formula quantifying information in bits is, bar a minus sign, identical to the physicist’s formula for entropy, which suggests that information is, in some sense, the opposite of entropy. That connection is no surprise if you think of entropy as ignorance. Let me explain. I described how entropy is a measure of disorder or randomness (see Box 3). Disorder is a collective property of large assemblages; it makes no sense to say a single molecule is disordered or random. Thermodynamic quantities like entropy and heat energy are defined by reference to enormous numbers of particles – for example, molecules of gas careering about – and averaging across them without considering the details of individual particles. (Such averaging is sometimes called a ‘coarse-grained view’.) Thus, the temperature of a gas is related to the average energy of motion of the gas molecules. The point is that whenever one takes an average some information is thrown away, that is, we accept some ignorance. The average height of a Londoner tells us nothing about the height of a specific person. Likewise, the temperature of a gas tells us nothing about the speed of a specific molecule. In a nutshell: information is about what you know, and entropy is about what you don’t know.

As I have explained, if you toss a fair coin and look at the outcome you acquire precisely one bit of information. So does that mean every coin ‘contains’ precisely one bit of information? Well, yes and no. The answer ‘the coin contains one bit’ assumes that the number of possible states is two (heads or tails). That’s the way we normally think about tossed coins, but this additional criterion isn’t absolute; it is relative to the nature of the observation and the measurements you choose to make. For example, there is a lot of information in the figure of the ‘head’ on the heads side of the coin (same goes for the tails side). If you were an enthusiastic numismatist and had no prior knowledge of the country from which the coin came, or the year, then your quantity of relevant ignorance (‘Whose image is on the heads side of the coin?’) is much greater than one bit: it is perhaps a thousand bits. In making an observation after tossing heads (‘Oh, it’s King George V on a 1927 British coin’), you acquire a much greater quantity of information. So the question ‘How many bits of information does a coin have?’ is clearly undefined as it stands.

The same issue arises with DNA. How much information does a genome store? Earlier, I gave a typical answer (more than the Library of Congress). But implicit in this result is that DNA bases come in a four-letter alphabet – A, T, C, G – implying a one in four chance of guessing which particular base lies at a given location on the DNA molecule if we have no other knowledge of the sequence. So measuring an actual base yields 2 bits of information (log₂4 = 2). However, buried in this logic is the assumption that all bases are equally probable, which may not be true. For example, some organisms are rich in G and C and poor in A and T. If you know you are dealing with such an organism, you will change the calculation of uncertainty: if you guess G, you are more likely to be right than if you go for A. Conclusion: the information gained by interrogating a DNA sequence depends on what you know or, more accurately, on what you don’t know. Entropy, then, is in the eye of the beholder.fn8
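The effect is easy to quantify with the same weighted-average formula used for the loaded coin. The base frequencies below are invented purely for illustration; real GC-rich organisms have their own particular biases.

```python
# Information gained per base, for equal base frequencies versus a hypothetical
# GC-rich organism, using Shannon's formula -sum(p * log2(p)).
import math

def shannon_bits(probabilities):
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

equal_bases = [0.25, 0.25, 0.25, 0.25]     # A, T, C, G equally likely
gc_rich     = [0.10, 0.10, 0.40, 0.40]     # illustrative GC-rich bias

print(f"{shannon_bits(equal_bases):.2f}")  # 2.00 bits per base
print(f"{shannon_bits(gc_rich):.2f}")      # 1.72 bits per base: knowing the bias, you learn less
```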

The upshot is that one cannot say in any absolute way how much information there is in this or that physical system.3 It is certainly possible to say, however, how much information has been acquired by making a measurement: as stated, information is the reduction in the degree of ignorance or uncertainty about the system being measured. Even if the overall degree of ignorance is ambiguous, the reduction in uncertainty can still be perfectly well defined.

A LITTLE KNOWLEDGE IS A DANGEROUS THING

If information makes a difference in the world, how should we view it? Does it obey its own laws, or is it simply a slave to the laws that govern the physical systems in which it is embedded? In other words, does information somehow transcend (even if it doesn’t actually bend) the laws of physics, or is it merely, to use the jargon, an epiphenomenon, riding on the coat-tails of matter? Does information per se actually do any work, or is it a mere tracer of the causal activity of matter? Can information flow ever be decoupled from the flow of matter or energy?

To address these questions we first have to find a link between information and physical laws. The first hint of such a link was already there with Maxwell’s demon, but it was left as unfinished business until the 1920s. At that time Leo Szilárd, a Hungarian Jew living in Berlin, decided to update Maxwell’s thought experiment in a way that made it easier to analyse.fn9 In a paper entitled ‘On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings’,4 Szilárd simplified Maxwell’s set-up by considering a box containing only a single molecule (see Fig. 4). The end walls of the box are placed in contact with a steady external heat source, which causes them to jitter about. When the trapped molecule strikes a jittering wall, energy is exchanged: if the molecule is moving slowly, it will most likely receive a kick from the wall that speeds it up. If the temperature of the external heat source is raised, the walls will shake harder and the molecule will on average end up going even faster, having bounced off the more vigorously fluctuating walls.fn10 Like Maxwell, Szilárd incorporated a demon and a screen in his (admittedly highly idealized) thought experiment, but he did away with the hole and the shutter mechanism. Instead, Szilárd’s demon can effortlessly slide the screen into and out of the box, either at the mid-point or at the two end walls (there would need to be slots for this). Furthermore, the screen is free to slide back and forth inside the box (without friction). The entire device is known as Szilárd’s engine.

Starting with the screen out, the demon is tasked with determining which side of the box the molecule is located in. The demon inserts the moveable screen at the mid-point of the box, dividing it in two. Next comes the key step. When the molecule strikes the screen it gives it a little push. Because the screen is free to move it will recoil and so gain energy; conversely, the molecule will lose energy. Though these little molecular knocks will be small by human standards, they can (theoretically) be harnessed to do useful work by raising a weight. To accomplish this, the weight has to be tethered to the screen on the side of the box that contains the molecule; otherwise the weight will fall, not rise (see Fig. 4c). Because the demon knows where the molecule is located, it also knows which side to attach the tether (the attachment can also, in principle, be done with negligible energy expenditure). Thus armed with this modicum of knowledge, i.e. positional information, the demon succeeds in converting some of the random thermal energy of the molecule into directed useful work. The demon can wait until the screen has been driven all the way to the end of the box, at which point it can detach the tether, lock the weight in place and slide the screen out of the box at the end slot (all steps that, again in principle, require no energy). The molecule can readily replenish the energy it expended in raising the weight when it collides again with the jittering walls of the box. The entire cycle may then be repeated.fn11 The upshot will once again be the steady transfer of energy from the heat bath to the weight, converting heat into mechanical work with 100 per cent efficiency, placing the entire basis of the second law of thermodynamics in grave danger.
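The chapter doesn’t quote a figure, but the standard result – stated here as my own aside – is that the most work a Szilárd engine can deliver per cycle is kT ln 2: the work done by a one-molecule ‘gas’ pushing the frictionless screen from half the box to the whole box at constant temperature. A quick numerical check:

```python
# Work extracted when a one-molecule gas at temperature T expands isothermally
# from half the box volume to the full volume: the integral of (k*T/V) dV,
# which equals k*T*ln(2).
import math

k_B = 1.380649e-23    # Boltzmann constant, J/K
T = 300.0             # roughly room temperature, K

V0 = 1.0              # box volume in arbitrary units (it cancels out)
steps = 100_000
dV = (V0 / 2) / steps
work = sum(k_B * T / (V0 / 2 + (i + 0.5) * dV) * dV for i in range(steps))

print(work)                    # ~2.87e-21 J from the numerical integral
print(k_B * T * math.log(2))   # the analytic value, k*T*ln(2)
```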

Fig. 4. Szilárd’s engine. A box contains a single gas molecule, which can be found in either the right or the left part of the box. (a) Initially, the position of the molecule is unknown. (b) The demon inserts a partition in the centre of the box, and then observes whether the molecule is in the right or the left. (c) Remembering this information, the demon attaches a weight to the appropriate side of the partition. (If the molecule is in the right, as shown, the demon connects the load to the right of the partition.) (d) The molecule, which is moving at high speed due to its thermal energy, collides with the partition, driving it to the left and lifting the weight. In this manner, the demon has converted the random energy of heat into ordered work by using information about the location of the molecule.

If that were the whole story, Szilárd’s engine would be an inventor’s dream. Needless to say, it is not. An obvious question hangs over the demon’s remarkable faculties. For a start, how does it know where the molecule is? Can it see? If so, how? Suppose the demon shines light into the box to illuminate the molecule; there will inevitably be some unrecoverable light energy that will end up as heat. A rough calculation suggests that the information-garnering process negates any advantage for the demon’s operation. There is an entropy price to be paid for trying to go against the second law, and Szilárd concluded, reasonably enough, that the price was the cost of measurement.

THE ULTIMATE LAPTOP

And there the matter might have rested, had it not been for the emergence of a completely different branch of science – the computer industry. While it is true that the demon has to acquire the information about the location of the molecule, that’s just the first step. That information has to be processed in the demon’s diminutive brain, enabling it to make decisions about how to operate the mechanism in the appropriate manner.

When Szilárd invented his engine, information technology and computers lay more than two decades in the future. But by the 1950s general purpose digital computers of the sort we are familiar with today (such as the one on which I am typing this book) were advancing in leaps and bounds. A leading business propelling this effort was IBM. The company set up a research facility in upstate New York, recruited some of the brightest minds in mathematics and computing and charged them with the task of discovering ‘the laws of computing’. Computer scientists and engineers were eager to uncover the fundamental principles that constrain exactly what can be computed and how efficiently computing can be done. In this endeavour, the computer scientists were retracing similar steps to nineteenth-century physicists who wanted to work out the fundamental laws of heat engines. But this time there was a fascinating refinement. Because computers are themselves physical devices, the question arises of how the laws of computing mesh with the laws of physics – especially the laws of thermodynamics – that govern computer hardware. The field was ripe for the resurrection of Maxwell’s demon.

A pioneer in this formidable challenge was Rolf Landauer, a German-born physicist who also fled the Nazis to settle in the United States. Landauer was interested in the fundamental physical limits of computing. It is a familiar experience when using a laptop computer on one’s lap that it gets hot. A major financial burden of computing has to do with dissipating this waste heat, for example with fans and cooling systems, not to mention the cost of the electricity to pay for it all. In the US alone, waste heat from computers drains the GDP by $30 billion a year, and rising.fn12

Why do computers generate heat? There are many reasons, but one of them goes to the very heart of what is meant by the term ‘computation’. Take a problem in simple arithmetic, like long division, that can also be carried out with a pencil and paper. You start with two numbers (the numerator and the denominator) and end up with one number (the answer) plus some procedural scribbles needed to get there. If all you are interested in is the answer – the ‘output’, in computer jargon – then the input numbers and all the intermediate steps can be thrown away. Erasing the preceding steps makes the computation logically irreversible: you can’t tell by looking at the answer what the question was. (Example: 12 might be 6 x 2 or 4 x 3 or 7 + 5.) Electronic computers do the same thing. They take input data, process it, output the answer, and (usually only when memory needs freeing up) irreversibly erase the stored information.

Acts of erasure generate heat. This is familiar enough from the long-division example: removing the pencil workings with a rubber eraser involves a lot of friction, which means heat, which means entropy. Even sophisticated microchips generate heat when they get rid of 1s and 0s.fn13 What if one could design a computer that processed information without producing any heat at all? It could be run at no cost: the ultimate laptop!5 Any company that achieved such a feat would immediately reign supreme in the computing business. No wonder IBM was interested. Sadly, Landauer poured cold water on this dream by arguing that when the information processed in a computer involves logically irreversible operations (as in the arithmetic example above), there will inevitably be heat dissipated when the system is reset for the next computation. He calculated the minimum amount of heat that must be generated to erase one bit of information, a result now known as the Landauer limit. For the curious, erasing one bit of information at room temperature generates about 3 × 10⁻²¹ joules, about a hundred trillion trillionth of the heat energy needed to boil a kettle. That’s not a lot of heat, but it establishes an important principle. By demonstrating a link between logical operations and heat generation, Landauer found a deep connection between physics and information, not in the rather abstract demonic sense of Szilárd, but in the very specific (that is, dollar-related) sense in which it is understood in today’s computing industry.6
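Again, the numbers are easy to reproduce. The sketch below computes the Landauer limit, kT ln 2, and compares it with my own rough estimate of the energy needed to boil a kettle (about a litre of water heated from 20°C to 100°C):

```python
# Landauer limit: minimum heat released when one bit is erased at temperature T,
# compared with a rough estimate of boiling a kettle of water.
import math

k_B = 1.380649e-23                 # Boltzmann constant, J/K
T = 300.0                          # roughly room temperature, K
landauer = k_B * T * math.log(2)   # ~2.9e-21 J, i.e. about 3e-21 J

kettle = 4186 * 1.0 * 80           # specific heat of water (J/kg/K) * 1 kg * 80 K rise

print(f"Erasing one bit: ~{landauer:.1e} J")
print(f"As a fraction of boiling a kettle: {landauer / kettle:.1e}")   # ~1e-26
```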

From Landauer on, information ceased to be a vaguely mystical quantity and became firmly anchored in matter. To summarize this transformation in thinking, Landauer coined a now-famous dictum: ‘Information is physical!’7 What he meant by this is that all information must be tied to physical objects: it doesn’t float free in the ether. The information in your computer is stored as patterns on the hard drive, for example. What makes information a slippery concept is that the particular physical instantiation (the actual substrate) often doesn’t seem important. You can copy the contents of your hard drive onto a flash drive, or relay it via Bluetooth, or send it in laser pulses down a fibre or even into space. So long as it is done properly, the information stays the same when it is transferred from one variety of physical system to another. This independence of substrate seems to give information ‘a life of its own’ – an autonomous existence.

In this respect, information shares some of the properties of energy. Like information, energy can be passed from one physical system to another and, under the right conditions, it is conserved. So would one say that energy has an autonomous existence? Think of a simple problem in Newtonian mechanics: the collision of two billiard balls. Suppose a white ball is skilfully propelled towards a stationary red ball. There is a collision and the red ball flies off towards a pocket. Would it be accurate to say that ‘energy’ caused the red ball to move? It is true that the kinetic energy of the white ball was needed to propel the red ball, and some of this energy was passed on in the collision. So, in that sense, yes, energy (strictly, energy transfer) was a causative factor. However, physicists would not normally discuss the problem in these terms. They would simply say that the white ball hit the red ball, causing it to move. But because kinetic energy is instantiated in the balls, where the balls go, the energy goes. So to attribute causal power to energy isn’t wrong, but it is somewhat quixotic. One could give a completely detailed and accurate account of the collision without any reference to energy whatsoever.

When it comes to information, are we in the same boat? If all causal power is vested in the underlying matter in which information is instantiated, it might be regarded as equally quixotic, albeit convenient, to discuss information as a cause. So is information real, or just a convenient way to think about complex processes? There is no consensus on this matter, though I am going to stick my neck out and answer yes, information does have a type of independent existence and it does have causal power. I am led to this point of view in part by the research I shall describe in the next chapter involving tracking shifting patterns of information in networks that do indeed seem to obey certain universal rules transcending the actual physical hardware in which the bits are instantiated.

READING THE MIND OF THE DEMON

If Landauer’s limit is truly fundamental, then it must also apply to the information processed in the demon’s brain. Landauer didn’t pursue that line of inquiry himself, however. It took another IBM scientist, Charles Bennett, to investigate the matter, twenty years later. The prevailing view was still that the demon can’t violate the second law of thermodynamics because any entropy-reducing advantage gained by its antics is negated by the entropy-generating cost of sensing the molecules in the first place. But reflecting deeply on this matter led Bennett to suspect there was a flaw in the received wisdom. He worked out a way to detect the state of a molecule without generating any entropy at all.fn14 If the second law is to be saved, Bennett argued, then the compensating entropy cost must come from somewhere else. At first sight there was a ready answer: the irreversible nature of computation – the merging of numbers needed to output an answer. That would certainly produce heat if carried out directly. But even here Bennett found a loophole. He pointed out that all computations can in fact be made reversible. The idea is simple. In the pencil-and-paper example I gave it would be enough to merely keep a record of the input and all the intermediate steps in order to run the long division backwards. You could easily begin with the answer and finish by outputting the question, because everything you need is there on the paper. In an electronic computer the same thing can be achieved with specially designed logic gates, wired together to form circuits in such a way that all the information is retained somewhere in the system. With this set-up, no bits are erased and no heat is produced; there is no rise in entropy. I should stress that today’s computers are very far indeed from the theoretical possibility of reversible computation. But we are dealing here with deep issues of principle, and there is no known reason why the theoretical limit may not one day be approached.
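A concrete way to see what a reversible logic gate looks like is the Toffoli, or controlled-controlled-NOT, gate – my illustration here, not a description of Bennett’s actual construction. It computes an AND while keeping enough of its input that applying the gate a second time undoes it, so nothing is ever erased:

```python
# Toffoli gate: flips the third bit only when the first two bits are both 1.
# Because the inputs are carried through to the output, the operation is reversible.
def toffoli(a, b, c):
    return a, b, c ^ (a & b)

state = (1, 1, 0)
once = toffoli(*state)
twice = toffoli(*once)

print(once)             # (1, 1, 1): the AND of the first two bits now sits in the third
print(twice == state)   # True: applying the gate again restores the original input
```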

Now we are back to square one as far as the demon is concerned. If it can acquire information about molecules at negligible entropic cost, process the information reversibly in its tiny brain and effortlessly operate a shutter, then by repeating the process again and again the demon would be able to generate perpetual motion.

What’s the catch? There is one, and according to Bennett it is buried in the ‘again and again’ qualification.8 Let’s take stock: the demon has to process the acquired information to operate the mechanism correctly. The processing could, in principle, be carried out reversibly, producing no heat, but only if the demon retains all the intermediate steps of the computation in its memory. Fine. But if the demon repeats the trick, more information will be added, and yet more on the next cycle, and so on. Over time, the demon’s internal memory will inexorably clog up with bits of information. So the sequence of computations can be reversed only so long as there is enough memory space. To operate in a truly open-ended manner a finite demon needs to be brainwashed at the end of each cycle; that is, the demon’s memory must be cleared and its state reset to the one it had at the outset before it embarks on the next cycle. And this step proved to be the demon’s Achilles heel. Bennett showed that the act of information erasure generates just the right amount of entropy to pay for the apparent violation of the second law attained by the demon’s activities.

Nevertheless, the subject of the demon continues to attract dissent and controversy. For example, what happens if one has an endless supply of demons, so when one gets its brain clogged, another is substituted? Also, a more general analysis suggests that, for any demon, the sum of the entropy generated by observation and the entropy generated by erasure can never be less than the Landauer bound; the entropic burden can be distributed between observation and erasure in any mix.9 Many open questions remain.

INFORMATION ENGINES

The way I’ve described it, there’s still something a bit magical about how the demon operates as an intelligent agent. Surely it doesn’t have to be sentient, or even intelligent in the everyday IQ sense? It must be possible to substitute a mindless gadget – a demonic automaton – that would serve the same function. Recently, Christopher Jarzynski at the University of Maryland and two colleagues dreamed up such a gadget, which they call an information engine. Here is its job description: ‘it systematically withdraws energy from a single thermal reservoir and delivers that energy to lift a mass against gravity while writing information to a memory register’.10 Though hardly a practical device, their imaginary engine provides a neat thought experiment to assess the three-way mix of heat, information and work, and to help us discover their relative trade-offs.

Fig. 5. The three-way trade-off of information, heat energy and work. Maxwell’s and Szilárd’s demons process information to convert heat into work. Information engines do work by turning information into heat or by dumping entropy into an empty information register. Conventional engines use heat to do work and thereby destroy information (i.e. create entropy).

The Jarzynski contraption resembles a child’s plaything (see Fig. 6). The demon itself is simply a ring that can rotate in the horizontal plane. A central vertical rod is aligned with the axis of the ring, flanked by two fixed vertical rods that lie in the same plane. Attached to the central rod are paddles, perpendicular to the rod, which stick out at different angles, like a mobile, and can swivel frictionlessly on the rod. The precise angles don’t matter; the important thing is whether they lie on the near side or the far side of the plane of the three rods as shown. On the far side, they represent 0; on the near side, they represent 1. These paddles serve as the demon’s memory, which is just a string of digits such as 01001010111010 … The entire gadget is immersed in a bath of heat so the paddles randomly swivel this way and that as a result of the thermal agitation. However, the paddles cannot swivel so far as to flip 0s into 1s or vice versa, because the two outer vertical rods block the way. The show begins with all the blades above the ring set to 0, that is, positioned somewhere on the far side as depicted in the figure; this is the ‘blank input memory’ (the demon is brainwashed). The central rod and attached paddles now descend vertically at a steady pace, bringing each paddle blade one at a time into the ring and then exiting below it. So far it doesn’t look as if anything very exciting will happen. But – and this is an important feature – one of the outer vertical rods has a gap in it at the level of the ring, so as each blade passes through the ring it is momentarily free to swivel through 360 degrees. As a result, each descending 0 has a chance of turning into a 1.

Fig. 6. Design for an information engine. In this variant of Maxwell’s demon experiment, the central rod descends continuously through the ring. Two fixed vertical rods lie on either side of, and are co-planar with, the central rod. One of these has a gap at the level of the ring. Identical paddles attached to the central rod are free to swivel horizontally; their positions encode 0 or 1 respectively, depending on whether they lie on the far side or near side of the three rods as viewed in the figure. In the configuration shown, the initial state consists of all 0s. The horizontal ring serves as a simple demon. It can rotate freely in the horizontal plane. It has a single projecting blade designed so that it can be struck by the swivelling paddle blades sending it either clockwise or anticlockwise. The entire gadget is immersed in a bath of heat and so all components will experience random thermal fluctuations. Because of the gap in the left-hand rod there are more anticlockwise blows to the ring than clockwise (as viewed from above). The device therefore converts random thermal motion into directional rotation that could be used to raise a weight, but in so doing the output state of the blades (their configuration below the ring; none shown here) is now a random mixture of 1s and 0s. The machine has thus converted heat into work and written information into a register.

Now for the crucial part. For the memory to be of any use to the demon, the descending blades need to somehow interact with it (remember that, in this case, the demon is the ring) or the demon cannot access its memory. The interaction proposed by the designers is very simple. The demonic ring comes with a blade of its own which projects inwards and is fixed to the ring; if one of the slowly descending paddles swivels around in the right direction its blade will clonk the projecting ring blade, causing the ring to rotate in the same direction. The ring can be propelled either way but, due to the asymmetric configuration of the gap, there are more blows sending the ring anticlockwise than clockwise (as viewed from above). As a result, the random thermal motions are converted into a cumulative rotation in one direction only. Such progressive rotation could be used in the now familiar manner to perform useful work. For example, the ring could be mechanically coupled to a pulley in such a way that anticlockwise movement of the ring would raise a weight and clockwise movement lower it. On average, the weight will rise. (If all this sounds too complicated to grasp, there is a helpful video animation available.)11

So what happened to the second law of thermodynamics? We seem once more to be getting order out of chaos, directed motion from randomness, heat turning into work. To comply with the second law, entropy has to be generated somewhere, and it is: in the memory. Translated into descending blade configurations, some 0s become 1s, and some 0s stay 0s. The record of this action is preserved below the ring, where the two blocking rods lock in the descending state of the paddles by preventing any further swivelling between 0 and 1. The upshot is that Jarzynski’s device converts a simple ordered input state 000000000000000 … into a complex, disordered (indeed random) output state, such as 100010111010010 … Because a string of straight 0s contains no information, whereas a sequence of 1s and 0s is information rich,fn15 the demon has succeeded in turning heat into work (by raising the weight) and accumulating information in its memory. The greater the storage capacity of the incoming information stream, the larger the mass the demon can hoist against gravity. The authors remark, ‘One litre of ordinary air weighs less than half a US penny, but it contains enough thermal energy to toss a 7kg bowling ball more than 3m off the ground. A gadget able to harvest that abundant energy by converting the erratic movement of colliding molecules into directed motion could be very useful indeed.’12 And so it would. But just like Maxwell’s and Szilárd’s demons, Jarzynski’s demon can’t work repetitively without clearing the memory and erasing information, a step that inevitably raises the entropy.

In fact, Jarzynski’s engine can be run in reverse to erase information. If instead of straight 0s, the input state is a mixture of 1s and 0s (representing information), then the weight descends, and in so doing it pays for the erasure with its own gravitational potential energy. In this case, the output has more 0s than the input. The designers explain: ‘When presented with a blank slate the demon can lift any mass; but when the incoming bit stream is saturated [with random numbers] the demon is incapable of delivering work … Thus a blank or partially blank memory register acts as a thermodynamic resource that gets consumed when the demon acts as an engine.’13 This is startling. If erasing information increases entropy, then acquiring an empty memory amounts to an injection of fuel. In principle, this tabula rasa could be anything at all – a magnetic computer memory chip or a row of 0s on a paper tape. According to Jarzynski, 300 billion billion zeros could lift an apple, demonically, by one metre!
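As a rough check of that last claim – the apple’s mass, the temperature and the use of the Landauer value per bit are my assumptions – the arithmetic looks like this:

```python
# Energy budget from writing into 3e20 blank bits at k*T*ln(2) each, compared
# with the energy needed to lift a ~100 g apple through one metre.
import math

k_B = 1.380649e-23
T = 300.0                     # roughly room temperature, K
bits = 3e20                   # "300 billion billion" zeros

energy_budget = bits * k_B * T * math.log(2)   # ~0.86 J
apple_lift = 0.10 * 9.8 * 1.0                  # mass * g * height, ~0.98 J

print(energy_budget, apple_lift)   # comparable numbers, so the claim is about right
```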

The notion that the absence of something (a blank memory) can be a physical resource is reminiscent of the Infinite Improbability Drive in The Hitchhiker’s Guide to the Galaxy.14 But weird though it seems, it is the inevitable flip side to Charles Bennett’s analysis. No doubt the reader is nonplussed at this stage. Can a string of zeros really run an engine? Can information itself serve as a fuel, like petrol? And is this just a collection of mind games, or does it connect to the real world?

DEMONS FOR DOLLARS: INVEST NOW
IN APPLIED DEMONOLOGY

One hundred and forty years after Maxwell first floated the idea, a real Maxwell demon was built in the city of his birth. David Leigh and his collaborators at Edinburgh University published the details in a paper in Nature in 2007.15 For over a century the idea of anyone actually building a demon seemed incredible, but such have been the advances in technology – most significantly in nanotechnology – that the field of applied demonology has at last arrived.fn16

The Leigh group built a little information engine consisting of a molecular ring that can slide back and forth on a rod with stoppers at the end (like a dumbbell). In the middle of the rod is another molecule that can exist in two conformations, one that blocks the ring and one that allows it to pass over the blockage. It thus serves as a gate, similar to Maxwell’s original conception of a moveable shutter. The gate can be controlled with a laser. The system is in contact with surroundings that are maintained at a finite temperature, so the ring will jiggle randomly back and forth along the rod as a result of normal thermal agitation. At the start of the experiment it is confined to one half of the rod with its movement blocked by the ‘gate’ molecule set to the ‘closed’ position. The researchers were able to follow the antics of the ring and gate in detail and test that the system really is driven away, demon-like, from thermodynamic equilibrium. They confirmed that ‘information known to a gate-operating demon’ serves as a fuel, while its erasure raises entropy ‘in agreement with Bennett’s resolution of the Maxwell demon paradox’.16

The Edinburgh experiment was quickly followed by others. In 2010 a group of Japanese scientists manipulated the thermal agitation of a tiny polystyrene bead and announced, ‘We have verified that information can indeed be converted to potential energy and that the fundamental principle of the demon holds true.’17 The experimenters reported that they were able to turn information into energy with 28 per cent efficiency. They envisaged a future nano-engine that runs solely on ‘information fuel’.

A third experiment, performed by a group at Aalto University in Finland, got right down to the nano-scale by trapping a single electron in a tiny box just a few millionths of a metre across held at a low but finite temperature. The electron started out free to visit one of two locations – just as with the box in Szilárd’s engine. A sensitive electrometer determined where the electron resided. This positional information was then fed into a device that ramped up the voltage (a reversible operation with no net energy demand) so as to trap the electron in situ – analogous to the demon inserting the screen. Next, energy was slowly extracted from the electron’s thermal movement and used to perform work. Finally, the voltage was returned to its starting value, completing the cycle. The Finnish team carried out this experiment 2,944 times, attaining an average of 75 per cent of the thermodynamic limit of a perfect Szilárd engine. Importantly, the experiment is an autonomous Maxwell demon, ‘where only information, not heat, is directly exchanged between the system and the demon’.18 The experimenters themselves didn’t meddle in the process and, indeed, they didn’t even know where the electron was each time – the measurement and feedback control activity were entirely automatic and self-contained: no external agent was involved.
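
To put the ‘75 per cent’ figure in context: the thermodynamic limit for a perfect Szilárd engine is kT ln 2 of work per bit of positional information, so the reported result amounts to (a restatement of the text’s numbers, not a new measurement)

$$W_{\max} = k_B T \ln 2 \approx 0.69\,k_B T, \qquad \langle W_{\text{extracted}} \rangle \approx 0.75 \times k_B T \ln 2 \approx 0.52\,k_B T$$

per cycle at the temperature of the experiment.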

In a further refinement, the Finnish team coupled two such devices together, treating one as the system and the other as the demon. Then they measured the demonically extracted heat energy by monitoring the cooling of the system and the corresponding heating up of the demon. They touted this nanotechnological feat as the creation of the world’s first ‘information-powered refrigerator’. Given the pace of technological advancement, demonic devices of this sort will likely become available by the mid-2020s.fn17 Expect a big impact on the commercialization of nanotechnology, but probably a smaller impact on kitchen appliances.

ENGINES OF LIFE: THE DEMONS IN YOUR CELLS

‘Information is the currency of life.’

Christoph Adami19

Though Maxwell would doubtless have been delighted to see the advent of practical demonology, he could scarcely have guessed that the interplay of information and energy involved had already been exploited by living organisms for billions of years. Living cells, it turns out, contain a host of exquisitely efficient and well-honed nano-machines, made mostly from proteins. The list includes motors, pumps, tubes, shears, rotors, levers and ropes – the sort of paraphernalia familiar to engineers.

Here is one amazing example: a type of turbine consisting of two aligned rotors coupled by a shaft. (Its function in living cells is to play a role in energy transport and storage.) The rotor turns when protons (there are always plenty of them roaming around inside cells) flow through the machine in one direction; run backwards, it pumps protons in the reverse direction. A Japanese group conducted an ingenious experiment in which one of the rotors was extracted and anchored to a glass surface for study. They attached a molecular filament to the end of the shaft and tagged it with a fluorescent dye so it could be seen under a light microscope when a laser was shone on it. They were able to watch the rotor turn in discrete jumps of 120 degrees, one jump for each molecule of chemical fuel the isolated rotor consumed.20

Another tiny biomachine that has attracted a lot of attention is a sort of freight-delivery molecule called kinesin. It carries vital cargoes by walking along the tiny fibres which criss-cross cells. It does this gingerly – one careful step at a time – to avoid being swept away by the incessant bombardment from the thermally agitated water molecules that saturate all living cells and move twice as fast as a jetliner. One foot stays anchored to the fibre and the other comes from behind and sets down ahead; then the process is repeated with the other foot. The anchor points are where the binding forces between the foot and the fibre are especially propitious: those sites are 8 nanometres apart, so each step is 16 nanometres in length. It’s unnerving to think that billions of these little kinesin wonders are creeping around inside you all the time. Readers should check out an entertaining YouTube cartoon showing kinesin strutting its stuff.21 (Those who want more technical details should read Box 4.)

An obvious question arises: what makes this mindless molecular machine exhibit such patently purposeful progress? If it simply lifts a foot, then thermal agitation will propel it forwards and backwards at random. How does it plough doggedly ahead in the teeth of the relentless molecular barrage? The answer lies in the way that kinesin acts as a form of ratchet (one foot always anchored, remember). Molecular ratchets are a good example of demons, which are basically in the business of using information to convert random thermal energy into directional motion.fn18 But, to avoid falling foul of the second law, kinesin must tap into a power source.

Box 4: How kinesin can walk the walk

ATP – life’s miracle fuel – is converted into a related molecule called ADP (adenosine diphosphate) when it gives up its energy. ADP can be ‘recharged’ to ATP, so ATP is recycled rather than discarded when it has delivered its energy. ATP and ADP are critical to the operation of the kinesin walker. The kinesin has a little socket in the ‘heel’ of each foot shaped exactly so that an ADP molecule can fit into it snugly and bind to it. When the slot is thus occupied, the shape of the leg changes a little, causing the foot to detach from the fibre, when it is then free to move around. When the loose foot locates the next anchor point it releases the ADP from its slot, causing the foot to bind once more to the fibre. While this foot-loose activity is going on, the other foot (the one initially in front) had better hang onto the fibre: if both feet came free together, the kinesin molecule would drift away and its cargo would be lost. The other foot – now the back foot – will stay anchored so long as its own ADP slot remains empty. But will it? Well, the very same heel slot that binds ADP can also accommodate ATP. If a randomly passing ATP encounters the empty slot of the anchored back foot, it will snap into it. Then three things happen. First, the kinesin molecule deforms and reorients in such a way as to frustrate any attempt by passing ATPs to fill the now-empty slot of the front foot. Second, ATP contains stored chemical energy. In the slot it undergoes a chemical transformation ATP ADP and thereby releases its energy into the little kinesin machine. The resulting kick contributes to driving the machine, but also – thirdly – the conversion to ADP means that the back foot now contains an ADP molecule in its slot, as a result of which it detaches from the fibre and begins the process of walking forward so the cycle can be repeated.22

Let me digress a moment to explain the energetics here, as it is important more generally. Biology’s fuel of choice is a molecule called ATP (adenosine triphosphate); it’s like a mini-power-pack with a lot of punch, and it has the useful feature that it can keep its energy stored until needed, then – kerpow! Life is so fond of ATP fuel for its myriad nano-machines (like the abovementioned rotor, for example) that it’s been estimated some organisms burn through their entire body weight of the stuff in just one day.

Life uses many ratchets. The kinesin walker is one example designed to go forwards, not forwards and backwards equally. Looking at the physics of ratchets subject to thermal fluctuations leads to a clear conclusion. They work only if there is either a source of energy to drive them in one direction or active intervention by an information-processing system (demon). No fuel, or no demon, means no progress. Entropy is always generated: in the former case from the conversion of the driving energy into heat; in the latter from the entropy of information processing and memory erasure. There is no free lunch. But by ratcheting ahead instead of simply ‘jet-packing’ its cargo through the molecular barrage, kinesin runs up a much smaller lunch bill.

Box 5: Feynman’s ratchet

An attempt to replace Maxwell’s demon by a purely passive device was made by Richard Feynman. He considered a ratchet of the sort employed by mechanical clockwork (needed so the hands don’t turn anticlockwise; see Fig. 7). It involves a toothed wheel with a spring-loaded pawl to stop the wheel slipping backwards. Critical to the operation of the ratchet is the asymmetry of the teeth: they have a steep side and a shallow side. This asymmetry defines the direction of rotation; it is easier for the pawl to slide up the shallow edge of a tooth than the steep edge. Feynman then wondered whether, if the ratchet were immersed in a heat bath maintained at uniform temperature, random thermal fluctuations might occasionally cause the wheel to advance in the forward direction (clockwise in the diagram) but not in the reverse direction. If the ratchet were attached to a rope, it would be able to lift a weight, thus doing useful work powered only by heat. Not so. The flaw in the argument rests with the spring-loaded pawl. In thermodynamic equilibrium it too will jiggle about due to thermal fluctuations, sometimes causing the wheel to slip back the wrong way. Feynman calculated the relative probabilities of forward and reverse motion of the ratchet wheel and argued that, on average, they balanced out.26

Fig. 7. Feynman’s ratchet. In this thought experiment gas molecules bombard the vanes, causing the shaft to rotate randomly clockwise or anticlockwise. If it jerks clockwise, the ratchet permits the shaft to turn, thus raising the weight. But if the shaft tries to rotate anticlockwise, the pawl blocks it. Thus the device seems to convert the heat energy of the gas into work, in violation of the second law of thermodynamics.

Now for an arresting postscript. There’s another make of walker called dynein that, in what seems like a fit of designer madness, walks the other way along the same fibres as kinesin. The inevitable encounters between kinesin and dynein lead to occasional episodes of road rage and call for some deft manoeuvring. There are even road blocks stationed along the fibres requiring side-stepping and other molecular dances. Yet biology has solved all these problems with remarkable ingenuity: by exploiting demonic ratcheting, kinesin operates at an impressive 60 per cent efficiency in regard to its ATP fuel consumption. (Compare a typical car engine, which is about 20 per cent efficient.)

The core of life’s machinery revolves round DNA and RNA, so it’s no surprise that nature has also honed the minuscule machines that attend them to also operate at high thermodynamic efficiency. One example is an enzyme known as RNA polymerase, a tiny motor whose job is to crawl along DNA and copy (transcribe) the digital information into RNA, letter by letter. The RNA strand grows as it goes, adding matching letters at each step. It turns out that this mechanism comes very close to the theoretical limit of being a Maxwell demon, consuming almost no energy. We know it isn’t quite 100 per cent accurate because there are occasional errors in the transcript (which is good: remember, errors are the drivers of Darwinian evolution). Errors can be corrected, however, and they mostly are. Life has devised some amazingly clever and efficient ways to read the RNA output and fix up goofs.fn19 But in spite of all this striving to get it right, there is a very basic reason why RNA error correction can’t be perfect: there are many ways for the transcription to be wrong but only one way for it to be right. As a result, error-correction is irreversible; you can’t infer the erroneous sequence from the corrected sequence. (This is another example of not being able to deduce the question from the answer.) Logically, then, the error-correcting process merges many distinct input states into a single output state, which, as we know from Landauer’s work, always carries an entropy cost (see here).

A different demonic motor springs into action when the cell divides and DNA is duplicated. Called DNA polymerase, its job is to copy one DNA strand into a new daughter strand, which again is built one letter at a time as the motor crawls along. It typically moves at about 100 base pairs per second and, like RNA polymerase, it operates close to thermodynamic perfection. In fact, it is possible to run this mechanism in reverse by the simple expedient of tensioning the DNA. In the lab this can be done with devices called optical tweezers. As the tension increases, so the enzyme crawls more slowly until, at a tension of about 40 pN, it stops completely. (pN stands for ‘pico-newton’, a trillionth of a newton; the newton is the standard unit of force, named after the great man himself.) At higher tensions, the tiny motor goes backwards and undoes its handiwork letter by letter.23
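
A rough back-of-envelope figure helps make sense of that stall force. Taking the standard rise of about 0.34 nm per base pair – a textbook value, not one quoted here – the work the motor must do against a 40 pN tension for each letter it adds is roughly

$$W \approx 40\,\mathrm{pN} \times 0.34\,\mathrm{nm} \approx 14\,\mathrm{pN\,nm} \approx 3\,k_B T,$$

a few units of thermal energy per letter – the same modest energy scale on which all of these molecular machines operate.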

Copying DNA is of course only a small part of the process of reproduction whereby a cell divides into two. An interesting engineering question is how much the whole cell-reproduction process costs, energy/entropy-wise. Jeremy England of MIT analysed this topic with bacteria,24 which hold the world record for rapid reproduction (twenty minutes going flat out). Given what I have explained about heat and entropy, the question arises of whether or not the bacteria grow hot as a result. Well, they do, but not as hot as you might imagine from all that pushing, pulling and molecular rearrangement. According to England, E. coli generate only about six times the heat of the theoretical minimum limit set by thermodynamics, so they are almost as efficient on the cellular level as they are on the nano-machine level.fn20

How can we explain the astonishing thermodynamic efficiency of life? Organisms are awash with information, from DNA up to social organization, and it all comes with an entropy cost. No surprise, then, that evolution has refined life’s information-management machinery to operate in a super-efficient manner. Organisms need to have perfected the art of storing and processing information or they would quite simply cook themselves to death with waste heat.

Though life’s nano-machines, on the face of it, obey the same laws of physics as familiar macroscopic machines, the environment in which they operate is starkly different. A typical mammalian cell may contain up to 10 billion proteins, which places them on average only a few nanometres apart. Every nano-machine is continually buffeted by the impact of high-speed water molecules, which make up much of the cell’s mass. Conditions resemble those in a rowdy nightclub: crowded and noisy. The little machines, unless anchored, will be knocked all over the place. Such mayhem might seem like a problem for the smooth running of the cellular machinery, but it can also be a positive advantage. After all, if the interior of the cell were frozen into immobility, nothing at all would happen. But there is a downside to the incessant thermal disruption: life must expend a lot of effort repairing the damage and rebuilding disintegrating structures.

One way to think about thermal noise is in terms of the average energy of molecular motion. Big molecules such as proteins move more slowly than water molecules, but as they are much more massive (a typical protein weighs as much as 10,000 water molecules) they carry about the same amount of energy. Thus, there is a natural unit of energy at any given temperature: at room temperature it is about 4 × 10⁻²¹ joules. This is roughly the energy carried by a typical molecule. It also happens to be about the same as the energy needed to deform the shapes of important molecular structures like kinesin and, furthermore, the energy needed to unravel or fracture molecules. Much of life’s machinery thus teeters on the edge of heat destruction. Again, this seems like a problem, but it is in fact vital. Life is a process, and the disruption wrought by the unrelenting molecular clamour provides an opportunity for rearrangement and novelty. It also makes conversion between one form of energy and another easy. For example, some biological nano-machines turn electrical energy into motion; others turn mechanical energy into chemical energy.
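
A minimal calculation makes the ‘same energy, different speed’ point concrete; the protein mass is the rough 10,000-water-molecules figure from the text, the rest is standard physics:

```python
import math

k_B = 1.380649e-23            # Boltzmann's constant, J/K
T = 300.0                     # room temperature, K
m_water = 18 * 1.66054e-27    # mass of a water molecule, kg (18 atomic mass units)
m_protein = 10_000 * m_water  # a typical protein, per the text

# The natural thermal unit is kT; by equipartition the average translational
# kinetic energy is (3/2) kT for ANY molecule, however massive -- only the
# typical speed changes with mass.
print(f"kT at room temperature:         {k_B * T:.1e} J")
print(f"average kinetic energy (3/2)kT: {1.5 * k_B * T:.1e} J")

v_water = math.sqrt(3 * k_B * T / m_water)      # root-mean-square speed
v_protein = math.sqrt(3 * k_B * T / m_protein)
print(f"thermal speed, water molecule:  {v_water:.0f} m/s")   # several hundred m/s
print(f"thermal speed, typical protein: {v_protein:.0f} m/s") # 100 times slower
```

The water molecules come out at several hundred metres per second – the jetliner-beating pace mentioned earlier – while a protein 10,000 times heavier ambles along 100 times more slowly, yet carries the same average kinetic energy.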

The reader might be wondering why so many vital processes take place invisibly, on a nanoscale, under such trying and extreme conditions. The aforementioned coincidence of energy scales provides a ready answer. For life as we know it, liquid water plays a critical role, and that brackets the temperature range at which biology can operate. It turns out to be only at the nano-scale that the thermal energy in this temperature range is comparable to the chemical and mechanical energy of the biological machinery, and thus able to drive a wide range of transformations.25

BEYOND THE BIT

Living organisms, we now know, are replete with minuscule machines chuntering away like restless Maxwell demons, keeping life ticking over. They manipulate information in clever, super-efficient ways, conjuring order from chaos, deftly dodging the strictures of thermodynamics’ kill-joy second law. The biological information engines I have described, and their technological counterparts, involve simple feedback-and-control loops. Although the actual molecules are complex, the logic of their function is simple: just think of kinesin, tirelessly working away at the ‘molecular coalface’.

The cell as a whole is a vast web of information management. Consider, for example, the coded information on DNA. Making proteins is a complicated affair, over and above the mRNA transcription step. Other proteins have to attach the right amino acids to strands of transfer RNA, which then bring them to the ribosome for their cargoes to be hooked together on cue. Once the chain of amino acids is completed, it may be modified by yet other proteins in many different ways, which we’ll explore in Chapter 4. It must also fold into the appropriate three-dimensional structure, assisted by yet more proteins that chaperone the flexible molecule during the folding process. All this exquisite choreography has to work amid the thermal pandemonium of the cell.

On its own, the information in the gene is static, but once it is read out – when the gene is expressed as the production of a protein – all manner of activity ensues. DNA output is combined with other streams of information, following various complex pathways within the cell and cooperating with a legion of additional information flows to produce a coherent collective order. The cell integrates all this information and progresses as a single unit through a cycle with various identifiable stages, culminating in cell division. And if we extend the analysis to multicelled organisms, involving the astounding organization of embryo development, then we are struck even more forcibly that simply invoking ‘information’ as a bland, catch-all quantity, like energy, falls far short of an explanation for what is going on.

This is where Shannon’s definition of information, important though it is, fails to give a complete account of biological information. It is deficient in two important respects:

  1. Genetic information is contextual. Shannon himself was at pains to point out that his work dealt purely with transmitting bits of information defined in the most economical manner and had nothing to say about the meaning of the message encoded. The quantity of Shannon information is the same whether a DNA sequence encodes instructions to build a protein or is just arbitrary ‘junk’ DNA. But the consequences for biological functionality are profound: a protein will fulfil a vital task; junk will do nothing of significance. The difference is analogous to Shakespeare versus a random jumble of letters (the short calculation after this list makes the point concrete). For genetic information to attain functionality, there has to be a molecular milieu – a global context – that recognizes the instructions and responds appropriately.
  2. Organisms are prediction machines. At the level of the organism as a whole, information is gathered from an unpredictable and fluctuating environment, manipulated internally and an optimal response initiated. Examples include a bacterium swimming towards a source of food and ants exploring their surroundings to choose a new nest. This process has to work well or the consequences are lethal. ‘Organisms live and die by the amount of information they acquire about their environment,’ as Andreas Wagner expresses it.27 Being a good prediction machine means having the ability to learn from experience so as to better anticipate the future and make a smart move. To be efficient, however, a predictive system has to be choosy about what information it stores; it would be wasteful to remember everything. All this requires some sort of internal representation of the world – a type of virtual reality – incorporating sophisticated statistical assessments.28 Even a bacterium is a wiz at mathematics, it seems.
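
To see point 1 in miniature, here is a tiny illustrative calculation (the ‘gene’ string is made up for the example). Shannon’s measure depends only on how often each letter appears, so a sequence and a random shuffle of the very same letters – one potentially meaningful, the other certainly not – score identically:

```python
import math
import random
from collections import Counter

def shannon_bits_per_symbol(seq):
    """Shannon entropy of the empirical letter frequencies, in bits per symbol."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A made-up 'gene-like' sequence and a shuffled (meaningless) version of it.
gene = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGA"
jumble = "".join(random.sample(gene, len(gene)))

print(shannon_bits_per_symbol(gene))    # identical values: the measure is blind
print(shannon_bits_per_symbol(jumble))  # to the ordering -- and hence to the meaning
```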

Summarizing these higher functions, we might say that biological information is not merely acquired, it is processed. Shannon’s information theory can quantify the number of bits in a cell or an entire organism, but if the name of the game is information processing, then we need to look beyond mere bits and appeal to the theory of computation.

Living organisms are not just bags of information: they are computers. It follows that a full understanding of life will come only from unravelling its computational mechanisms. And that requires an excursion into the esoteric but fascinating foundations of logic, mathematics and computing.