Lost Illusions, or From Intellectronics to Informatics

1

The disillusionment with cybernetics in the two decades since its birth has had both practical and theoretical causes. Since the theoretical causes are more fundamental and also more difficult to articulate, I start with them. True, the “fathers” of cybernetics—Wiener, Shannon, and von Neumann—warned at the very outset against the excessive optimism of treating cybernetics as a universal key to knowledge. But they themselves could not always avoid slipping into just such optimism.

The maximum epistemological program of cybernetics declared—nota bene more in popular accounts than in the scientific literature—that a new language, a new system of abstraction, and a new level of generalization had arisen, allowing for the unification of the natural sciences and the humanities, which until then had been separated by an impassable barrier (biology, geology, and physics on one side; anthropology, psychology, linguistics, sociology, and even literary studies on the other). Cybernetics could accomplish this because it had at its disposal models of such a high level of abstraction that they applied to a great variety of phenomena across the diverse sciences while preserving the identity of its central concepts—information, its sender, its receiver, and its transmission channel; a system equipped with “inputs” and “outputs”; negative and positive feedback; system trajectories determined by transformation matrices; and so on. These mathematically defined concepts were supposed to become the common denominator of all disciplines and to allow rigorous research in areas that had not previously been accessible to exact methods.

This promise was not fully realizable from the start, for two reasons. The first relates to the inadequacies of cybernetics itself—about which more later. The second has to do with the difficulties that have plagued twentieth-century mathematics. Historically, cybernetics privileged and adopted into its methodological arsenal disciplines that did not come from the mainstream of pure mathematics but developed somewhat alongside it: the theory of probability, the basis of Shannon’s information theory, which pure mathematicians long treated like a stepchild, and the theories of algorithms and of systems, which, not having achieved full formalism, operate with concepts that a mathematician regards with suspicion. Lacking expert knowledge, we may resort to a metaphor: the mathematical foundation on which cybernetics was set was not completely solid. Probability theory and algorithm theory (in contrast with systems theory, so important for cybernetics, which is still more wishful thinking or a set of sketches and propositions than a true self-standing discipline),1 both possess clearly defined centers and peripheries swarming with unresolved questions and doubts. Attempts to expand the centers have led to enormous difficulties relating to the ambiguity of the two central terms, “probability” and “algorithm.” In both these disciplines we can do either very little but with absolute certainty, or a great deal but with little certainty. Yet the theories of probability and algorithms alone were insufficient for developing cybernetics, and it was expected that pure mathematics would lend a helping hand; von Neumann, among others, expressed such a hope, seeing the inadequacy of the solid but not sufficiently powerful combinatorial procedures stemming from Boolean algebra, which had been resurrected and was suddenly fashionable.2 Unfortunately, help did not come. Besides, mathematics had a hovering cloud of its own curses to contend with, related to infinity: physics, especially classical physics, was based on infinitesimal calculus, but an infinitesimal apparatus embedded in set theory is of no use to cybernetics, whose systems and state repertoires are finite. This clash between the finite and the infinitesimal has led to fundamental questions about the nature of cybernetics. Is it a branch of pure mathematics, or is it a mathematically interpreted modeling arena of physics?

This question is not as speculatively barren as it may seem. As a part of physics, cybernetics would have an empirical origin and, constituting a theory or set of theories, would necessarily be subject to experimental testing. As a part of mathematics, cybernetics would be a generator of models and structures that are true by definition, provided they were formed in a consistent manner, and the question of the epistemological fruitfulness of its applications outside mathematics would be a separate issue, rather peripheral to a cyberneticist. We note that the cyberneticists themselves could not make up their minds regarding the status of their discipline. A Freudian historian of science might say that they were not willing to give up either the deductive truthfulness that is the hallmark of mathematics or the instrumental fruitfulness that characterizes the natural sciences, and thus it was their “subconscious” that rendered the act of classification difficult for them. The “fathers” were all mathematicians by education, but only Wiener was close to pure mathematics; von Neumann was involved in a little of everything—quantum mechanics, the theories of automata and information, chemistry, biology, and even neurophysiology—while Shannon was a communications engineer.

The links were unclear, and still are, between cybernetics in the exact sense—which includes the theory of systems that possess input/output and feedback, along with all possible variations in the areas of homeostasis, self-organization, etc.—and the relatively independent fields, such as Shannon’s information theory, and those with even looser ties to cybernetics, such as the theories of dynamic programming, decision-making, and organization. From the beginning, the main criticism of cybernetics was that it did not uncover anything new but only translated into its own language systems and processes already well described in other languages; that cybernetics was, in other words, sterile. It is true that in many disciplines the conceptual apparatus of cybernetics proved unproductive. Cybernetics explains a lot—for example, in theoretical biology—but has led to no substantial discoveries. Not that its applications were wrong; they were only sometimes premature and sometimes unimportant, when, for instance, the dearth of appropriate objective data in a given science made it impossible to fully flesh out a newly introduced cybernetic model.

It later turned out that information theory was inadequate to have the universal application that had been postulated with so much fanfare and that the notion of information, even the asemantic information of the Shannon type, was extremely difficult to use rigorously beyond human-made communication systems. I will devote a few words to this important issue. The developmental problems of cybernetics were not limited to information concepts, essential as they were, because it was hard to talk with rigor about regulation and communication in a system or a machine without having adequate tools at hand for reliable measurement of this communication and regulation along with the information involved.

The greatest joy, almost as if at the discovery of a modern philosopher’s stone, was elicited by the equating of communications-theoretic information with thermodynamic entropy, which, as the “fathers” (e.g., von Neumann) claimed, built a bridge between logic and physics—for the first time in the history of knowledge. It was discovered that information is “negative entropy,” or “negentropy,” the reverse of entropy as a physical quantity that measures the “energy deterioration” or the probabilistically understood “degree of disorder” in a system (because the highest order is always thermodynamically the least probable state of a physical system). But the joy quickly faded when it turned out that the optimism had been greatly exaggerated.
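
To spell out the parallel in standard notation (a gloss added here, not part of the original argument): Shannon's measure of information and the Boltzmann-Gibbs formula for entropy are identical up to a constant factor, which is what licensed the slogan that information is negentropy.

```latex
H = -\sum_i p_i \log_2 p_i \qquad \text{(Shannon information, in bits)}
S = -k_B \sum_i p_i \ln p_i \qquad \text{(Boltzmann--Gibbs entropy)}
\Delta S = -\,k_B H \ln 2 \qquad \text{(gaining $H$ bits about a system lowers its entropy accordingly)}
```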

At first it was thought that the inadequacy of the physicalist concept of information was due to its asemanticity or the fact that Shannon’s theory did not consider the content of information. There were attempts to construct a semantic information theory (e.g., Carnap and Bar Hillel).3 But it quickly became clear that even the purely orthodox theory of information, the communications type derived from thermodynamics, suffered from inadequacies. This was discovered when researchers in different fields tried to estimate the information contained in a living organism, in an egg, a chromosome or gene set of a biocenotic population,4 or just to determine which contains more information, a zygote or the organism that arises from it. One of the most brilliant linguists, logicians, and informaticians (all at the same time), Yehoshua Bar Hillel, eventually showed that questions in the form “What is x?” (i.e., “What is information?”) are not permitted in science, that they have a metaphysical bent, as it were: they assume an “ultimate” answer, which would provide access to the “nature” of an “entity” like information or gravitation, to something permanent and unchangeable, something that science, constantly self-correcting and evolving, can never attain. His postulate has precedents. For example, there exist questions in quantum mechanics that are prima facie legitimate—that is, logically correct—but must not be asked nevertheless, as the answers to them would be self-contradictory and therefore worthless.

The radicalism of Bar Hillel’s view unavoidably brings crushing consequences: we cannot know what information really is, whether it belongs to the category of concepts like energy and mass or is a “special entity”; worse, we lose any sense of the scope of its applicability, because if information, operationally undefined, is not a reasonable measure of anything in biology or psychology, then it cannot be used beyond the narrow domain of communications engineering—transmitters, channels, and receivers—except in linguistics. Linguistics is saved for informatics because it can be treated combinatorially and probabilistically at the same time. Yet a full mathematization of linguistics is very difficult if not impossible, so we must satisfy ourselves with a heuristically legitimate approximation: for example, if we want to measure the frequency of certain letters in a natural language, we can obtain an answer with an arbitrary degree of precision but without a closed theoretical formula, because the answer depends on observing the speakers of the language, and we are back at the good old dichotomy between empirical, frequency-based probability and mathematical expectation.
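
As an aside, the kind of measurement meant here is easy to sketch (a minimal illustration of mine, not a tool from the essay's period): the frequencies are gathered empirically, and the precision grows with the size of the observed sample, with no closed formula replacing the observation.

```python
import math
from collections import Counter

def letter_entropy(text: str) -> float:
    """Estimate the Shannon entropy per letter (in bits) of a text sample.

    The frequencies are empirical: precision grows with the sample,
    but there is no closed theoretical formula independent of observation.
    """
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Toy usage; a real estimate would require a large corpus of the language.
sample = "the quick brown fox jumps over the lazy dog" * 100
print(f"{letter_entropy(sample):.3f} bits per letter")
```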

But limiting the informaticists, if they are linguists, to the study of natural languages would be damaging, as it would cut off any chance of a parallel investigation of that “other” language, the genetic code in biology. The empirically evident similarity between genetic messages and language utterances raised great epistemological hopes. It seemed possible to achieve a new type of generalization, in a synthesis of which the concept of information would play the leading role. It seemed that information, originating simultaneously from the fields of logic and thermodynamics, would enable a unified approach to both natural languages, which serve interpersonal communication, and effectory languages, i.e., self-realizing prognoses that are the chromosomal articulations of living organisms. To the empirical natural scientists, everything pointed to the common roots of these two classes of language, but the theories that were supposed to underpin this happy synthesis and open the door to future research proved to be too weak.

The time bomb thrown into this field of inquiry was the concept of complexity. The whole theory of systems is being built precisely to clarify this foggy notion, which has refused to submit to any precisification, especially of the mathematical kind. There were attempts to bridge the concepts of physical information and complexity so that a measure of information would automatically give the degree of complexity of the object being studied. Relics of those attempts can be found in terms like structural, metric, topological, and algorithmic information, which go beyond probability and combinatorics. Unfortunately, they all brought more disappointment than success. Let me explain in a brief, accessible, and therefore inevitably sketchy manner why the notions of information and complexity proved incompatible.
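
One way to feel the incompatibility (an illustration of mine, using compression as a crude, computable stand-in for algorithmic information, which is itself uncomputable): a random string carries maximal Shannon-style information yet no organized complexity, while a strictly periodic string carries almost none of either. The organized complexity of a genome lies where neither measure points.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    # Length after DEFLATE compression: a rough, computable upper-bound proxy
    # for algorithmic (Kolmogorov) information, which is itself uncomputable.
    return len(zlib.compress(data, 9))

random_bytes = os.urandom(10_000)   # maximal Shannon-style "information," zero organization
ordered_bytes = b"AB" * 5_000       # near-zero information, trivial order

print(compressed_size(random_bytes))   # about 10,000 bytes or slightly more: incompressible
print(compressed_size(ordered_bytes))  # a few dozen bytes: almost fully compressible
```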

To many, information suggests a subjectivity that is foreign to physics. The measure of information is always contingent on a repertoire of states: receiving information is equivalent to the act of choosing one state out of all possible states, which is why it is measured as the before-and-after difference in probabilities when a signal is received. The signal’s arrival is equivalent to a measurable decrease in our uncertainty about the state that the observed system actually occupies. The system is therefore treated as occupying one element of a finite, countable set of all possible states. For the linguist, there is nothing simpler than determining this set’s boundaries, because its elements are language utterances, which are always finite, and the amount of information can be determined exactly using combinatorics based on the number of signs (letters of the alphabet) that appear in the communication. Simple combinatorial operations allow us to perform the measurement. Whether the words are written horizontally or vertically, on a plane or on the surface of a sphere, or whether they are encoded into a material substrate such as paper or transmitted as phonemes through oscillations in the atmosphere, makes no difference. The amount of information in a printed sentence does not change when it is written with a stick in the sand or when the letters are shaped from clay, cast in metal, or carved into stone. But in the case of the genetic code, we have no idea how to separate the information from its substrate. A set of sentences in English and a set of “chromosomal sentences” are not amenable to the same accounting with regard to information content, because the relation between the emitter (the information source) and the receptor (receiver) is drastically different in the two. We know what the set of English sentences is, but, although we can assign the specific units of the genetic code—DNA—to specific letters of the Latin alphabet, we don’t know what exactly constitutes “a set of chromosomal sentences.” We can distinguish between a random bunch of letters and an English sentence but not between a random chain of “gene letters” and a “sentence” in the language of heredity. Yet our current inability to do so is not the only serious obstacle on the path to measuring information: sentences in a natural language are never true systems in the physical sense. The systemic nature of a language sentence lies not in its physicality but in its congruence with the rules of syntax, lexicography, and grammar. If we were to treat a printed sentence as a physical system and wanted to measure its thermodynamically understood information, we would find that the amount of order that the print contributes to the total entropic balance of the page is so minuscule as to be practically zero. From the thermodynamic point of view, a page sprinkled with printer’s ink and a page with a meaningful text printed on it are informationally, that is, entropically, almost identical. The reason is that one bit of information corresponds to 10⁻¹⁶ thermodynamic units of entropy.5 The amount of order in a physical system such as a sheet of paper is practically unchanged when we print letters on it, because the text may contain at most a few hundred bits of information, whereas the entropy of the sheet converted into bits is astronomically large (sextillions or quintillions of bits).
The thermodynamic-informational balance in any human-made object, for example, a digital machine with 100 million elements, which can contain at most 4 × 10⁹ bits corresponding to 4 × 10⁻⁷ thermodynamic units, is again insignificant. The situation changes when we consider a living system, whether it is a mature organism or just a germ cell, because the number of elements constituting such a system is on the order of 10¹²; therefore, the physical entropy of a cell already depends on the amount of information that it contains and that governs its behavior (e.g., the process of embryogenesis).
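
The arithmetic behind these figures can be reconstructed (my reconstruction, using Brillouin's conversion factor in cgs units):

```latex
1\ \text{bit} \;\leftrightarrow\; k_B \ln 2 \approx 1.38\times10^{-16} \times 0.693 \approx 10^{-16}\ \text{erg/K},
\qquad 4\times10^{9}\ \text{bits} \times 10^{-16} \approx 4\times10^{-7}\ \text{erg/K}.
```

By the same conversion, a sheet of paper, whose thermodynamic entropy is of rough order 10⁷ erg/K (about 1 J/K), corresponds to some 10²³ bits, which is why the few hundred bits of a printed text vanish in the balance.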

But insurmountable problems arise the moment we attempt to determine the amount of information in the chromosomal fiber, because the simple statistical procedure that we employed on the sentence composed of letters cannot be applied to the “sentence” made of the units of DNA (deoxyribonucleic acid). The topological properties of the printed sentence do not affect its information content, but the topology of the chromosomal fiber does. This is why we cannot equate the DNA units, that is, the genetic alphabet, with a linguistic alphabet. We should realize that the ease of the procedure inherent to the informatic treatment of language texts is simply the consequence of omitting all the connections that the perceived sentence makes with the perceiver’s brain. The sentence turns into a system of commands (directives) and, at the same time, into a substrate of a complex decision-making procedure in the brain, just as the chromosomal fiber is simultaneously a system of commands addressed to the cytoplasm of the zygote and, along with the cytoplasm, a substrate of the complex decision-making procedure that is embryonic development. But sentences of a natural language are always considered completely separate from all the operations of their interpretative reception, while sentences in the language of genetics do not exhibit such autonomy. Although we believe that this difference is not fundamental but due to technical aspects of the communication, it is precisely these technical aspects that prevent us from using the simple methods of statistics and probability theory that suffice for information accounting in linguistics. Biologists who repeatedly attempted to determine the information balance in cells and organisms ended up with wildly scattered results and, worse, with fundamental errors and misunderstandings in the concept of information, which often lost any physical, operationally justifiable meaning in the process.

For example, such calculations make it appear that the amount of information in a living organism is the same, or almost the same, as in a dead one, and that the amount of information in a zygote is smaller than in the mature organism that develops from it—which would justify the assertion that life runs “against” the entropic gradient and is therefore not subject to the laws of thermodynamics. These undoubtedly false claims rest on faulty reasoning that contradicts physics. We simply cannot investigate the processes of life under the same strong restrictions that are acceptable in the study of language sentences, which are isolates. What can be omitted in linguistics cannot be omitted in biology; doing so leads to nonsense. When Shannon asked what he should call the fundamental quantity of his theory, von Neumann suggested entropy, not only because of the mathematically identical formulas but also because, as he mischievously quipped, nobody really knew what entropy was. When, several years later, Leon Brillouin was writing his book about information in science,6 he called entropy the measure of our knowledge about a physical system—not simply the measure of “disorder” in it. This elicited protests and misunderstandings; many thought that Brillouin considered entropy, and therefore also information, a purely subjective measure, indicating what we know about the system rather than marking an objective feature of its state. So von Neumann was right.

The notion of entropy derives from investigations of the various states of a gas, particularly the perfect gas, and from statistical mechanics; its magnitude expresses our knowledge about a system in the sense that the system, e.g., the gas in a container, can exist in any one of innumerably many states that we are unable to distinguish, because we cannot determine the positions of all the gas molecules at the same time. Entropy therefore relates to all these indistinguishable states taken together at a given moment; because they are all equally probable, from the “entropic point of view” they are equivalent to a single macroscopic state. Hence entropy indeed relates to subjective knowledge, but knowledge coupled with a rigorously objective state of the system: when the system transitions into states that are progressively less probable and more ordered, its entropy decreases, and thus the number of different but equivalent configurations of gas molecules must decrease, even though we cannot measure it. Zero entropy, unattainable in principle (since uncertainty is unavoidable at the quantum level), would represent a state in which we know everything about the system (in a physical, i.e., spatial, not metaphysical sense) because it has reached a completely determined and, what is more, the only possible configuration of molecules.
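
In Boltzmann's formulation (standard physics, added here for reference), this reads:

```latex
S = k_B \ln W,
```

where W is the number of microscopic configurations that are indistinguishable to the observer yet compatible with the given macrostate. Ordering the system shrinks W, and S = 0 would correspond to W = 1, the single possible configuration, exactly the limit described above as unattainable.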

Chromosomes are highly ordered systems, since every particle in them must occupy a given position; their entropy is very small and their information content enormous, possibly approaching the maximum for a polymeric macromolecule. As we said, the order of the letters in a printed sentence in a natural language is thermodynamically almost zero compared with that of a “genetic sentence.” The communication efficiency of a natural language is a consequence of its “triggering” character: in the case of language articulations, the brain acts as an enormously powerful amplifier on the physical level. A sentence that has been uttered and heard, and whose entropic content is thermodynamically minuscule, sets off in the brain an avalanche of coordinated processes that enable the “understanding” of the tiny portion of entropy that was transmitted. The sentence thus acts as a trigger that initiates this multicascade amplification also in terms of energy: the “understanding” of the sentence by the brain requires an amount of energy that is gigantic compared with the thermodynamic balance of the sentence itself, although still small in absolute terms—the total power of the brain is ten to twenty watts. A “sentence” made of genes, however, is not merely a triggering device but an autoliberator, a self-effector, that initiates, organizes, and regulates the whole process of embryonic development, which would not be possible if the chromosomal fiber did not possess an extremely high degree of order at the start. Linguistic entropy is not thermodynamic entropy, because the systems studied by linguists are not physical: the embodiment of an utterance plays no part in the processes of language communication understood as information transmission. Sentences of a language are primers that trigger a highly ordered avalanche of brain processes, and the process of reception gradually acquires physical character, since the physical aspects of brain functioning cannot be ignored, in contrast with the physical aspects of the language articulation itself (i.e., whether it is printed, sculpted, engraved, etc.). This is what allows the use of simple, almost primitive measuring methods in linguistics, whereas similar measurements in genetics are enormously difficult: the physical aspect of a genetic “sentence” can never be omitted in determining its informational content. That is also why pure combinatorics, as a part of logical analysis, suffices in linguistics but is of no use in informationally understood genetics: the chemical, molecular, and quantum-topological aspects of a written sentence are irrelevant, but the chemical and quantum-topological aspects of a chromosome are the essential determinants of its order.

A sentence in a natural language is true if it is constructed according to the rules of lexicography, grammar, and syntax—though not necessarily of semantics, because the sentence “The safety pins spend the night unusually in the crater corkscrews” is linguistically correct although its meaning is questionable. A genetic “sentence” is “true” if it represents a system of directives aimed at achieving a certain final state of the system, which is a mature organism. Such a “sentence” cannot be syntactically correct yet prognostically (or effectorily, that is, teleologically) incorrect, since the syntax of the genetic code is embryogenesis itself. If a set of genes (DNA units) does not trigger embryogenesis, it is not considered a genome but merely a chain of DNA elements that is chemically possible but causatively, embryogenetically, barren. One could theoretically create all the combinatorially (and chemically) possible systems of DNA as a set of macromolecules with the size (length) of the real genomes found in nature. Their number would be on the order of 10³⁰⁰⁰, meaning that there are not enough electrons in the universe to embody this set. But technical impossibility aside, such an endeavor would be meaningless; the point is only to illustrate that the “genetic building blocks” can be combined even when they do not function as “genetic building blocks” at all. A biologist-geneticist does not want to measure “all the information” contained in genes (in molecules, atomic ensembles, electron clouds, quantum-mechanical systems, etc.) but only the part that actualizes embryogenesis. He is therefore interested not in “all bits” but only in the “biobits,” that is, the regulatory quanta of embryogenesis. But these “biobits” have little to do with the entropy of the linguists or of Shannon. This is the obstacle to the creation of a “generalized linguistics” whose special cases would be, on the one hand, all natural languages and, on the other, all genetic codes.
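
The combinatorics behind the 10³⁰⁰⁰ figure is easy to reconstruct (my arithmetic, under the text's assumptions): a chain of n units over the four-letter DNA alphabet admits 4ⁿ variants, so

```latex
4^{\,n} = 10^{3000} \;\Longrightarrow\; n = \frac{3000}{\log_{10} 4} \approx 5\times10^{3}\ \text{units},
```

and since the observable universe contains only some 10⁸⁰ elementary particles, no physical embodiment of such a set is conceivable even in principle.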

There were attempts to save the situation by altering the basic concepts of information theory, but all nonprobabilistic information theories share the disagreeable property of lacking an elegant, natural transition from their notion of information to the thermodynamic, physical notion of Shannon’s theory. At the same time, considering information to be an entity subjective in nature, an idea that some linguist-informaticists favor, is baseless: zygotes, embryogeneses, and transformations of genomes into organisms existed billions of years before the rise of human beings and their natural languages. It appears that the mathematical tools used to avert the crises are too simple (and theoretical biology cannot gain much from graph theory). I do not believe that pure mathematics is of any help here, because it can never lead to the concept of a threshold of minimum complexity of a system, which is essential to an understanding of the phenomena of life. So far our realization that the amount of information in an object is not a clearly defined function of its complexity is only intuitive, and attempts, such as Brillouin’s, to divide information into “free,” that is, lacking physical interpretation, and “bound,” that is, correctly reflecting the information content of an object in the physical sense, skirt the problem instead of solving it. Thus we understand only intuitively that a system capable of self-reproduction must exhibit a certain minimum complexity, below which it cannot function, regardless of its structure. (I touched on this matter, in a slightly different context, in an essay on value in biology published in Studia filozoficzne; it is included in this book.)

Apart from the failures in the field of the theory of knowledge—with its program to unify the various natural sciences elevated to a cybernetic “metalevel”—cybernetics suffered other fiascos. Many people believe that the notion of Shannonian information has been left “unfinished,” hanging in the air, as it were, while we must strive for a synthesis unifying it with other physical concepts, much as, mutatis mutandis, the theory of relativity unified time and space into a four-dimensional continuum. I fear such a hope is fundamentally wrongheaded, because nothing can be achieved on this terrain in a simple, clear, and at the same time precise (i.e., quantifiable) manner. This does not mean that I am pessimistic about the future of cybernetics; I just do not think that any single terminological or conceptual invention or revolution will lead to a breakthrough that makes cybernetics an epistemological cornucopia and pays back with interest the debts that it incurred with its initial bold promises.

Yet other failures, more technological than theoretical, were the dashed hopes for the construction of “an intelligence amplifier,”7 a translating machine, and a machine that would, finally, imitate (even if only on the level of language) a human being (Turing’s idea). I think we can attribute these disappointments to the difficulties, hidden at first, that the theory of automata—indeed, all of computer technology—had to face. Computer programmers encountered unexpected problems as the programs became more and more complicated. Neither the inadequacy of machine memory nor the uncertainty of general operational strategy (should computation be “parallel” or “serial”?) limited this field’s achievements as drastically as the issue of program construction, which must eventually become part of a physically interpreted theory of algorithms. But that goal is still far off, and enormous difficulties tower on the path to it. To put it in succinct, and therefore apodictic and simplifying, terms: the early optimism of the cyberneticists was based on the view, usually not expressed explicitly, that intelligence could be automatized by replacing mental processes, such as the ability to search, with mindless procedures—by inserting the appropriate algorithms into a program that performed a task. The search for the best chess-playing program is nothing but an attempt to construct, by the method of successive approximations, trials, and errors, a fully functional approximation of the chess algorithm, which pure mathematics has so far failed to supply. Programs capable of learning were also supposed to appear, except that their education would be implemented by yet another algorithm. And what was the solution finder in Ashby’s intelligence amplifier if not an algorithm for filtering? The strategy has always been to compress the mechanism of any kind of reasoning into its minimal form: a recipe, as universal as possible, embedded in a structure of network connections. The underlying assumption was that much of the structure of the brain was redundant and could be omitted. But is not the reason for toning down our optimism staring us in the face? For if any form of survival tactics in terrestrial environments were algorithmizable, evolution would have unfolded differently than it actually did. Bear in mind that evolution is quite an effective constructor, and the homeostatic products with which it populates terrestrial environments are very well adapted to their surroundings—within a given homeostatic plan. If any algorithmization of adaptive strategy were possible, the set of those survival algorithms would form an attractor, a sink, in the phase space of evolution’s speciation: wherever an algorithm is in place and functions optimally to solve a particular problem, nothing that would work “even better” can arise. The algorithm for adaptation would be embedded in the organism’s nervous system, and that would be the end of progress in the entire domain of life. Let us note that no species is “final,” each being a link in the creative chain of organizational solutions on the neural level that passes what is homeostatically “better” to the next link, and also that human beings have such large brains. These two facts support the thesis that it is impossible to reduce the survival heuristic to an algorithm, either as a mathematical formula or as a trial-and-error method of successive approximations.8
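
To make concrete what "replacing search with a mindless procedure" means, here is the skeleton of the game-tree search on which chess programs rest (a generic sketch, not any historical program): the entire "reasoning" is one mechanical recursion plus a scoring function supplied from outside.

```python
from typing import Callable

def minimax(state, depth: int, maximizing: bool,
            moves: Callable, apply_move: Callable, score: Callable) -> float:
    # Mechanically search the game tree to a fixed depth. moves(state) yields
    # the legal moves, apply_move(state, m) returns the successor state, and
    # score(state) is a heuristic evaluation supplied from outside. All of the
    # "intelligence" is exhausted by this thoughtless, repeatable recursion.
    options = list(moves(state))
    if depth == 0 or not options:
        return score(state)
    values = [minimax(apply_move(state, m), depth - 1, not maximizing,
                      moves, apply_move, score) for m in options]
    return max(values) if maximizing else min(values)

# Trivial usage, purely to show the mechanics: each move adds 1 or 2 to a
# counter, and the evaluation rewards reaching 10.
print(minimax(0, depth=6, maximizing=True,
              moves=lambda s: [] if s >= 10 else [1, 2],
              apply_move=lambda s, m: s + m,
              score=lambda s: 1.0 if s >= 10 else 0.0))
```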

The number of theoretical schemes proposed in the 1950s to represent the structure of the brain was unusually large, but nobody tried to construct a minimal functional model based on them. How strange, for this was precisely the dream of the cyberneticists! Infatuated with the first step they had made on the path toward the imitation of thinking, they believed that thoughtless repeatable procedures could replace thought. They did not express it explicitly, but that was the goal they tried to reach—in vain. Were they wrong? Yes and no. Von Neumann, in his comparison of a brain and a digital machine, focused on differences in the size and efficiency of the building blocks, because in the era of cathode tubes those differences were huge. Today, when we have monomolecular memory, transistors, neuristors, microminiaturized systems, and integrated circuits, the differences have disappeared—and yet we have come no closer to constructing “a brain substitute.” The differences in size and information-transmission efficiency of the building blocks turned out to be unimportant!

Evolution seems to be a lazy constructor. It squeezes out of its products all it can, stubbornly sticking to a model already developed and repeating it in every possible way. It allows a radical change only once in a hundred million years or less. My point: if a system considerably simpler than the human brain could handle the problems that the human being encounters in his ecological niche, then his neural network would be precisely that simpler system. The celebrated, promoted, and condemned “redundancy” of the brain is fiction. The brain is redundant only in that despite the irreplaceable loss of about a hundred thousand neurons every day, it works fine in old age, when people have only 60 to 70 percent of their initial neural power. It is redundant only in the sense that it can cope with the loss of its mass, not in the sense that it contains reserves that are never active but whose activation would significantly and immediately increase a person’s intelligence. The brain is constructed according to the same praxeological rule that we see at work throughout bioevolution. Its redundancy is an illusion that stems from our inability to comprehend how an organ that arose by natural selection to function in the time of primary anthropogenesis, in the Eolithic, and was adequate for the “cave dweller” level of problem-solving could be adequate also for the tasks of subsequent human history, from the construction of the pyramids and epicycles to the creation of the theory of relativity and computers—without a latent redundancy in the “cave” era. We are equally unable to comprehend how the mechanism that regulated the reproduction of microbes, amoebas, and trilobites could, basically in unchanged form and a billion years later, create dinosaurs, whales, Pliopithecus, and eventually human beings, since the genetic code arose just once, and its lexicography, syntax, and grammar are shared by everything that has ever lived on Earth. Thus the brain that was selected for “cave-type” tasks proved able to tackle tensor algebra and group theory. It has not changed at all; what has changed—culturally, not genetically—are the canons of its specific programming.

Even if the algorithmization of the epistemic heuristic were possible and intelligence could be automatized, systems with the complexity of the human brain have evidently not found the way to it. Such algorithms, if they existed, could be actualized only by systems more complex than the human brain, and this is precisely why evolution has not realized them. In its progress, evolution always chooses and solves the problems that are easier. So it seems that either such systems cannot be constructed at all, or they can be but the cost of their construction, in terms of time and the amount of creative combinatorics, is higher than the investment that the evolutionary process spent on us. There exists a third possibility: that the structural-functional system our brain inherited from the hominid species is grossly incompatible with the blueprint for an automaton of algorithmic gnosis; in that case evolution would again be unable to construct such an automaton, since it realizes changes not in big jumps but through the slow accumulation of small alterations. Note that this last possibility is not very probable, because in the terrestrial environment intelligence is a value that almost universally facilitates survival, so every species possessing it would be selectively privileged—yet only the hominid group formed intelligence, which means that this terrain cannot be reached by taking a shortcut or breaking through a barrier. Considering the number of animal species equipped with a nervous system that have lived on this planet over the last few hundred million years, it certainly looks as though a shortcut to intelligence, understood here as an automatizing simplification of its acquisition through repeatable algorithmic procedures, simply does not exist. Some engineers who constructed otherwise original and logically valuable models of logical networks believe that such a path “must” exist, but the failure of their search is an indication that it does not.

Obviously we are wise in hindsight; sixteen years ago the matter was less clear. In particular, it was not understood why the human brain at birth had so little in the way of preprogramming, why a person had to learn practically “everything” from scratch, even sensorimotor coordination. More hardwiring would have made adaptation to the world considerably more economical. At present we assume that the brain is negligibly preprogrammed precisely because too much genetic hardwiring would greatly diminish our chances to adapt and therefore to survive. The reason is that making the brain is just one facet of the problem called the “acquisition of intelligence”; the other, separate and huge, is its appropriate programming. It is thus for good strategic reasons that our brains are initially “underprogrammed,” and their enormous plasticity and potential have, in every known culture in history, given way to specific actual realizations—reflecting the universal fact that a hasty automatization of epistemic procedures always does more harm than good. Our world appears to be a place where the acquisition of a closed set of directives for universal epistemic effectiveness is either completely impossible or possible only after scaling a barrier higher than the one on the path leading to the brain of Homo sapiens. In this light it is easy to understand the failure of cybernetics to show that what evolution accomplished in a complicated and arduous manner can be achieved in a relatively easy and straightforward way. Clearly, if a universal algorithm for constructing programs of gnosis, reflection, and heuristics—understood as games played with Nature—could be created at little cost, it would prove the nonadaptive redundancy of the human brain. But the brain turned out to be not only a device much more complex than the experts imagined but also, and more importantly, a device whose “redundancy” with respect to its functions is very small or perhaps nonexistent.

This conclusion, along with the observation about the almost zero preprogramming of the human brain, makes me pessimistic about the possibility of our discovering simple and robust procedures that can be copied and multiplied, perhaps derived from algorithmic set theory, for the self-programming of systems of the digital-machine type (i.e., the technological realization of Turing’s universal automaton).9 The situation reminds me of Einstein’s comment that “raffiniert ist der Herrgott, aber boshaft ist Er nicht.”10 On the one hand, there is now no doubt that a device as unexpectedly elementary as Turing’s machine can execute any operation that an arbitrarily complex structure such as a brain or even a superbrain can perform, which suggests that “der Herrgott ist nicht boshaft.” On the other hand, to do “everything” this simple machine must have programs that cannot be reduced to a single common denominator—an algorithm—which shows the “subtlety of the Lord,” who giveth with one hand and taketh away with the other: the effectory apparatus, the substrate for thinking, is simple, but programming it is neither simple nor universal . . . Thus the failures of cybernetics, which promised to construct an intelligence amplifier, an imitation of a human being, or a translation automaton, truthfully reflect the situation: these failures are just its technological and engineering consequences.

It will not be out of place here to turn to the inventions that natural evolution can boast of, in contrast with our many spurious inventions achieved viribus unitis.11 Ashby’s information generator, Chomsky’s generative grammars, and my idea of “information breeding” all share a common feature: they generate diversity, broadly sketching the theoretical and technological aspects of the respective creative action, while omitting or dismissing, in a few general words and optimistic allusions, the related problem of the selector of this diversity. For what is the use of creating an abundance of articulations, concepts, theories, and structures when we do not know what takes the place of that part of the mind whose function is to sift through the alternative possibilities? What is the use of creating abundance when we have no idea how to find in it the tiny, microscopic fraction of structures that have value—meaningful sentences in the case of linguistics, rational thoughts in the case of the “intelligence amplifier,” sensible theories in the case of my “information breeding”? In each case the easy part of the task is solved and the difficult part is flippantly tossed to others to deal with. Natural evolution, by contrast, created not only a diversity generator—the “articulation field of genomes,” that is, the set of all genetic codes at the disposal of the population of all living individuals—but also a selector that preserves only what has proved useful: the process of natural selection, Markovian in character.12 This amazingly efficient two-part mechanism is a nightmare for human constructors, because the part that decides what has succeeded and what has failed—the selection filter—requires, in its evolutionary version, millions of years to manifest its creative potential fully. Unfortunately, this is a parameter we can never adopt from the evolutionary original along with all the riches of its inventory. We might accelerate the selection process a millionfold by outsourcing the job to “luminal” digital machines that work at the speed of light, but however promising this prospect may seem, we do not know whether modeling evolution with the necessary degree of complexity can be realized. The path to the goal might lead through building a kind of “evolutionary ladder”—a hierarchy of automata and procedures—such that simpler programs would aid in assembling more complex ones, until after many stages systems appear that can outperform bioevolution, and not only in speed. But this path is just a hypothesis referring to a very distant future, from which we are separated by a space of many unknown discoveries and revelations, which will bring successes, yes, but also many disappointments. And should ultimate success in this evolutionary competition with Nature turn out to be impossible, it will mean that Einstein erred, and “der Herrgott” is not only “raffiniert” but mighty malicious too.
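
The two-part mechanism is easy to caricature in a few lines (a toy sketch of mine, not a model of real evolution): a blind mutation generator proposes, a selection filter disposes, and the whole creative burden falls on the fitness criterion, which is precisely the part that human constructors do not know how to write without already knowing the answer.

```python
import random

ALPHABET = "ACGT"

def mutate(genome: str, rate: float = 0.01) -> str:
    # The blind diversity generator: independent point mutations.
    return "".join(random.choice(ALPHABET) if random.random() < rate else g
                   for g in genome)

def evolve(fitness, length=50, population=100, generations=200) -> str:
    # The selection filter: each generation depends only on the previous one,
    # which is the Markovian character mentioned in the text.
    pool = ["".join(random.choices(ALPHABET, k=length)) for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=fitness, reverse=True)
        survivors = pool[: population // 5]
        pool = [mutate(random.choice(survivors)) for _ in range(population)]
    return max(pool, key=fitness)

# Toy fitness: we must already know what counts as success -- the very
# knowledge that, per the essay, the selection filter is supposed to embody.
target = "ACGT" * 12 + "AC"
print(evolve(lambda g: sum(a == b for a, b in zip(g, target))))
```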

2

The divergence between the expectations and accomplishments of cybernetics raises the following question: If we are building computers but cannot create functional simulators of the brain because computers are much easier to make than brains, then why did evolution, which always takes the easier path, choose the more difficult one? The answer is that we are building universal digital machines but not equally universal programs, and we are fine with that, because we use computers to solve tasks that do not require their full operational autonomy. Evolution never faced this choice: its products, living systems, never surrendered their full operational autarky13 in exchange for narrow specialization—with three exceptions: when cooperation as a form of specialization led to the emergence of a loose aggregate of homeostatic units, that is, a colony of living organisms (corals, anthills); when cells lost their universality in the formation of multicellular organisms; and when evolution developed parasitism and symbiosis. Aside from these three cases, characterized by group survival strategies, organisms had to tackle the problem of constructing a central nervous system, which is the equivalent of a universal informatic machine, and, at the same time, the problem of creating programs for it; the two problems never arose separately. For this reason, evolution developed mixed tactics and strategies in its products: every organism, morphologically and functionally a sovereign unit engaged in the game of survival against Nature, must have full autarky, since it can naturally count on no help from outside, especially in the area of information. Computers without people, in contrast, are helpless. So evolution’s task was from the start qualitatively different from the set of technologically conditioned tasks that have enabled us to build digital machines.

If a digital machine—or its biological and at the same time isomorphic equivalent—could have arisen in evolution and successfully coped with typically homeostatic tasks, there is no doubt that this would have happened during the billions of years of development in the biosphere. Evolution’s stepwise character stems from the fact that the advantages and disadvantages of changes that increase an organism’s complexity never balance exactly: a system that grows and complicates its soma14 and brain acquires new powers but also, at the same time, new weaknesses. Statistically, the advantages slightly exceed the disadvantages—otherwise the transition from simplicity to complexity would quickly come to a halt. A bacterium is no more complex than a modern universal digital machine, but it is the bacterium, not the machine, that will survive when placed in an arbitrary environment, because the machine is not a sovereign homeostat. It is in this respect that the nature of the tasks in the evolutionary flow determined the direction of evolution’s construction efforts, which differ radically from ours in informatic technology. Computers’ lack of autonomy, their dependence on people, refutes the idea that they may take over in some kind of “revolution” and rule humanity in a kind of “computerocracy.” This idea is based on false historical analogies with the struggle for power, supremacy, or the circulation of elites, which do not apply to intellectronics.

Computers will not dominate us unless we allow them to do so. And there are two ways it could happen: either intentionally, through the construction of governing machines—in which case the issue is sociologically and ethically trivial, because it was humans who decided to enthrone computers at the pinnacle of government—or unintentionally, when the system “people + computers” gradually acquires an unexpected and undesired dynamic characteristic. When we discuss possible futures, our imagination, conditioned by our history, has been limited to visions of the “intelligence amplifier,” the “homunculus,” the “electronic sage,” or the “electronic demon,” the last considered especially suitable for the role of tyrannical ruler. Such models are naive and unrealistic. This does not mean that the cooperation of people and informatic machines is without danger. But the danger is completely devoid of the “personality” element: if an intellectronic system rules over us, it will not rule as a kind of simulated person with specific character traits. This, however, only makes the danger far greater than if computers acquired “personality,” for if they did, at least one of the two sides—the computers—would be acting with full awareness of what is happening. When someone with a domineering personality fights for power, he surely knows what he is doing; he acts with intention and according to a plan, ethical or not, that makes sense and has a purpose, which we might even be able to discern. But if a gradual accretion of informatic machines and memory banks creates a governmental, continental, and eventually planetary computer network—which is the direction of current development—the system composed of people and that network may acquire a dynamic trajectory that does not coincide with the interests of our civilization. More specifically, the system might begin to drift. A large and highly complex system has innumerably many rules; when we are creating it, we see the benefits but not the unintended consequences.

The present problem of technology, the first of its kind in history, is the splitting of the global instrumental potential in two. From the beginning of civilization until today, we have had only one type of technology: technology directed at the production of energy or things, or at the transportation of goods and people. Such technologies served us directly. Their growth has led to the loss of the biosphere’s self-regulating equilibrium, which is why the next wave of technologies will have a new and exclusive task: reestablishing and maintaining this fragile equilibrium. If the first-wave technologies served us directly, those of the second wave will serve us only indirectly—working not to help us but to save the terrestrial environment as a whole.

The counterpart of the physical labor that technology performs—mining raw materials and then transporting and transforming them, shooting cosmic projectiles out of the well of gravity, heating the arctic and cooling the equatorial dwellings of human beings, and so on—is informational labor; the two are alike in that both lower the entropy in one place at the cost of increasing it in another. This type of transaction is imposed upon us by the nature of the world in which we live. We stand at the threshold of informatic technology, which besides beneficial gifts can also give us Danaian ones.15 The socioeconomic symbiosis of people with machines may dissolve the boundary between mover and moved, between who rules and who is ruled. The very large and complex global network system that will arise must have a very complicated structure of rules. We will build it gradually, with an eye to specific benefits that in practice can be discerned clearly and early on. But the system may have dynamic features that are hidden from us—because of their innate inaccessibility, not because of anyone’s perfidy—and they may imperceptibly push it into a civilizational drift. I repeat: this would not be due to the “cryptocratic” behavior of any computer. I also set aside the danger, so often discussed, of the loss of privacy when all, even the most intimate, data about each person are stored in a machine memory; if this danger becomes real, ways can be found to remedy it. The machine-human symbiosis will be marked by contributions from each “side”—shared participation in decision-making, governing, and controlling. Yet the system as a whole may acquire a dynamic that is not entirely accessible to either of the two sides, because no system can describe or control itself completely: this principle is inviolable. A system, through observation and generalization, can uncover particular laws of its own operation but never all of them. That can be accomplished only by a higher-level system that has control over the first. But initiating such control would be, in informatics, the equivalent of the division now occurring in the technologies of labor—the creation of an informatics “of the second order” that would not serve us directly but would supervise the symbiosis of people and machines so that its equilibrium did not enter any undesired drift. This act, however, leads to a fatal regressus ad infinitum: it would in turn require a supervisor of the next order to control the controller, in answer to the inevitable question, Quis custodiet ipsos custodes?16

Solving this problem obviously belongs to the far future, yet it is worth mentioning, because it indicates that the divergence between initial human expectation and final realization is indeed a constant in our history. The image of an infinite series of “informatic mirrors” as a control pyramid suspended above the civilization of the future is surely strange, as is the process of “reflecting” or representing the entire global activity for the purpose of achieving the optimum control, but it is also an ironic evocation of our ancient beliefs, such as the myths of higher powers who know everything about every human’s life and to whom everything must bend a knee. If the infiniteness of the supervisory pyramid caricatures the role of God, the archangels, thrones, and the whole hierarchical rest of the celestial informatics, it does so unintentionally: such an infinite regress is impossible to realize, and therefore an informatic machine that would simulate God’s controlling omnipresence will never come to pass.


The above prompts the following reflections:

(1) Research communities facing significant innovation tend to divide into two opposite camps, both armed with the slogan “Everything or Nothing.” The adherents of cybernetics expected everything from it; its critics considered it epistemologically almost useless. The middle ground was much less crowded; I argued in its favor in my Summa Technologiae (1964), in the section “Doubts and Antinomies” of the chapter “Intellectronics.” The attitudes of critics like M. Taube were outright liquidatory.17 On the other hand, incurable optimists like to extend the deadlines by which “everything” will be accomplished—which is equally damaging for cybernetics—and some even falsify data. A book published in 1968 by Gallimard in the series “Idées,” Les ordinateurs—mythes et réalités by J. M. Font and J. C. Quiniou, completely ignores the difficulties that various parts of the cybernetic program encountered, and even claims that in the USSR a novel by Dickens was machine-translated from English into Russian so well that the translation equaled in quality a literary translation done by a human being (which is simply not true—Soviet sources are silent about it). Some professional logicians expressed faith in the unlimited power of cybernetics after Wang programmed a digital machine to prove most of the theorems of Russell and Whitehead’s fundamental Principia Mathematica in 8.5 minutes,18 a task that had taken human experts many years. Yet we still do not have a program that would enable an informal, inconsequential, friendly conversation with a machine. This incongruity makes sense once we realize that although a human hand cannot lift a weight that is insignificant for a crane, it can perform thousands of operations beyond the ability of any crane. Our brain is ill suited for narrowly deductive procedures, since it was constructed differently—to be versatile and universal. Proficiency in a language draws on many domains at once, which supply the selectors that shape an articulation and whose number, in rare cases such as a “difficult” literary text, can be innumerable. In principle, a machine could translate a terse scientific article with full proficiency even today, if the author knew how the computer translation program works and were willing to write the article with that in mind. But no one would expect a scientist to go to such additional trouble—it might be easier for him to learn a foreign language to the necessary level than to write in conformity with the machine’s translatory skills.

(2) Those opposed to cybernetics claim that a machine could equal a human being only if it were a living organism made in imitation—in other words, a human being “created in a test tube” (e.g., the Dreyfus brothers).19 Those defending cybernetics argue that the construction of an intelligent machine has failed only for reasons external to cybernetics: enormous cost, lack of market demand, technological difficulties, and so on. Those better informed know that these arguments do not conform with reality; the difficulties are ones of principle and follow from theory. As for market demand, there obviously exist powerful groups interested in equipping the military with “intelligence amplifiers”—so the barrier to cybernetics is not economic. The persistence of enthusiasts has resulted in chess programs that can defeat any human player, but this success is merely a consequence of improvements in the technical parameters of information processing, not of breaking through the barrier of heuristics or jumping to a higher level of computers’ intellectual proficiency.

Both the apologists for and the antagonists of cybernetics distort reality. The optimists leaped at the possibility of bypassing all the steps that natural evolution took to construct us. They relied on the idea that the evolutionary process is equifinal with an algorithmic or heuristic procedure amenable to quick mechanization. But as the book Artificial Intelligence through Simulated Evolution (1966) by L. J. Fogel, A. J. Owens, and M. J. Walsh showed—in an amazing confluence of arguments with my Summa Technologiae (1964, the first edition)—many experts understand the necessity of going back and assuming a different, broader goal: that of modeling bioevolution as the “preceptor” of causative action also in the domain of intelligence. According to those authors, the modeling of evolutionary processes is indeed the first prerequisite for the automation of intellect.

(3) Finally, let me say it clearly: in the entire area of our discussion, the dichotomy between believing that a machine can equal a human being and believing the opposite—that humankind holds eternal supremacy over all its creations in the intellectual domain—is false. The optimization of machine parameters through technological progress will not automatically take computer engineers across the threshold beyond which intelligent machines can be built. Nor is it true that only a synthetically replicated human being can equal a natural member of Homo sapiens. The task has turned out to be several orders of magnitude more difficult than it seemed twenty years ago—but no one has proved it unsolvable. Machines today easily cope with tasks that are difficult or impossible for humans, yet humans solve tasks that are impossible for machines. The point is that so far the evolutionary paths of artificial and natural intelligence have been evidently divergent. The pragmatists’ argument that for this reason we should limit ourselves to exploiting machines where they are effective, and only there, is practical and reasonable as a directive aimed at the present, but dangerous if it implies that we must give up work aimed at the future, whose crowning achievement will be the automation of human creativity. No natural law prohibits it; only ignorance (a lack of knowledge) stands in the way. The task is of course beyond the scale and power of one generation; hence the psychologically understandable rush, followed by disillusion when the overoptimistic aspirations fail. But the task continues to challenge us, and that is why, sooner or later, it will be accomplished.