Preface to the Second Edition

When I wrote the first edition of Cybernetics some thirteen years ago, I did it under some serious handicaps which had the effect of piling up unfortunate typographical errors, together with a few errors of content. Now I believe the time has come to reconsider cybernetics, not merely as a program to be carried out at some period in the future, but as an existing science. I have therefore taken this opportunity to put the necessary corrections at the disposal of my readers and, at the same time, to give an account of the present status of the subject and of the new related modes of thought which have come into being since its first publication.

If a new scientific subject has real vitality, the center of interest in it must and should shift in the course of years. When I first wrote Cybernetics, the chief obstacles which I found in making my point were that the notions of statistical information and control theory were novel and perhaps even shocking to the established attitudes of the time. At present, they have become so familiar as a tool of the communication engineers and of the designers of automatic controls that the chief danger against which I must guard is that the book may seem trite and commonplace. The role of feedback both in engineering design and in biology has come to be well established. The role of information and the technique of measuring and transmitting information constitute a whole discipline for the engineer, for the physiologist, for the psychologist, and for the sociologist. The automata which the first edition of this book barely forecast have come into their own, and the related social dangers against which I warned, not only in this book, but also in its small popular companion The Human Use of Human Beings,1 have risen well above the horizon.

Thus it behooves the cyberneticist to move on to new fields and to transfer a large part of his attention to ideas which have arisen in the developments of the last decade. The simple linear feedbacks, the study of which was so important in awakening scientists to the role of cybernetic study, now are seen to be far less simple and far less linear than they appeared at first view. Indeed, in the early days of electric circuit theory, the mathematical resources for systematic treatment of circuit networks did not go beyond linear juxtapositions of resistances, capacities, and inductances. This meant that the entire subject could be adequately described in terms of the harmonic analysis of the messages transmitted, and of the impedances, admittances, and voltage ratios of the circuits through which the messages were passed.

Long before the publication of Cybernetics, it came to be realized that the study of non-linear circuits (such as we find in many amplifiers, in voltage limiters, in rectifiers, and the like) did not fit easily into this frame. Nevertheless, for want of a better methodology, many attempts were made to extend the linear notions of the older electrical engineering well beyond the point where the newer types of apparatus could be naturally expressed in terms of these.

When I came to M.I.T. around 1920, the general mode of putting the questions concerning non-linear apparatus was to look for a direct extension of the notion of impedance which would cover linear as well as non-linear systems. The result was that the study of non-linear electrical engineering was getting into a state comparable with that of the last stages of the Ptolemaic system of astronomy, in which epicycle was piled on epicycle, correction upon correction, until a vast patchwork structure ultimately broke down under its own weight.

Just as the Copernican system arose out of the wreck of the overstrained Ptolemaic system, with a simple and natural heliocentric description of the motions of the heavenly bodies instead of the complicated and unperspicuous Ptolemaic geocentric system, so the study of non-linear structures and systems, whether electric or mechanical, whether natural or artificial, has needed a fresh and independent point of commencement. I have tried to initiate a new approach in my book Nonlinear Problems in Random Theory.2 It turns out that the overwhelming importance of a trigonometric analysis in the treatment of linear phenomena does not persist when we come to consider non-linear phenomena. There is a clear-cut mathematical reason for this. Electrical circuit phenomena, like many other physical phenomena, are characterized by an invariance with respect to a shift of origin in time. A physical experiment which will have arrived at a certain stage by 2 o’clock if we started at noon will have arrived at the same stage at 2:15 if we started at 12:15. Thus the laws of physics concern invariants of the translation group in time.

The trigonometric functions sin nt and cos nt show certain important invariants with respect to the same translation group. The general function

    a sin nt + b cos nt

will go into the function

    a sin n(t + τ) + b cos n(t + τ) = (a cos nτ - b sin nτ) sin nt + (a sin nτ + b cos nτ) cos nt

of the same form under the translation which we obtain by adding τ to t. As a consequence,

    a sin n(t + τ) + b cos n(t + τ) = a₁ sin nt + b₁ cos nt,

where a₁ and b₁ depend only on a, b, and τ. In other words, the families of functions

    A sin (nt + φ)

and

    A cos (nt + φ)

are invariant under translation.

Now there are other families of functions which are invariant under translations. If I consider the so-called random walk in which the movement of a particle over any time interval has a distribution dependent only on the length of that time interval and independent of everything that has happened up to its initiation, the ensemble of random walks will also go into itself under the time translation.

In other words, the mere translational invariance of the trigonometric curves is a property shared by other sets of functions as well.

The property which is characteristic of the trigonometric functions in addition to these invariants is that

    A sin (nt + φ) = (A cos φ) sin nt + (A sin φ) cos nt

so that these functions form an extremely simple linear set. It will be noted that this property concerns linearity; that is, that we can reduce all oscillations of a given frequency to a linear combination of two. It is this specific property which creates the value of harmonic analysis in the treatment of the linear properties of electric circuits. The functions

    e^{int} = cos nt + i sin nt

are characters of the translation group and yield a linear representation of this group.

When, however, we deal with combinations of functions other than addition with constant coefficients—when for example we multiply two functions by one another—the simple trigonometric functions no longer show this elementary group property. On the other hand, the random functions such as appear in the random walk do have certain properties very suitable for the discussion of their non-linear combinations.

It is scarcely desirable for me to go into the detail of this work here, for it is mathematically rather complicated, and it is covered in my book Nonlinear Problems in Random Theory. The material in that book has already been put to considerable use in the discussion of specific non-linear problems, but much remains to be done in carrying out the program laid down there. What it amounts to in practice is that an appropriate test input for the study of non-linear systems is rather of the character of the Brownian motion than of a set of trigonometric functions. This Brownian motion function in the case of electric circuits can be generated physically by the shot effect. This shot effect is a phenomenon of irregularity in electrical currents which arises from the fact that such currents are carried not as a continuous stream of electricity but as a sequence of indivisible and equal electrons. Thus electric currents are subject to statistical irregularities which are themselves of a certain uniform character and which can be amplified up to the point at which they constitute an appreciable random noise.
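By way of a small numerical illustration (an editorial sketch in Python, not part of the original argument): if the number of electrons arriving in each short time interval is taken to be Poisson-distributed, the normalized fluctuations of the count behave as white noise, and their running sum approximates a Brownian motion. The constants are arbitrary.

    import numpy as np

    # Illustrative only: the shot effect as the statistical irregularity of a
    # current carried by discrete, equal electrons.
    rng = np.random.default_rng(3)

    ELECTRONS_PER_BIN = 1_000        # mean count per time bin (assumed)
    N_BINS = 100_000

    counts = rng.poisson(ELECTRONS_PER_BIN, size=N_BINS)
    noise = (counts - ELECTRONS_PER_BIN) / np.sqrt(ELECTRONS_PER_BIN)

    # The fluctuations are very nearly white Gaussian noise of unit variance;
    # their running sum is a discrete approximation to a Brownian motion.
    print(round(float(noise.mean()), 3), round(float(noise.var()), 3))
    brownian = np.cumsum(noise)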

As I shall show in Chapter IX, this theory of random noise can be put into practical use not merely for the analysis of electrical circuits and other non-linear processes but for their synthesis as well.3 The device which is used is the reduction of the output of a non-linear instrument with random input to a well-defined series of certain orthonormal functions which are closely related to the Hermite polynomials. The problem of the analysis of a non-linear circuit consists in the determination of the coefficients of these polynomials in certain parameters of the input by a process of averaging.
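For the reader who wants the orthogonality behind such an expansion in its simplest, memoryless form: with respect to the Gaussian weight, the probabilists' Hermite polynomials satisfy

    \[
    \int_{-\infty}^{\infty} \mathrm{He}_m(x)\,\mathrm{He}_n(x)\,
    \frac{e^{-x^{2}/2}}{\sqrt{2\pi}}\,dx = n!\,\delta_{mn},
    \qquad
    \mathrm{He}_0(x)=1,\quad \mathrm{He}_1(x)=x,\quad \mathrm{He}_2(x)=x^{2}-1,\ \dots
    \]

This is only the memoryless prototype and one common normalization; the orthonormal functions of Chapter IX are functionals of the whole random input history.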

The description of this process is rather simple. In addition to the black box which represents an as yet unanalyzed non-linear system, I have certain bodies of known structure which I shall call white boxes representing the various terms in the expansion desired.4 I put the same random noise into the black box and into a given white box. The coefficient of the white box in the development of the black box is given as an average of the product of their outputs. While this average is taken over the entire ensemble of shot-effect inputs, there is a theorem which allows us to replace this average in all but a set of cases of probability zero by an average taken over time. To obtain this average, we need to have at our disposal a multiplying instrument by which we can get the product of the outputs of the black and the white box, as well as an averaging instrument, which we can base on the fact that the potential across a condenser is proportional to the quantity of electricity held in the condenser and hence to the time integral of the current flowing through it.
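The following Python fragment is a minimal discrete-time sketch of this averaging procedure, confined to a memoryless black box; white Gaussian noise stands in for the amplified shot effect, normalized Hermite polynomials play the role of the white boxes, and the choice of a soft limiter as the unknown system is purely illustrative.

    import math
    import numpy as np

    rng = np.random.default_rng(0)

    def black_box(x):
        # The unknown non-linear system to be analyzed (here a soft limiter).
        return np.tanh(2.0 * x)

    def hermite_e(n, x):
        # Probabilists' Hermite polynomial He_n(x), from the recurrence
        # He_{k+1}(x) = x*He_k(x) - k*He_{k-1}(x).
        h_prev, h = np.ones_like(x), x
        if n == 0:
            return h_prev
        for k in range(1, n):
            h_prev, h = h, x * h - k * h_prev
        return h

    def white_box(n, x):
        # A "white box" of known structure: He_n scaled so that the family
        # is orthonormal when the input is standard Gaussian noise.
        return hermite_e(n, x) / math.sqrt(math.factorial(n))

    # The same random test input goes into the black box and every white box.
    x = rng.standard_normal(200_000)
    y = black_box(x)

    # Each coefficient is the time average of the product of the two outputs.
    coeffs = [float(np.mean(y * white_box(n, x))) for n in range(8)]

    # The weighted sum of white boxes is an operational stand-in for the black box.
    y_model = sum(c * white_box(n, x) for n, c in enumerate(coeffs))
    print("coefficients:", np.round(coeffs, 3))
    print("mean-square error of the imitation:", float(np.mean((y - y_model) ** 2)))

Because the white boxes are orthonormal under this input, the time average of each product converges, with probability one, to the corresponding coefficient of the expansion.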

Not only is it possible to determine the coefficients of each white box constituting an additive part of the equivalent representation of the black box one by one, but it is also possible to determine these quantities simultaneously. It is even possible by the use of appropriate feedback devices to make each one of the white boxes automatically adjust itself to the level corresponding to its coefficient in the development of the black box. In this manner we are able to construct a multiple white box which, when it is properly connected to a black box and is subjected to the same random input, will automatically form itself into an operational equivalent of the black box even though its internal structure may be vastly different.
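A crude discrete-time counterpart of this simultaneous self-adjustment, again with illustrative choices for the unknown system and the white boxes (the analogue feedback circuitry itself is not reproduced): each weight is continuously fed back toward the running average of the product of the black-box output with its own white-box output, and all of the weights settle at their coefficients together.

    import numpy as np

    rng = np.random.default_rng(1)

    def black_box(x):
        # The unknown system (an assumed cubic non-linearity).
        return x - 0.3 * x ** 3

    # Three white boxes, orthonormal under a standard Gaussian input.
    white_boxes = [
        lambda x: 1.0,                        # He_0
        lambda x: x,                          # He_1
        lambda x: (x ** 2 - 1) / np.sqrt(2),  # He_2 / sqrt(2!)
    ]

    w = np.zeros(len(white_boxes))            # the self-adjusting levels
    for t in range(1, 100_001):
        xt = float(rng.standard_normal())
        yt = black_box(xt)
        phi = np.array([wb(xt) for wb in white_boxes])
        # Feedback: nudge each weight toward the running average of the
        # product of the two outputs; the nudges shrink as experience grows.
        w += (yt * phi - w) / t
    print(np.round(w, 3))   # roughly [0, 0.1, 0] for this particular system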

These operations of analysis, synthesis, and automatic self-adjustment of white boxes into the likeness of black boxes can be carried out by other methods which have been described by Professor Amar Bose5 and by Professor Gabor.6 In all of them there is a use of some process of working in, or learning, by choosing appropriate inputs for the black and white boxes and comparing them; and in many of these processes, including the method of Professor Gabor, multiplication devices play an important role. While there are many approaches to the problem of multiplying two functions electrically, this task is not technically easy. On the one hand, a good multiplier must work over a large range of amplitudes. On the other hand, it must be so nearly instantaneous in its operation that it will be accurate up to high frequencies. Gabor claims for his multiplier a frequency range running to about 1,000 cycles. In his inaugural dissertation for the chair of Professor of Electrical Engineering at the Imperial College of Science and Technology of the University of London, he does not state explicitly the amplitude range over which his method of multiplication is valid nor the degree of accuracy to be obtained. I am awaiting very eagerly an explicit statement of these properties so that we can give a good evaluation of the multiplier for use in other pieces of apparatus dependent on it.

All of these devices in which an apparatus assumes a specific structure or function on the basis of past experience lead to a very interesting new attitude both in engineering and in biology. In engineering, devices of similar character can be used not only to play games and perform other purposive acts but to do so with a continual improvement of performance on the basis of past experience. I shall discuss some of these possibilities in Chapter IX of this book. Biologically, we have at least an analogue to what is perhaps the central phenomenon of life. For heredity to be possible and for cells to multiply, it is necessary that the heredity-carrying components of a cell—the so-called genes—be able to construct other similar heredity-carrying structures in their own image. It is, therefore, very exciting for us to be in possession of a means by which engineering structures can produce other structures with a function similar to their own. I shall devote Chapter X to this, and in particular shall discuss how oscillating systems of a given frequency can reduce other oscillating systems to the same frequency.

It is often stated that the production of any specific kind of molecule in the image of existing ones has an analogy to the use of templates in engineering whereby we can use a functional element of a machine as the pattern on which another similar element is made. The image of the template is a static one, and there must be some process by which one gene molecule manufactures another. I give the tentative suggestion that frequencies, let us say the frequencies of molecular spectra, may be the pattern elements which carry the identity of biological substances; and the self-organization of genes may be a manifestation of the self-organization of frequencies which I shall discuss later.

I have already spoken of learning machines in a general way. I shall devote a chapter to a more detailed discussion of these machines, their potentialities, and some of the problems of their use. Here I wish to make a few comments of a general nature.

As will be seen in Chapter I, the notion of learning machines is as old as cybernetics itself. In the anti-aircraft predictors which I described, the linear characteristics of the predictor which is used at any given time depend on a long-time acquaintance with the statistics of the ensemble of time series which we desire to predict. While a knowledge of these characteristics can be worked out mathematically in accordance with the principles which I have given there, it is perfectly possible to devise a computer which will work up these statistics and develop the short-time characteristics of the predictor on the basis of an experience which is already observed by the same machine as is used for prediction and which is worked up automatically. This can go far beyond the purely linear predictor. In various papers by Kallianpur, Masani, Akutowicz, and myself,7 we have developed a theory of non-linear prediction which can at least conceivably be mechanized in a similar manner with the use of long-time observations to give the statistical basis for short-time prediction.
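To indicate how observed statistics fix the predictor in the simplest discrete-time linear case (a standard formulation, not the operator calculus of the papers just cited): the coefficients that minimize the mean-square prediction error are determined by the correlation function of the series alone, through the normal equations

    \[
    \hat{x}_{t+1} = \sum_{j=0}^{m-1} a_j\, x_{t-j}, \qquad
    \sum_{j=0}^{m-1} a_j\, R(j-k) = R(k+1) \quad (k = 0, \dots, m-1),
    \qquad R(\tau) = \mathbb{E}\,[\,x_t\, x_{t+\tau}\,].
    \]

A machine that accumulates estimates of R from long observation of the series can then solve these equations for its own short-time predictor.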

The theory of linear prediction and of non-linear prediction both involve some criteria of the goodness of fit of the prediction. The simplest criterion, although by no means the only usable one, is that of minimizing the mean square of the error. This is used in a particular form in connection with the functionals of the Brownian motion which I employ for the construction of non-linear apparatus, inasmuch as the various terms of my development have certain orthogonality properties. These ensure that the partial sum of a finite number of these terms is the best simulation of the apparatus to be imitated which can be made by the employment of these terms if the mean square criterion of error is to be maintained. The work of Gabor also depends upon a mean square criterion of error, but in a more general way, applicable to time series obtained by experience.
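The role of orthogonality here is the standard projection fact, stated in a generic notation rather than that of Chapter IX: if the functions Φ_n are orthonormal over the input ensemble, then for any coefficients c_n

    \[
    \mathbb{E}\Bigl[\bigl(F - \sum_{n=0}^{N} c_n \Phi_n\bigr)^{2}\Bigr]
    = \mathbb{E}[F^{2}] - \sum_{n=0}^{N} \bigl(\mathbb{E}[F\,\Phi_n]\bigr)^{2}
    + \sum_{n=0}^{N} \bigl(c_n - \mathbb{E}[F\,\Phi_n]\bigr)^{2},
    \qquad \mathbb{E}[\Phi_m \Phi_n] = \delta_{mn},
    \]

so the mean-square error is least when each c_n is the averaged product of F with Φ_n, and no other combination of these terms does better.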

The notion of learning machines can be extended far beyond its employment for predictors, filters, and other similar apparatus. It is particularly important for the study and construction of machines which play a competitive game like checkers. Here the vital work has been done by Samuel8 and Watanabe9 at the laboratories of the International Business Machines Corporation. As in the case of filters and predictors, certain functions of the time series are developed in terms of which a much larger class of functions can be expanded. These functions can have numerical evaluations of the significant quantities on which the successful playing of a game depends. For example, they comprise the number of pieces on both sides, the total command of these pieces, their mobility, and so forth. At the beginning of the employment of the machine, these various considerations are given tentative weightings, and the machine chooses that admissible move for which the total weighting will have a maximum value. Up to this point, the machine has worked with a rigid program and has not been a learning machine.

However, at times the machine assumes a different task. It tries to expand that function which is 1 for won games, 0 for lost games, and perhaps ½ for drawn games in terms of the various functions expressing the considerations of which the machine is able to take cognizance. In this way, it redetermines the weightings of these considerations so as to be able to play a more sophisticated game. I shall discuss some of the properties of these machines in Chapter IX, but here I must point out that they have been sufficiently successful for the machine to be able to defeat its programmer in from 10 to 20 hours of learning and working in. I also wish to mention in that chapter some of the work that has been done on similar machines devised for proving geometrical theorems and for simulating, to a limited extent, the logic of induction.
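As a toy Python sketch of the two phases just described (the feature names, the synthetic data, and the least-squares re-weighting are illustrative assumptions, not Samuel's actual procedure):

    import numpy as np

    rng = np.random.default_rng(2)

    # Illustrative board features; their extraction from a real position
    # is not modeled here.
    FEATURES = ["piece advantage", "command", "mobility"]

    def choose_move(candidate_features, weights):
        # Rigid phase: pick the admissible move with the largest weighted score.
        return int(np.argmax(candidate_features @ weights))

    def redetermine_weights(position_features, outcomes):
        # Learning phase: expand the function that is 1 for won games, 0 for
        # lost games, and 1/2 for drawn games in terms of the features, here
        # by least squares (an assumed stand-in for the actual procedure).
        w, *_ = np.linalg.lstsq(position_features, outcomes, rcond=None)
        return w

    weights = np.array([1.0, 0.5, 0.25])      # tentative initial weightings

    candidates = rng.normal(size=(7, len(FEATURES)))   # 7 admissible moves
    print("chosen move:", choose_move(candidates, weights))

    # Synthetic record of 500 positions and the results of their games.
    X = rng.normal(size=(500, len(FEATURES)))
    hidden = np.array([0.8, 0.3, -0.1])       # invented "true" importances
    y = np.clip(0.5 + X @ hidden + 0.1 * rng.normal(size=500), 0.0, 1.0)
    y = np.round(2 * y) / 2                   # crush results to {0, 1/2, 1}

    weights = redetermine_weights(X, y)
    print("re-determined weightings:", np.round(weights, 2))

After the re-weighting, the same choose_move routine plays the more sophisticated game with the new weights.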

All of this work is a part of the theory and practice of the programming of programming, which has been extensively studied in the Electronic Systems Laboratory of the Massachusetts Institute of Technology. Here it has been found out that unless some such learning device is employed, the programming of a rigidly patterned machine is itself a very difficult task and that there is an urgent need for devices to program this programming.

Now that the concept of learning machines is applicable to those machines which we have made ourselves, it is also relevant to those living machines which we call animals, so that we have the possibility of throwing a new light on biological cybernetics. Here I wish to single out, among a variety of current investigations, a book by the Stanley-Joneses on the Kybernetics (notice the spelling) of living systems.10 In this book they devote a great deal of attention to those feedbacks which maintain the working level of the nervous system as well as those other feedbacks which respond to special stimuli. Since the combination of the level of the system with the particular responses is to a considerable extent multiplicative, it is also non-linear and involves considerations of the sort we have already brought out. This field of activity is very much alive at present, and I expect it to become much more alive in the near future.

The methods of memory machines and of machines that multiply themselves which I have so far given are largely, although not entirely, those which depend on apparatus of a high degree of specificity, or of what I may call blueprint apparatus. The physiological aspects of the same process must conform more to the peculiar techniques of living organisms in which blueprints are replaced by a less specific process, but one in which the system organizes itself. Chapter X of this book is devoted to a sample of a self-organizing process, namely, that by which narrow, highly specific frequencies are formed in brain waves. It is, therefore, largely the physiological counterpart of the previous chapter, in which I am discussing similar processes on more of a blueprint basis. This existence of sharp frequencies in brain waves and the theories which I gave to explain how they are originated, what they can do, and what medical use may be made of them represent in my mind an important and new break-through in physiology. Similar ideas can be used in many other places in physiology and can make a real contribution to the study of the fundamentals of life phenomena. In this field, what I am giving is more a program than work already achieved, but it is a program for which I have great hopes.

It has not been my intention, either in the first edition or in the present one, to make this book a compendium of all that has been done in cybernetics. Neither my interests nor my abilities lie that way. My intention is to express and to amplify my ideas on this subject, and to display some of the ideas and philosophical reflections which led me in the beginning to enter upon this field, and which have continued to interest me in its development. Thus it is an intensely personal book, devoting much space to those developments in which I myself have been interested, and relatively little to those in which I have not worked myself.

I have had valuable help from many quarters in revising this book. I must acknowledge in particular the cooperation of Miss Constance D. Boyd of The M.I.T. Press, Dr. Shikao Ikehara of the Tokyo Institute of Technology, Dr. Y. W. Lee of the Electrical Engineering Department of M.I.T., and Dr. Gordon Raisbeck of the Bell Telephone Laboratories. Also, in the writing down of my new chapters, and particularly in the computations of Chapter X, in which I have considered the case of self-organizing systems which manifest themselves in the study of the electroencephalogram, I wish to mention the aid which I received from my students, John C. Kotelly and Charles E. Robinson, and especially the contribution of Dr. John S. Barlow of the Massachusetts General Hospital. The indexing was done by James W. Davis.

Without the meticulous care and devotion of all of these I would not have had either the courage or the accuracy to turn out a new and corrected edition.

NORBERT WIENER

Cambridge, Massachusetts, March, 1961