Brains do not secrete thought as the liver secretes bile … they compute thought the way electronic computers calculate numbers.—Warren McCulloch, “Recollections of the Many Sources of Cybernetics” (1974)
Biology is superficial, intelligence is artificial.—Grimes, “We Appreciate Power” (2020)
One Neural Network or Many?
In their influential piece “A Logical Calculus of the Ideas Immanent in Nervous Activity” (1943), Warren McCulloch and Walter Pitts—a neurophysiologist and a logician who were soon to become key figures in the nascent field of cybernetics—introduced a new model to represent knowledge acquisition as it takes place in both biological neurons and computer circuits. Through this model, which they would name “neural nets” a few publications later, they argued that “all that could be achieved in [psychology]” (p. 131) could be reproduced through systems of logical devices connected together according to “the ‘all-or-none’ law of nervous activity” (p. 117). While most of the work in psychology at the time had taken behavior as its main focus, McCulloch and Pitts instead posited that the mind could become the object of a new experimental science once theorized as the product of networks of interconnected neural units computing the functions embedded in their architecture.
Decades later, neural networks are no longer recognized as a model of the mind, but rather as an effective machine learning framework to extract patterns from large amounts of data. While early attempts to physically implement neural networks in analog computers led to limited results (e.g., Rosenblatt, 1957), many computer scientists in the late 1980s perceived in this model the possibility of overcoming the structural limits of serial computing machines via its parallel and so-called subsymbolic architecture (e.g., Fahlman & Hinton, 1987; McClelland, Rumelhart, & Hinton, 1986). Implemented in software environments, neural networks now provide computer scientists with a powerful operational framework to transpose a wide range of processes onto a computable substrate, enabling applications as diverse as object recognition (e.g., Krizhevsky, Sutskever, & Hinton, 2012), spatial orientation (e.g., Bojarski et al., 2016), and natural language processing (e.g., Hinton et al., 2012).
Between their initial conceptualization as a model of the mind and their subsequent reemergence in the field of machine learning, neural networks were mostly pushed to the fringe of the fields in which they were originally embraced. Whereas the first generation of cognitivists replaced neural mechanisms with information processes as their privileged objects of study for their work on the mind (e.g., Newell, Shaw, & Simon, 1958), many computer scientists who were first introduced to artificial intelligence (AI) by training neural networks disavowed this model for what they saw as its insurmountable linearity (e.g., Minsky & Papert, 1969). Among technical historians, scholars of scientific controversies, and computer scientists alike, the prevalent narrative bridging these two moments generally goes as follows: given the limited computational resources available at the time, neural networks could not achieve the results they were theoretically capable of (Nagy, 1991; Nilsson, 2009), leading competing AI models to become more attractive to corporate and military funders (Guice, 1998; Olazaran, 1996)—a reality which held true until breakthroughs in computing power allowed neural networks to fulfill most of the tasks their early thinkers had anticipated (Goodfellow, Bengio, & Courville, 2016).
This emphasis on neural networks’ initial conceptualization and later reemergence as machine learning models might indeed reframe these systems’ past and current shortcomings as temporary obstacles; yet, from a historical and epistemological perspective, this narrative seems to obscure the fundamental differences between McCulloch and Pitts’ original model and neural networks’ current manifestations. Upon closer inspection, these two models appear to have little in common besides their isomorphic structure—i.e., probabilistically connected, specialized units—and shared name: one was a theoretical model of the mind, the other is a functional framework for pattern recognition; one inaugurated the simultaneous development of cognitive science and artificial intelligence, the other now instantiates the irreducibility of one field to the other; one modelled knowledge acquisition as it takes place in both the brain and computer circuits, the other is tightly linked to the development of computing architectures optimized for parallel processing.
What traverses these distinctions, however, is an ambiguity regarding neural networks’ status as a theoretical or functional model—an ambiguity which, given McCulloch and Pitts’ experimental ambitions, might be as old as the concept itself. While McCulloch and Pitts developed a framework to model learning systems, current implementations of neural networks now provide a functional framework to identify and extract patterns for which no formal definition is available. In both cases, neural networks are conceived of as models of knowledge acquisition based on the operationalization of what lies beyond the limits of knowledge. From that perspective, the core property which unites the different systems encompassed by the concept of neural networks appears to be not so much their structural similarities, but rather their shared conceptualization of the unknown as something that can be contained and operationalized by networks of interconnected units.
Building upon these themes, this chapter is broadly concerned with what will be described as the adversarial epistemology underpinning the initial development, and subsequent reemergence and reformulation, of neural networks. By modelling how knowledge acquisition takes place across substrates (e.g., biological neurons, computer circuits, etc.), neural networks can be seen as both media artifacts and mediations of larger sets of discourses related to how the limits of knowledge are represented and understood in the fields where these systems are studied. While cybernetics envisioned a Manichean world in which science and the military strive toward the same ideals of command and control, today’s research in machine learning attempts to develop models with more comprehensive representations by studying neural networks’ vulnerabilities to a type of input called “adversarial examples.” By investigating these two historical moments alongside one another, this text will attempt to highlight the persistence of an adversarial epistemology whose emergence coincides with neural networks’ initial conceptualization and whose legacy continues to inform this model’s current forms. That way, it will argue that neural networks’ claim for knowledge is historically contingent on a larger techno-epistemic imaginary which naturalizes an understanding of knowledge as the product of sustained efforts to resist, counter, and overcome an assumed adversary.
The first two sections of this text will examine the co-constitutive relationship between neural networks’ experimental epistemology and cybernetics’ adversarial framework. To do so, they will situate the initial conceptualization of neural networks—which, for the sake of clarity, will now be referred to as the McCulloch-Pitts model when discussed in their original historical context—within the development of cybernetics as a unified field. In 1954, Norbert Wiener offered a teleological and theological reading of cybernetics by attributing the shortcomings of science to what he called the two “evils” of knowledge, i.e., the Manichean evil of deception and trickery and the Augustinian evil of chaos, randomness, and entropy. Through these two evils, Wiener reframed science as an adversarial endeavor against the limits of knowledge while providing two convenient concepts to mediate the intellectual landscape of not only knowledge and science, but also command and control. In many regards, the McCulloch-Pitts model offered a powerful experimental framework to apply cybernetics’ adversarial epistemology; after laying out the reformulation of knowledge fostered by the McCulloch-Pitts model, this text will situate this model at the intersection of Wiener’s two evils in order to account for the emergence of an adversarial epistemology conflating the limits of knowledge with the limits of control.
In the third section, this chapter will examine the persistence of this adversarial epistemology by discussing how the limits of knowledge are operationalized in the current literature on neural networks. Neural networks enable many of the defining functions of today’s artificial intelligence systems, which has led the blind spots, biases, and failures of these machine learning models to become the objects of study of an ever-growing literature. More specifically, neural networks’ vulnerability to adversarial examples—i.e., malicious alterations of inputs that cause neural networks to misclassify them—is now a core concern bridging research in machine learning and cybersecurity. While cybernetics’ Manichean framework implied externalized and identifiable opponents (e.g., enemy pilots, Cold Warriors, etc.), the current literature on neural networks frames adversarial examples as both a threat and a privileged tool to increase the accuracy of these systems’ learning model. Using adversarial examples as an entry point into the operationalization of the limits of neural networks’ epistemology, this text will investigate how today’s literature on the topic reframes the failures, biases, and errors produced by neural networks as constitutive of learning systems. In so doing, it will attempt to reframe neural networks as manifestations of a larger adversarial moment in which all errors, failures, and even critiques are conceived of as necessary steps—or necessary evils—in the development of machine learning models with comprehensive epistemologies.
The McCulloch-Pitts Model: A Physiological Study of Knowledge
The McCulloch-Pitts model, as first introduced in “A Logical Calculus,” can be roughly described as a network of interconnected neural units acting like logic gates. Each unit is deemed to be in a quantifiable state (excitatory or inhibitory) at all times, which modulates as a function of the sum of the unit’s inputs: if the sum of the unit’s inputs exceeds a certain threshold value, the unit will be in an excited state and produce a positive value in return; if it does not, the unit will be in an inhibitory state and produce a null or negative value. Based on a voltage input, each neuron can then either fire and pass an impulse along or not fire and inhibit further excitation.
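To make the mechanism concrete, here is a minimal sketch of such a unit in Python; the signed weights stand in for excitatory and inhibitory connections and are a simplification of the original formulation, in which inhibition was absolute rather than weighted.

```python
def mp_unit(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the binary inputs reaches the
    unit's threshold; otherwise remain silent (return 0)."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Wired together, such all-or-none units behave like logic gates and can
# therefore express logical propositions:
AND = lambda a, b: mp_unit([a, b], [1, 1], threshold=2)
OR  = lambda a, b: mp_unit([a, b], [1, 1], threshold=1)
NOT = lambda a:    mp_unit([a],    [-1],   threshold=0)

assert AND(1, 1) == 1 and AND(1, 0) == 0
assert OR(0, 1) == 1 and OR(0, 0) == 0
assert NOT(1) == 0 and NOT(0) == 1
```

Networks assembled from such gates can express any finite logical proposition, which is the sense in which McCulloch and Pitts equated nervous activity with a calculus of propositions.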
As initially enunciated by Claude Shannon, however, what is being communicated from one unit to the other is more than a voltage value. In “A Symbolic Analysis of Relay and Switching Circuits” (1938), Shannon theorized that Boolean algebra—a subfield of algebra in which the value of all variables is reduced to true and false statements or, as they are usually denoted, 1s and 0s—could be applied to the design of circuits to develop a calculus that is “exactly analogous to the calculus of propositions used in the symbolic study of logic” (p. 471). When a unit fires, it does not so much communicate its value as its state, which can be interpreted within the calculus of propositions it represents. For instance, if a unit is equated with proposition X, that unit’s state can both express whether X is true or false and influence whether the other propositions to which it is logically related are themselves true or false (Shannon, 1938, p. 475). Similarly, McCulloch and Pitts asserted that “the response of any neuron” could be described as “factually equivalent to a proposition which proposed its adequate stimulus” (1943, p. 117); connected together, they claimed, these neural units could then form complex networks capable of expressing any logical proposition independently of the actions, and potential failures, of individual neurons, leading them to conclude that the “physiological relations existing among nervous activities correspond … to relations among propositions” (1943, p. 117).
Despite these similarities, the McCulloch-Pitts model was far from a simple rearticulation of Shannon’s framework in neurological terms. By describing biological neurons and their electronic counterparts as both subject to “continuous changes in threshold related to electrical and chemical variables” (1943, p. 117), McCulloch and Pitts introduced the possibility of evolution and adaptation within networks of binary devices. These thresholds referred to a value “determined by the neuron” that had to be exceeded “to initiate an impulse” (1943, p. 115). More importantly, these thresholds could change and regulate themselves depending on what was being communicated. “An inhibitory synapse,” they wrote, “does not absolutely prevent the firing of the neuron, but merely raises its threshold” (1943, p. 123), meaning that a given proposition could change the very conditions for other propositions to be true. In that sense, instead of simply expressing logical propositions, McCulloch and Pitts’ networks could alter and modify their structure in a way akin to learning—a phenomenon they described as the process “in which activities concurrent at some previous time have altered the net permanently” (1943, p. 117). Through the introduction of adjustable thresholds, McCulloch and Pitts thus not only instituted self-regulation as a property of networks of binary devices, but also reframed the mind as a distributed phenomenon emerging from the interactions among simple logic gates.
While Shannon articulated his theory of logical relationships in digital circuits with regard to its applications in electrical engineering, McCulloch and Pitts thought of their model as the cornerstone of a new science of the mind. By providing a quantitative framework through which the mind could be captured as a localizable object of study, the McCulloch-Pitts model instantiated a clear break from the types of behavioral and psychoanalytic research in vogue at the time. While the contribution of the McCulloch-Pitts model to the development of cognitive science has been discussed by many science studies scholars (e.g., Dupuy, 1994; Kay, 2001), cognitive scientists tend to emphasize neural networks’ deviation from the field’s founding models. In their influential critique of neural networks, for instance, Jerry Fodor and Zenon Pylyshyn (1988) differentiate the approach inaugurated by McCulloch and Pitts from the “classical models of the mind [which] were derived from the structure of Turing and Von Neumann machines” (p. 4). These classical models might not have been “committed to the details of these machines,” they specify, but were nevertheless reliant on the basic assumption “that the kind of computing that is relevant to understanding cognition involves operations of symbols” (1988, p. 4), thus establishing an analogical relationship between brains and computing machines.
In contrast, the McCulloch-Pitts model posited that no meaningful distinction between brains and computers could be established from the perspective of knowledge. Rather than conceiving of computers as facilitating the study of the brain, their model provided a shared framework for the study of both brains and computers. This idea is in many ways illustrated by McCulloch and Pitts’ limited interest in all questions pertaining to machines and computers per se. In “What Is a Number, that a Man May Know It, and a Man, that He May Know a Number?” (1960), for instance, McCulloch wrote that his investment “in all problems of communication in men and machines” was limited to the way they offered quantifiable manifestations of “the functional organization of the nervous system” in contexts where “knowledge was as necessary as it was insufficient” (p. 7). In that sense, what interested McCulloch in the study of computing was not so much its unique properties or even those relevant to brain mechanisms, but rather how, once studied alongside the mind, it could provide an experimental setting to reproduce and examine processes that could not be studied in the brain itself. For McCulloch, computing was thus a useful object of study insofar as it contributed to his larger project of “reduc[ing] epistemology to an experimental science” (1960, p. 7).
The McCulloch-Pitts model can then be understood as an instance of what Seb Franklin (2015) calls cybernetics’ deployment “of digitality as a logic that extends beyond the computing machine” (p. 47). “In this world,” McCulloch and John Pfeiffer (1949) wrote, “it seems best to handle even apparent continuities as some numbers of some little steps” (p. 368). By supporting an approach in which all processes were transposed into systems of “ultimate units,” McCulloch framed computers not as machines, but rather as a quantitative framework whose advances opened the way “to better understanding of the working of our brains” (1949, p. 368) via the discretization of their mechanisms. Yet, this digital logic was also transformed by its mobilization within McCulloch and Pitts’ experimental epistemology. As they articulated a mathematical model in which brain mechanisms and computing processes were projected onto a shared quantitative framework, McCulloch and Pitts inaugurated a model of knowledge acquisition that was both independent from any given substrate (e.g., biological neurons, computer circuits, etc.) and representative of the embodied qualities of the logical principles underpinning knowledge (e.g., neuron-like structure, weighted connections, etc.). By describing knowledge acquisition as a process that could be exhibited by any substrate capable of embodying certain logical principles, McCulloch and Pitts constituted the mind and computing as functional models for one another as well as reframed their model’s digital framework as a set of embodied principles. In that sense, they not only reconceived the study of the mind as an experimental science, but more fundamentally redefined it as an “inquiry into the physiological substrate of knowledge” itself (McCulloch, 1960, p. 7).
The “neural” dimension of the McCulloch-Pitts model did not then refer to any physical properties of the brain specifically; it indeed encompassed the brain and its structure, but it also comprised computer circuits and all mathematical models based on a binary logic. Rather, McCulloch and Pitts’ engagement with neurality was more directly concerned with how neural or neuron-like structures—here broadly conceived of as networks of densely interconnected binary devices—could exhibit and even produce knowledge. As they noted in “How We Know Universals” (Pitts & McCulloch, 1947), neural networks provided what McCulloch and Pitts thought was the ideal configuration “to classify information according to useful common characters” (p. 127). While linear models might be vulnerable to small perturbations in the inputs they process, neural networks’ distributed structure allowed them to “recognize figures in such a way as to produce the same output for every input belonging to the figure” (1947, p. 128). For McCulloch and Pitts (1943), the problem of knowledge was thus intimately linked to determining the type of network structures capable of withstanding the stochastic character of the world by filtering out noise and selectively taking information in, leading them to conclude that, “with determination of the net, the unknowable object of knowledge, the ‘thing in itself,’ ceases to be unknowable” (p. 131).
Cybernetics, which Norbert Wiener (1948) would introduce a few years later as the field dedicated to the scientific study of “control and communication theory, whether in the machine or in the animal” (p. 11), shared many of the concerns and assumptions underpinning the McCulloch-Pitts model. For instance, in addition to catalyzing a similar displacement of historical categories such as the biological and the computational in favor of functional abstractions, cybernetics likewise conflated knowledge and order via a reformulation of information as “a temporal and local reversal of the normal direction of entropy” (Wiener, 1954, p. 25). Some scholars have attended to Wiener’s influence on the development of the McCulloch-Pitts model (Abraham, 2002; Halpern, 2012), whereas others have described how this model provided a blueprint for the type of knowledge valued by cyberneticists (Aizawa, 2012; Schlatter & Aizawa, 2008). Yet, in light of such shared epistemic values, it might be more appropriate to understand cybernetics and the McCulloch-Pitts model as manifestations of a larger epistemological moment in which science itself was reframed as an endeavor against uncertainty. In The Human Use of Human Beings (1954), for instance, Wiener described science as “play[ing] a game against [its] arch enemy, disorganization” (p. 34), thus proposing an epistemological framework in which knowledge and knowing are linked together against some adversarial force. To produce knowledge and maintain order, Wiener asserted, scientists engage with knowledge’s limits as an adversary that must be gradually conquered—but who or what is this evil? As Wiener put it: “Is this devil Manichean or Augustinian?” (p. 34). From there, Wiener proceeded by differentiating what he saw as the two opponents against which cybernetics was striving: the Augustinian evil of chaos, randomness, and entropy and the Manichean evil of deception and trickery.
Norbert Wiener’s Two Evils: Adversariality as an Epistemology
Throughout his oeuvre, Wiener repeatedly came back to the problem of evil; yet, as his preoccupations vis-à-vis cybernetics changed through time, so did his understanding of the nature of that evil. In Cybernetics (1948), Wiener first situated his new field within a Manichean historical moment, where the field’s new developments had opened up “unbounded possibilities for good and for evil” (p. 27). Later, in The Human Use of Human Beings (1954), he described cybernetics as a response to “this random element, this organic incompleteness … we may consider evil” (p. 11), which had been uncovered by Josiah Willard Gibbs’ statistical mechanics. Instead of displacing one another, however, these two definitions of evil persisted in his work and were eventually formalized into two distinct figures: a Manichean and an Augustinian evil.
Wiener’s Augustinian evil refers to “the passive resistance of nature” (1954, p. 36) to its capture as an object of knowledge. While Wiener celebrated science’s growing mastery over nature, he also emphasized the mutual irreducibility of nature and science—chaos, randomness, and entropy all destabilize the certainties science strives to acquire and thus function as manifestations of nature’s resistance to revealing itself. Yet, these manifestations do not point to any external or identifiable opponent. “In Augustinianism,” Wiener specified, “the black of the world is negative and is the mere absence of white” (p. 190). In line with St. Augustine’s own definition, Wiener described this evil as an adversarial force constitutive of the world insofar as it is the direct product of its incompleteness. By incompleteness, he referred to nature’s disregard for its own laws; while Gibbs and others had already displaced the neatly organized universe of Newtonian physics with a more chaotic one best modelled statistically, Wiener reframed this “recognition of a fundamental element of chance in the universe” into a manifestation of “an irrationality in the world” (p. 11). For him, if nature demonstrated entropic tendencies, it was not so much because it opposed order as because it lacked order.
At the same time, however, Wiener (1954) refused to recognize this incompleteness or resistance as an inalienable feature of nature. In fact, it was nature’s incompleteness that, for him, allowed science to resist principles as fundamental as “the characteristic tendency of entropy … to increase” (p. 12). In that sense, the Augustinian evil “is not a power in itself,” he specified, “but the measure of our own weakness” (p. 35). Nature’s policy might be hard to decipher, but it can nevertheless be revealed, and “when we have uncovered it, we have in a certain sense exorcised it” (p. 35). Wiener thus also conceived of this evil as a sign of the incompleteness of science’s own tools and knowledge. This evil might manifest itself each time science is confronted by nature’s incompleteness, but it also points to the possibility of order being established via the fulfilment of science. As emphasized by Wiener, the Augustinian evil “plays a difficult game, but he may be defeated by our intelligence as thoroughly as by a sprinkle of holy water” (p. 35). In that sense, once nature’s incompleteness is overcome by science, it is assumed that its adversarial qualities will disappear altogether, linking the progression of science to the restoration of order in nature. Wiener’s Augustinian evil can then be situated within a larger intellectual history in which science is defined by its capacity to impose order on nature. Referring to Francis Bacon’s vexation of nature, Wiener redefined knowledge as “something on which we can act [rather] than something which we can prove” (p. 193). It is on this point that Wiener’s Augustinian evil differs from that of St. Augustine; while the latter instituted disorder as a fundamental quality of the world, the former posited order as something to be established. In that sense, if Wiener indeed acknowledged some sort of resistance on behalf of nature, it was insofar as this resistance was assumed to eventually give way.
By situating science within an Augustinian framework opposing organization and chaos, Wiener established a regime of knowledge aimed at overcoming nature’s entropic propensities. While Wiener (1954) recognized “nature’s statistical tendency to disorder” (p. 28) at the level of the universe, he also emphasized the role of information in creating “islands of locally decreasing entropy” (p. 39). Human beings, for instance, not only “take in food, which generates energy,” but also “take in information” and “act on [the] information received” (p. 28) to ensure their survival. Machines, for Wiener, similarly “contribute to a local and temporary building up of information” (p. 31) by sharing living organisms’ “ability to make decisions” and produce “a local zone of organization in a world whose general tendency is to run down” (p. 34). Information, for Wiener, could then be understood as not only the opposite of entropy—i.e., negentropy—but also the ideal unit for a type of knowledge that participates “in a continuous stream of influences from the outer world and acts on the outer world” (p. 122). Knowledge, in the context of Wiener’s Augustinian framework, was thus reformulated into the production of a localized, cybernetically enforced order against chaos and disorganization.
In Augustinianism, order can indeed be established but is defined as much by what fits within the “islands of locally decreasing entropy” produced by information as by what lies outside these islands. In his autobiography, I Am a Mathematician (1964), Wiener later commented that, in the face of “the great torrent of disorganization,” “our main obligation is to establish arbitrary enclaves of order and system” (p. 324). By emphasizing the arbitrariness of such enclaves, Wiener highlighted not only the functional nature of cybernetics’ defining principles, but also how order is produced by exteriorizing the disorganization against which it is mobilized. McCulloch (1950), for his part, shared Wiener’s definition of information as “orderliness” (p. 193) yet understood the type of limits Wiener described in Augustinian terms as constitutive of learning systems. In “Why the Mind Is in the Head” (1950), McCulloch argued that “our knowledge of the world” is limited “by the law that information may not increase on going through brains, or computing machines” (p. 193). New connections might “set the stage for others yet to come” (p. 205), but the limits of learning systems’ capacity to process information remained for him the fundamental Augustinian limit against which learning took place.
By equating the limits of knowledge with the limits on what information networks can process, the McCulloch-Pitts model internalized the adversarial limits described by Wiener. Whereas Wiener conceived of knowledge’s Augustinian limits as a measure of science’s incompleteness, McCulloch framed such limits as making knowledge itself possible. For McCulloch (1950), all knowledge was the result of how networks are wired; as he wrote, “we can inherit only the general scheme of the structure of our brains. The rest must be left to chance. Chance includes experience which engenders learning” (p. 203). In this regard, McCulloch in some ways anticipated the defining features of later theories of cybernetics by framing chance and randomness as key tools for the production of knowledge. As argued by Jeremy Walker and Melinda Cooper (2011), second-order cybernetics reframed disorder as a fundamental principle of organization by theorizing systems that could “internalize and neutralize all external challenges to their existence” (p. 157). Similarly, by arguing that the mind is in the head because “only there are hosts of possible connections to be formed” (1950, pp. 204–205), McCulloch described a model of knowledge in which all knowledge was produced by the creation of new connections mirroring the stochasticity of the inputs coming from outside. That way, while Wiener established a functional relationship between knowledge and its objects, McCulloch restored an idealized correspondence among them by framing knowledge’s Augustinian limits as internal to learning systems. From the perspective of Wiener’s Augustinian framework, McCulloch’s experimental epistemology thus instantiated a larger reformulation of knowledge from a functional endeavor against the stochasticity of the world to a type of order built out of stochasticity.
Wiener’s Manichean evil, comparatively, breaks away from this larger intellectual history in which order and chaos, science and nature oppose one another. If Augustinianism refers to randomness, entropy, and, more generally, the incompleteness of nature, Manicheanism rather consists in a “positive malicious evil” (Wiener, 1954, p. 11). While nature cannot actively cover nor alter its structure, the Manichean evil can, and will, “keep his policy of confusion secret and … change it in order to keep us in the dark” (p. 35). The Manichean evil, in that sense, does not then so much refer to any given object of knowledge as to historically situated opponents against which one produces knowledge (e.g., the Soviet scientist, the Cold Warrior, the enemy spy, etc.). Knowledge, in that framework, is conceived of as a strategic advantage that must be gained in order to secure one’s victory against an active opponent who will reciprocally use “any trick of craftiness or dissimulation to obtain this victory” (p. 34).
Contrary to Augustinianism, Wiener’s conceptualization of a Manichean evil appears to be strongly anchored in the historical setting in which cybernetics took form. As argued by Lily Kay (2001), the development of cybernetics as “a new science of communication and control” had “enormous potential for industrial automation and military power” and was actively fueled by the escalation of Cold War tensions (p. 591). Similarly, McCulloch and Pitts’ reformulation of the mind as a system of “decisions and signals” bore great potential for military funders, she adds, as it opened up many new opportunities “for automated military technologies of the postwar era” (pp. 591–593). In the context of the Cold War, the Manichean limits of knowledge thus encompassed not only what was yet-to-be-known but also the actors against which science was mobilized. In Manicheanism, science is then defined not so much by its exclusion of nature as by its adversarial relationship with some identifiable and historically situated Other. In this framework, Wiener (1954) concluded, “the black of the world” does not refer to “the mere absence of white”; rather, “white and black belong to two opposed armies drawn up in line facing one another” (p. 190).
While the Augustinian evil broadly consists in the transposition of an epistemological framework into a set of situated practices, the Manichean evil refers to the transposition of the military-industrial complex’s influence on Cold War science into a full-on scientific epistemology. Many of the early thinkers of cybernetics were first introduced to engineering and other applied sciences via their contribution to the war effort; for instance, building upon their work on servomechanisms and anti-aircraft turrets during World War II, Wiener and his colleagues proposed in “Behavior, Purpose and Teleology” (Rosenblueth, Wiener, & Bigelow, 1943) a new mode of representation in which human operators’ behavioral processes and machines’ mechanical responses were modelled into unified control systems. Yet, as Peter Galison (1994) points out, the larger implications of these new modes of representation and of the Manichean framework producing them went well beyond their manifestations on the battlefields of WWII. For Galison, the key innovation of cybernetics did not so much consist in modelling humans and machines together in the context of “the Manichean field of science-assisted warfare” (p. 251) as in how it subsequently decontextualized and expanded these functional equivalences between humans and machines into “a philosophy of nature” (p. 233).
The Manichean framework might then refer to a specific historical moment, but also to a reformulation of historical categories in light of new needs and imperatives. As cybernetics and Cold War militarism developed alongside one another, the laboratory and the battlefield quickly emerged as interchangeable settings in terms of how knowledge was redefined as an adversarial endeavor. If humans and machines were suddenly folded together into unified systems, it was not because they were deemed ontologically equivalent, but rather because they could more easily be intervened on once theorized that way. After the war for instance, Wiener and his lifelong collaborator Arturo Rosenblueth claimed in “Purposeful and Non-Purposeful Behavior” (Rosenblueth & Wiener, 1950) that “the only fruitful methods for the study of human and animal behavior are the methods applicable to the behavior of mechanical objects as well” (p. 326). Rejecting the question of “whether machines are or can be like men,” they concluded that, “as objects of scientific enquiry, humans do not differ from machines” (p. 326). In the lab as in the battlefield, new classes and categories were thus established not as an attempt to counterbalance nature’s entropic propensities, but rather in accordance with what was deemed the most “fruitful” from a Manichean perspective.
Similarly, the McCulloch-Pitts model’s reformulation of the mind into an object of experimental research was indistinguishable from the constitution of brains and machines as functionally and epistemologically equivalent. As McCulloch continued working on the nervous system throughout the years, neural networks’ experimental framework proved especially conducive to the ideals of control associated with the Cold War’s Manichean atmosphere. In his physiological research on nerves for instance, McCulloch (1966) claimed that the inner workings of the mind could be best understood once modelled as a system of command and control. Later, in “The Reticular Formation Command and Control System” (Kilmer & McCulloch, 1969), McCulloch and William Kilmer further expanded on this idea by arguing that the basic computation of the nervous system affords an effective organization for both the design and control of intricate networked systems. In both cases, the physiological properties of the biological brain were overlooked in favor of an abstract account of neural activity as an optimal design for control. For McCulloch, to construct a system in neural or nervous terms thus not only provided an optimal configuration to exert control upon it, but also instituted the military ideals of command and control into fundamental principles for the organization of learning systems. In the context of McCulloch’s work, Wiener’s Manichean evil did not then so much take the form of historically situated opponents but of a reformulation of humans and machines into control systems.
Wiener’s Manichean and Augustinian evils might imply distinct practices and ideals of knowledge, but their cohabitation in McCulloch’s experimental epistemology hinted toward a larger adversarial framework encompassing both evils. By reframing any limits to knowledge as internal, structural limits and by reformulating humans and machines as control systems, the McCulloch-Pitts model not only dissolved any clear boundary between these two evils, but also instituted their adversarial qualities into fundamental principles of knowledge. In the closing section of The Human Use of Human Beings, Wiener (1954) accounted for this slippage between the two by asserting that “the Augustinian position has always been difficult to maintain” and “tends under the slightest perturbation to break down into a covert Manicheanism” (p. 191). While the Manichean evil might be intimately linked to the historical context of the Cold War, Wiener acknowledged that there were elements of Manicheanism in settings preceding that period; there was “a subtle emotional Manicheanism implicit in all crusades” (p. 190), Wiener wrote, which culminated in Marxism and fascism, two manifestations of a Manichean evil whose unprecedented scale has “grown up in an atmosphere of combat and conflict” (p. 192). In that sense, while the threat of Manicheanism might have a longer history than this evil’s Cold War era manifestations imply, its expansion into a scientific epistemology remained for Wiener a recent invention.
Wiener’s two evils do not then refer to distinct epistemological frameworks as much as to the reformulation of an Augustinian intellectual history in the context of an historically situated Manichean moment. While Wiener recognized, and warned against, the slippage between these two evils, he also linked the advent of his new science to the growing proximity between them; Augustinianism might accommodate a certain indeterminacy—“an irrationality in the world,” as Wiener termed it—but its Manichean reformulation does not. Conversely, whereas Manicheanism depicts a world where competing groups, armies, or systems persist by being in opposition, its Augustinian reformulation challenges the possibility of meaningfully distinguishing such entities from one another. At the intersection of these two evils thus lies a displacement of historical categories in favor of a larger reformulation of how knowledge is produced. As Katherine Hayles (1999) points out, cybernetics’ constitution of “the human in terms of the machine” (p. 64) was key in bringing brains and computers together under a unified model of control systems. Similarly, as Claus Pias (2008) argues, cybernetics had to operate less like a discipline and more like “an epistemology … [that] becomes activated within disciplines” (p. 111) for philosophy, engineering, neurophysiology, and mathematics to be folded into a new science of control and communication.
The McCulloch-Pitts model can then be understood as an epistemologically situated set of practices informed by cybernetics’ dissolution of disciplinary boundaries and historical categories; yet, its experimental framework also appears to have transformed many of cybernetics’ principles by providing a functional abstraction to implement them in a whole new range of settings, including the study of the perception of forms (Pitts & McCulloch, 1947) and the clinical treatment of neurosis (McCulloch, 1949). By producing the type of experimental results that were expected in the Manichean setting of the Cold War, the McCulloch-Pitts model reconfigured some of the defining categories through which knowledge’s Augustinian limits were conceived and operationalized. That way, while cybernetics’ dissolution of historical categories already hinted toward a Manichean ideal of control, the McCulloch-Pitts model further expanded this shift by turning the military ideals of command and control into fundamental principles for the organization of systems. From that perspective, both the McCulloch-Pitts model and cybernetics emerge as manifestations of a shared adversarial epistemology in which all knowledge is reframed as a set of practices and ideals of control aimed at resisting, countering, and overcoming the limits of knowledge.
Deep Neural Networks: Operationalizing the Limits of Knowledge
In the year of McCulloch’s death, Marvin Minsky and Seymour Papert—the first of whom was introduced to the field of artificial intelligence by studying neural networks (Minsky, 1954)—published Perceptrons (1969), the first comprehensive critique of biologically inspired models like the one proposed by McCulloch and Pitts. For many historians of science (e.g., Edwards, 1996) and computer scientists (e.g., Goodfellow et al., 2016), Perceptrons constituted the first major backlash against neural networks and propelled their subsequent marginalization in the fields of psychology and computer science (e.g., Guice, 1998; Olazaran, 1996). It is at this point that most historical accounts abandon neural networks. In computer science and the history of science alike, neural networks are described as relegated to the footnotes of computer science from that point onward until advances in processing power catalyzed their reemergence in the late 1980s by enabling their large-scale implementation in software environments (e.g., Nagy, 1991; Nilsson, 2009).
This way of narrating the development of neural networks through the lens of these two moments—i.e., their fall from grace and subsequent reemergence—is fairly commonplace in most histories of computing and points to a larger habit of framing the shortcomings of machine learning as temporary obstacles. Yet, by focusing on neural networks’ manifestations in computer science, these accounts overlook the McCulloch-Pitts model’s second life in the then nascent field of systems theory. As the second generation of cyberneticists abandoned their field’s initial emphasis on “homeostatic and purposive behavior” (McCulloch, 1956, p. 147), neural networks became a recurrent framework to study complex systems’ autopoietic, self-generating processes. Among the main figures of second-order cybernetics, theoretical biologist Humberto Maturana was arguably the first to use the McCulloch-Pitts model to study systems’ structural differentiation from their environment. First introduced to neural networks while working with McCulloch and others on the neurophysiology of vision (Lettvin, Maturana, McCulloch, & Pitts, 1959), Maturana reinterpreted McCulloch and Pitts’ neural model as a privileged framework to represent the operations underpinning the ontogenesis of organisms (Maturana & Varela, 1972, pp. 122–123) as well as the structural coupling between organisms and their social domain (Maturana, 1978, pp. 48–50). Later, while investigating the applicability of Maturana’s autopoietic turn to social systems, sociologist Niklas Luhmann (1995) revisited McCulloch and Pitts’ “self-referential net of contacts” to illustrate how systems sustain themselves by “achieving position in relation to the environment” (pp. 197–198).
While overlooked by most historical accounts, this reformulation of the McCulloch-Pitts model into a framework for the study of adaptive systems not only bridges neural networks’ initial conceptualization as a model of the mind and their later reemergence as machine learning models, but it also undermines the possibility of articulating a coherent history of neural networks from the perspective of these two moments alone. By the time the University of Toronto, New York University, and Université de Montréal became recognized in the 1990s as key centers of a new AI renaissance due to their work on neural networks as effective tools to “build complex concepts out of simpler concepts” (Goodfellow et al., 2016, p. 5), many scholars working on autopoietic processes had already reframed neural networks as a paradigmatic model for self-organization. This transformation is key, for it anticipated many of the conceptual foundations of today’s machine learning literature. For instance, by reframing neural networks as effective models to study how complex systems maintain themselves against external perturbations, systems theorists preempted machine learning’s conceptualization of adversariality as a fundamental principle for the organization of learning systems.
In computer science, neural networks now refer to a type of machine learning model composed of multiple layers of specialized neural units whose parameters are determined by the task each network is trained to fulfill (Goodfellow et al., 2016, pp. 5–6). For example, a neural network trained for image classification would be given millions of images belonging to preestablished categories—“dog,” “cat,” “human,” etc.—and then tasked with extracting representative patterns for each of them. As the network processes its training dataset, its neurons come to specialize in recognizing specific combinations of features and acquire weighted connection values based on the representativeness of the identified features for each category. Once trained, the network would use these acquired internal representations to classify new input images within these categories.
In today’s context, neural networks can thus be understood as implementations in code of the original McCulloch-Pitts model with some notable additions: the weights of the connections are automatically adjusted by the networks based on available data (Rosenblatt, 1961), the networks include one or more hidden layers of neurons that perform most of the model’s computations (Rumelhart, Hinton, & Williams, 1986), and the models now often involve an emphasis on depth, i.e., an ever-growing number of layers, which allows the networks to perform increasingly abstract tasks (Hinton, Osindero, & Teh, 2006). From a physiological model of knowledge across substrates, neural networks have therefore evolved into a powerful model to operationalize a wide range of tasks by extracting patterns and generalized representations from large amounts of data. In that sense, while the McCulloch-Pitts model was initially conceived as an attempt to articulate a universal model of knowledge, today’s neural networks can rather be understood as a versatile framework to operationalize specific forms of knowledge adapted to the growing range of settings in which they are implemented.
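As a purely illustrative sketch of these additions, and not a reconstruction of any cited system, the short program below (assuming only NumPy, with a toy task and arbitrary hyperparameters) trains a network with one hidden layer whose connection weights are adjusted automatically from the data:

```python
import numpy as np

# A toy network with one hidden layer, trained by backpropagation on XOR.
# The task, layer sizes, learning rate, and number of steps are illustrative only.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # training inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # target categories

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)  # input layer  -> hidden layer
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)  # hidden layer -> output layer
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    # Forward pass: each layer computes weighted sums followed by a nonlinearity.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: the error signal is propagated back through the layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # The network adjusts its own connection weights from the data.
    W2 -= 1.0 * (h.T @ d_out); b2 -= 1.0 * d_out.sum(axis=0)
    W1 -= 1.0 * (X.T @ d_h);   b1 -= 1.0 * d_h.sum(axis=0)

print(out.round().ravel())  # typically converges to the XOR pattern [0, 1, 1, 0]
```

The “depth” emphasized in the current literature amounts to stacking many such hidden layers between input and output.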
Neural networks might have gone from a model of the mind to an epistemological framework based on pattern extraction, but their limits still appear to be framed in adversarial terms. In their influential piece “Intriguing Properties of Neural Networks” (Szegedy et al., 2014), for instance, computer scientist Christian Szegedy and his colleagues describe a puzzling discovery: state-of-the-art neural networks might be able to master incredibly complex tasks, but can also misclassify data—be they images, audio inputs, spatial cues, etc.—that are only marginally different from those that they adequately classify. While such localized failures might hardly be surprising for any probabilistic framework, Szegedy and his colleagues demonstrate that these perturbed inputs generally cause other networks trained to fulfill the same task (e.g., recognizing objects in bitmap images) to make similar misclassifications, even if they are “trained with different hyperparameters or … on a different set of examples” (p. 2).
Termed “adversarial examples” by Szegedy et al. (p. 2), these accidental or voluntary alterations of inputs are known to introduce targeted perturbations that lead neural networks to misclassify the resulting data without hindering humans’ capacity to categorize them correctly. Since Szegedy’s original paper, a growing scholarship has emerged around these targeted alterations, which are now studied for how they “expose fundamental blind spots in our training algorithms” (Goodfellow, Shlens, & Szegedy, 2015, p. 1) or, in other words, the limits of neural networks’ epistemology. More specifically, neural networks’ vulnerability to adversarial examples has become a central object of research in two main subfields of computer science: cybersecurity and machine learning. With cybersecurity implying locatable attackers and machine learning statistically representing the realities in which it operates, these two bodies of work might a priori seem to mirror the divide between Manicheanism and Augustinianism; yet, the way they both reduce any limits to neural networks’ epistemology to internal, technical limitations hints toward a third adversarial framework that supplements the dialectic between Wiener’s two evils.
The cybersecurity literature covers many types of attacks that involve inputs that could be characterized as adversarial (e.g., SQL injection, buffer overflow, etc.), but adversarial examples differ from such attacks in that they can target a system without having direct access to it. Many scholars have studied adversarial examples from the perspective of the risks they represent with regard to the implementation of neural networks in real-world settings. For instance, Alexey Kurakin and his colleagues (Kurakin, Goodfellow, & Bengio, 2017a) have demonstrated that adversarial examples encountered through video signals and other input channels are misclassified almost as systematically as those fed directly into the targeted machine learning model. While real-world applications of adversarial examples by malicious parties are yet to be documented, they conclude that adversarial examples provide hypothetical opponents with a privileged means to bypass traditional security measures and perform attacks directly against current implementations of neural networks.
While Wiener described two evils adapted to an era of postindustrial warfare, Kurakin situates adversarial examples within a new defense rhetoric in which these attacks are conceived of as measures of the targeted systems’ “robustness” (Kurakin, Goodfellow, & Bengio, 2017b, p. 10). Gesturing at potential future iterations of such attacks—“attacks using … physical objects,” “attacks performed without access to the model’s parameters,” etc. (Kurakin et al., 2017a, p. 10)—Kurakin frames adversarial examples as requiring a preemptive response from the machine learning models targeted by these attacks. In line with a growing number of scholars (Gu & Rigazio, 2015; Huang, Xu, Schuurmans, & Szepesvári, 2016), Kurakin advocates for the integration of adversarial examples into neural networks’ training datasets in order not only to make them “more robust to attack” (Kurakin et al., 2017b, p. 1) but also to expand the limits of their learning model. Adversarial examples might then indeed imply an active opponent; yet, their effectiveness remains more intimately linked to the actual limits of neural networks’ epistemology. In that sense, by emphasizing the hypothetical malicious parties behind adversarial examples instead of the structural limits of neural networks’ epistemology, this literature assumes that these limits can always be pushed back as long as they are attributed to attackers. That way, the more explicit neural networks’ Manichean adversary is, the more understated their Augustinian limits become.
The cybersecurity branch of the literature on adversarial examples thus appears to reaffirm a distinction between Wiener’s two adversarial conceptualizations of the limits of knowledge but does so by mistaking neural networks’ Augustinian limits for an active opponent. The machine learning literature on the topic, for its part, seems to further expand this conflation by rooting these systems’ claim for knowledge in their internalization of the limits of knowledge, thus dissolving the need for two distinct evils altogether. In machine learning, many have documented the great asymmetry between the proliferation of studies on the different types of adversarial examples and the comparatively slow progress of those attempting to develop defenses against them (e.g., Carlini et al., 2019). However, instead of designing specific defense strategies for each form of attack, a growing number of researchers now use adversarial examples to acquire a better understanding of neural networks’ learning model. In “Explaining and Harnessing Adversarial Examples” (2015), for instance, Ian Goodfellow, Jonathan Shlens, and Christian Szegedy analyze the types of blind spots exploited by adversarial examples and conclude that neural networks’ vulnerability to these attacks can be best explained by hypothesizing that neural networks all share a similar form of “linear behavior in high-dimensional spaces” (p. 1).
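For illustration, a minimal sketch of the kind of gradient-based perturbation described in that paper follows; the linear classifier, its weights, the input, and the perturbation size are all hypothetical, chosen here only because the linearity hypothesis makes a linear model the simplest case:

```python
import numpy as np

# A toy linear classifier with hypothetical, already "trained" weights.
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
w = np.array([2.0, -1.0, 3.0, -0.5])   # illustrative weights
x = np.array([0.5, 0.2, 0.6, 0.1])     # an input the model classifies correctly
y = 1.0                                # its true label

# Gradient of the cross-entropy loss with respect to the *input* (not the weights):
grad_x = (sigmoid(w @ x) - y) * w

# The adversarial example nudges every feature a small, fixed step in the
# direction that most increases the loss (a gradient-sign perturbation).
epsilon = 0.25
x_adv = x + epsilon * np.sign(grad_x)

# The predicted probability of the true class drops (here from ~0.93 to ~0.72);
# in high-dimensional inputs such as images, the many small per-feature steps
# accumulate into a confident misclassification.
print(sigmoid(w @ x), sigmoid(w @ x_adv))
```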
In addition to advocating for the introduction of adversarial examples into neural networks’ training datasets, Goodfellow and his colleagues (2015, pp. 4–6) reframe adversarial examples as powerful debugging tools—neural networks might be profoundly opaque once they are trained, but adversarial examples can nevertheless be used to test where the generalizations they acquire are faulty (p. 7). In that sense, while Kurakin describes adversarial examples as a threat, Goodfellow et al. frame them as an opportunity to expand the limits of neural networks’ epistemology. As in cybersecurity, the machine learning literature might then conceive of adversarial examples as pointing toward internal vulnerabilities, but, unlike this other body of work, it also reframes these vulnerabilities as an opportunity to once again improve neural networks’ learning model. From that perspective, nothing can truly lie beyond neural networks’ epistemology. If a network’s failures are systematically framed as opportunities to improve it, failures and limitations themselves become constitutive of the system they destabilize. The cybersecurity literature on the topic might imply some assumed opponent, but the tactics that are used to counter this antagonist are the same as the ones used in machine learning: adversarial examples are added to neural networks’ datasets in order to improve their learning model and expand their epistemology. In the context of this adversarial epistemology, there is no difference between an adversarial example produced by an attacker and one designed by a researcher—in either case, these failures are framed as constitutive of neural networks by providing the opportunity to improve future iterations of these systems.
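The shared tactic on which both literatures converge, adversarial training, can likewise be sketched minimally; the data, the toy linear model, and every hyperparameter below are invented for illustration, and the perturbation reuses the gradient-sign step sketched above:

```python
import numpy as np

# Hypothetical training data for a toy binary classification task.
rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
X = rng.normal(size=(200, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)

w = np.zeros(10)          # a toy linear classifier, trained from scratch
epsilon, lr = 0.1, 0.5

for step in range(500):
    # Generate adversarial versions of the current training inputs.
    p = sigmoid(X @ w)
    grad_X = (p - y)[:, None] * w            # loss gradient w.r.t. the inputs
    X_adv = X + epsilon * np.sign(grad_X)    # adversarial examples for this step

    # Fold the adversarial examples back into the data the model learns from.
    X_mix = np.vstack([X, X_adv])
    y_mix = np.concatenate([y, y])
    p_mix = sigmoid(X_mix @ w)
    w -= lr * X_mix.T @ (p_mix - y_mix) / len(y_mix)   # gradient step on the mix

# Whether an adversarial example comes from an attacker or from a researcher,
# the counter-move sketched here is the same: fold it back into the training set.
```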
What appears to link both the cybersecurity and machine learning literatures is thus a shared assumption that failures, errors, and limitations are in fact key principles for the organization of learning systems. This assumption can be linked back to the adversarial logic of cybernetics but with a key difference. For Wiener (1954), the distinction between the Manichean evil and its Augustinian counterpart would “make itself apparent in the tactics to be used against them” (p. 34). In the case of adversarial examples, however, there is no distinction based on tactics. Be it the product of a malicious opponent or a manifestation of neural networks’ poor performance with low-probability inputs, an adversarial example can only be countered by being introduced into a neural network’s training dataset. As long as neural networks’ failures are seen as temporary internal limits instead of external ones, no failure, misclassification, or blind spot can truly destabilize their epistemology; on the contrary, by being framed in adversarial terms, these failures appear as necessary milestones in the development of learning models with all-encompassing epistemologies. In that sense, whereas Wiener instituted a new science that was aimed at countering localizable threats (e.g., enemy regimes, disorganization, etc.), neural networks inaugurated a new epistemological framework in which the manifestations of the limits of these systems’ epistemology only further reaffirm their claim for knowledge.
Neural networks might have dissolved the need to distinguish between a Manichean and an Augustinian evil, but their internalization of the limits of knowledge nevertheless points toward a third adversarial framework—an internalized evil, as it will be named here—against which the production of knowledge is mobilized. This internalized evil can be broadly understood as a certitude that, given the right substrate or unit, all limits to knowledge can be internalized and thus made temporary. While McCulloch and Pitts conceived of that substrate as an experimental epistemology onto which all knowledge could be modelled, the current literature on neural networks posits that all perceptual or intellectual tasks can be reduced to sets of quantifiable and operationalizable patterns. In both cases, the limits that neural networks encounter are systematically reframed as temporary internal limits that can, and will, be invariably vanquished. From the perspective of this internalized evil, which traverses McCulloch’s work, second-order cybernetics, and current research in machine learning, all limits of knowledge then appear as internal to knowledge itself.
Adversarial examples might be especially well suited to illustrate this internalization of the limits of knowledge, but neural networks’ adversarial framework also encompasses the negative social results associated with these systems. Despite a growing body of work documenting how these systems perpetuate the biases and inequalities that underpin their social milieus (e.g., Richardson, Schultz, & Crawford, 2019; West, Whittaker, & Crawford, 2019), neural networks continue to be implemented in virtually all fields of human activity. As many scholars have demonstrated, these systems absorb the biases, assumptions, and inequalities implicit in the data on which they are trained and then act on their objects in accordance with them. Neural networks, for instance, systematize the gender and racial biases that underpin the contexts in which they are implemented (van Miltenburg, 2016), naturalize physiognomic tropes regarding the link between facial features and criminal intentions (Wu & Zhang, 2016), and reproduce essentialist readings of gender and sexuality (Wang & Kosinski, 2018). Yet, from the perspective of these systems’ adversarial epistemology, such negative social results are not so much the products of systemic conditions that need to be politically addressed as simple engineering problems that can be resolved through better training datasets and more processing power. Instead of highlighting the limits of neural networks’ epistemology, such perturbations—be they adversarial examples or social failures—are reframed as constitutive principles of the very systems they would otherwise destabilize.
By blurring the boundary between the malicious and the accidental, the external and the internal, neural networks’ adversarial epistemology reframes these failures as calling for better adversarial training rather than for restrictions on the application of these systems in sensitive contexts. That is not to say that neural networks do not also hold the potential to shed light on many of the biases, blind spots, assumptions, and systemic conditions that are implicit in their training. Machine learning makes painfully obvious many of the biases and structural conditions that underpin social relationships—yet, by dismissing such outcomes as blind spots, bugs, or glitches, researchers and technology providers misidentify these failures as internal to the systems they train rather than as constitutive of the social relations that shape the production and implementation of neural networks. In this regard, neural networks not only render the distinction between a Manichean and an Augustinian evil obsolete; they also internalize all errors, limits, failures, and even critiques by reconceiving them as necessary evils in the development of better machine learning models, forcing any external challenge to these systems to inhabit the adversarial framework it aims to subvert.
Conclusion: From a Physiological to a Computable Model of Knowledge
Machine learning, both as a field and as a technological ideal, is often depicted as intimately linked to a series of technical breakthroughs in processing and computing power that took place from the 1990s onward; yet each of its models relies on a longer history of ideals and practices of knowledge that risks being misread or overlooked if considered in computational terms only. Today, neural networks outperform other models on virtually all the tasks that define the field and have established many of the terms through which machine learning and its related disciplines are being studied. As hinted at throughout this text, however, neural networks can, and should, be conceived of as both an operational model and an epistemological framework; since Kurt Hornik and his colleagues’ demonstration that any function can be approximated by “multi-output multilayer feedforward networks” (Hornik, Stinchcombe, & White, 1989, p. 363), the literature on neural networks has overwhelmingly equated this property—the so-called universal approximation theorem—with a capacity to reproduce, or at least model, any intelligent behavior. However ambitious this leap from approximating functions to reproducing intelligent behavior might seem, it is akin to the shift from computing logical propositions to modelling the mind that McCulloch and Pitts’ neural model enabled roughly fifty years earlier.
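For reference, the property in question can be stated in its now-standard single-hidden-layer form, paraphrasing rather than quoting Hornik, Stinchcombe, and White: for any continuous function $f$ defined on a compact set $K \subset \mathbb{R}^n$, any squashing activation function $\sigma$, and any tolerance $\varepsilon > 0$, there exist a width $N$ and parameters $w_i, b_i \in \mathbb{R}$ and $v_i \in \mathbb{R}^n$ such that

    \sup_{x \in K} \Big| f(x) - \sum_{i=1}^{N} w_i \, \sigma(v_i^{\top} x + b_i) \Big| < \varepsilon.

The theorem guarantees the existence of such an approximating network; it says nothing about how its parameters might actually be learned from data.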
This chapter has attempted not only to challenge the dominant narrative according to which neural networks merely awaited greater computing power to reemerge and become widely adopted as machine learning models, but also to provide an overview of the adversarial epistemology underpinning these systems’ internalization of the limits of knowledge. The McCulloch-Pitts model inhabited a disciplinary landscape that was hardly reducible to computer science and from which computer science itself emerged. By the time machine learning was constituted as a field, neural networks had already been reframed as a functional model for the study of adaptive systems, thus providing a promising framework to operationalize a wide range of processes for which no formal explanation or description is available. Access to greater computing power might have allowed neural networks to produce the type of results that are now expected from learning systems, but it is first and foremost as an epistemological framework positing that all knowledge can be distilled into sets of computable units that this model should be conceived of—a framework that, this chapter has argued, involves an adversarial understanding of all limits to knowledge as internal to knowledge itself. The adversarial nature of McCulloch’s experimental epistemology might have been (and to some extent still remains) understated, but deep neural networks’ internalization of the limits of knowledge nevertheless appears to have brought cybernetics’ reformulation of knowledge as something “on which we can act” (Wiener, 1954, p. 193) to a fully operational level.
The adversarial epistemology in which these systems operate then appears to manifest itself through the slippages between theory and implementation, modelling and operationalization, and epistemology and experimentation that characterize the different forms, practices, and models generally associated with the term “neural networks.” In that sense, if the operations performed by neural networks can be defined as adversarial, it is not so much because they marginalize, antagonize, or exclude—which they of course do—but because they force all knowledge on which they intervene to inhabit their adversarial epistemology. By reframing any limit or social failure as a temporary technical problem, this adversarial epistemology not only enables a larger computational determinism that assumes all knowledge can be projected onto a computable substrate, but also equates the limits of that substrate with the limits of knowledge itself.