6. Data Hoards

The question, then, is this: do the life, thought, reproduction, and gender of embryos, clones, and trash—organic, inorganic, material, or informational—have anything to do with the commercial and governmental data mining and data hoarding that have occupied public attention over the past few years? Or, put more bluntly, does slime in fact intersect with surveillance? An intuitive answer to this question, of course, is no (and no). Although human cloning and data mining are both conventionally understood to pose extraordinary threats to liberal democratic ideals, it is difficult to see how addressing one might aid in addressing the other. Likewise, although the problem of what, precisely, constitutes waste—the stuff that can be thrown away—is an issue implicitly underlying the rhetoric surrounding both the embryonic materials and the data that are (or are not) supposed to be collected, researched, or observed, the similarities between the two seem to end there.

A hint, however, remains—in these similarities that are not similarities—that the nonhuman material and informational life, thought, reproduction, and gender that the previous chapters examined and situated historically might have more to say about data mining than they intuitively appear to. Slime and data mining evoke one another sufficiently well to provoke strangely identical language in newspapers such as the Guardian, and each is equally talented at evading the classic rhetorical traps of liberal democratic demonization: are they external or internal threats? Are they tools or are they actors in and of themselves? Given such similarities, one might even speculate that the only responsible way to address the democratic challenge that is the data hoard is to reframe it as a problem of nonhuman mass democracy—as a problem resolved via recourse to a politics of thought that takes slime as its norm.

One potential hazard of such a reframing is that the problems associated with surveillance will cease to seem problems at all—that the ire over rampant corporate and governmental data mining might be exposed as baseless or untenable. But this conclusion is not the only one available when rethinking mass surveillance in this way. In fact, a second potential result of introducing embryos, clones, and trash, on the one hand, to data hoarding, on the other, is that, having thought more carefully about what, narrowly, the problem with data hoarding is, more productive solutions to it may present themselves. Or, at the very least, it may be possible to reconfigure data mining as, itself, a democratic actor rather than as a threat to political engagement. This last result, it is true, will unquestionably dampen the ire. But it will simultaneously strengthen, rather than further assault, a preexisting, and quite functional, mass democracy—albeit a mass democracy that rests equally on life and thought. This alternative approach, therefore, is as beneficial as it is menacing.

Indeed, to push these claims even further, one could insist that the problem of data mining or data hoarding is a perfect case study of a threat that can be neutralized by the alternative mass democracy sketched over the previous chapters. Data mining is an issue, for example, that appears, at least, to be a product of recent technological innovation. But, at the same time, it just as clearly has antecedents in earlier democratic processes. It is an issue that seems to obliterate any intuitive link between identity or dignity, on the one hand, and embodied subjectivity, on the other. But, once again, it is also an issue that, thus far, seems frustratingly protean and immune to the liberal, human-centered tools that ordinarily regulate potential threats to political subjectivity. The case study in data mining that appears here, after all—the NSA’s collection of metadata and the rhetoric surrounding it1—has become a hyperbolic, or perhaps just tragicomic, example of how the legal doctrine and political theory of human-centered liberal democracy are completely unequipped to tackle the intuitive concerns that citizens have about this flow, and storage, of information and material.

As will become apparent, however, approaching data mining from the perspective of nonhuman democracy, as a problem of reproduction-as-thought, and thus as a problem, in particular, of gender, can help to redefine the process as a functional, nonthreatening aspect of political engagement. Excavating the gender operations underlying data mining as a system, accumulation, or assemblage, in other words, can redefine it as a key democratic activity rather than as an obstacle to democratic engagement. Moreover, rethinking mass surveillance in this way will also suggest to readers the possibility that problems such as data mining are problems of technological innovation only at their most superficial level. What these technologies are repeatedly demonstrating, indeed, is not that democracy is under attack by a new spate of antihuman technological monsters. On the contrary, they are showing that any democracy that cannot recognize the political potential of these technologies, not as tools but as actors—these technologies that have always, in some form or another, existed—is not, in any sense of the term, a proper democracy. These dysfunctions have always rested at the heart of human-centered political engagement—contemporary technological permutations simply make them more apparent.

The Threat

Data mining is ordinarily understood to be a threat to privacy. This threatened privacy, however, has little to do with the colloquial privacy that protects one individual from another individual’s interest in his or her affairs. Rather, the privacy that data mining or data hoarding attacks—especially to the extent that each is linked to government surveillance programs writ large—is the privacy underlying a citizen’s dignity and integrity. It is the privacy protected in the United States, for example, by the Fourth Amendment, and it is the privacy that serves as a framework for an array of conversations about topics as varied as abortion, bodily integrity, the appropriate use of search warrants, and the inviolability of domestic space.

That data mining is an issue of privacy, and that privacy lends itself equally well to discussions of domesticity and discussions of abortion—or reproduction—in the United States hints at its relevance to the problem that reproduction, broadly defined, poses to classic liberal democratic conventions. It is true, of course, that nearly every conversation about rights in even the most relentlessly communication- or human-centered state seems inevitably to end at reproduction—a situation that merely confirms the strength of Foucault’s original claim that modern liberal democracy, or modern democracy in any form, has always been biopolitical.2 But the relationship between data mining and reproduction—as well as between data mining and nonhuman reproduction-as-thought—is also more intricate than that.

Indeed, the tropes that appear in the traditional, human-centered, liberal discussion of data mining demonstrate that, even in this most conventional articulation, the problem with technological mass surveillance is, first, a problem of reproduction. Second, and almost equally important, it is a problem of reproductive trash—that is, a problem of things and information that are replicated, alive, and then (inappropriately) deemed waste, and (even more inappropriately) collected and viewed in Gabrys’s uncanny space of salvage. Third, it is a problem of systems rather than of bodies. And fourth—the hypothesis now underlying this chapter—it is a problem that can be resolved only by excavating the gender operations that underlie it. Even in the human-centered liberal writing on data mining—not to mention the NSA’s own internal interpretation of the policy—in other words, it becomes apparent that the tools that theories of nonhuman mass democracy make available to commentators are infinitely more effective, merely descriptively, than those that have so frequently failed human-centered democratic engagement. In short, the only privacy doctrine that resonates in an era of data hoarding is, arguably, a nonhuman privacy doctrine—a doctrine, moreover, that is strongly suggested even in the most mainstream legal writing on the topic.

In his early and influential 2004 study of surveillance at the turn of the twenty-first century, for example, Seth F. Kreimer wrote that recent, technologically sophisticated variations on surveillance were able, initially at least, to evade Fourth Amendment protections for two reasons. First, these surveillance practices were possible without any invasion of “physical spaces,” and, second, government agencies could legally request records from “private parties”—for example, “bank records from bankers, or telephone logs from telephone companies”—without “either probable cause or warrant.”3 Existing Fourth Amendment protections assumed, in other words, some variation on a purely physical—be it bodily or domestic—space that could be protected from violation, alongside a clear distinction between material that had been given away (or disposed of) to third parties and material that had been preserved. Information-based surveillance practices, Kreimer wrote, exploded both the fantasy of protected bodies inhabiting protected spaces and the fantasy of clear borders between what was waste and what was not.

Kreimer’s solution to this apparent weakness of Fourth Amendment protection in the face of technological surveillance was to shift focus away from bodies and subjects and toward information. More specifically, it was to suggest legal structures that might regulate the flow of the data themselves, rather than laws that might shape the intent, ethics, or politics of the humans involved in that flow. In particular, Kreimer argued that “the challenge is to prevent misuse of data, and the more selectively data is disseminated, the less likely it will be to be misused.”4 Secure databases already block access to data—or maintain a clear, detailed series of interactions between specific users and specific information—in multifaceted and sophisticated ways; there is every reason to believe, therefore, that, say,

a domestic security database could be constructed that allows general access, for example, to a subject’s address, but access to her gun ownership records only to one group of analysts, and access to her attendance at political rallies only to another select group. Other technologies could prevent analysts from exporting data from their computers to any other computer not similarly authorized, allowing privacy classifications to “stick to” the data as it is shared.5

In short, therefore, the solution to the problem of data mining in Kreimer’s early analysis is not to try to halt the collection or storage of data. Rather, it is to try to regulate the replication and transmission of data. Information naturally pools into databases, government agencies will naturally be drawn to these databases, and the best way to respond to this problem is to keep the information pooled rather than flowing—to store it indefinitely.
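Kreimer offers no implementation, but the tiered-access model quoted above can be rendered schematically. In the minimal Python sketch below, the record fields, group names, and tag vocabulary are invented for illustration rather than drawn from any actual system: privacy classifications "stick to" individual fields, and an analyst retrieves only those fields that his or her group is cleared to see.

```python
# A minimal, hypothetical sketch of Kreimer's field-level access model:
# a privacy tag "sticks to" each field, and a query returns only the
# fields whose tag matches one of the analyst's authorizations.

RECORD = {
    # field: (privacy tag, value) -- all entries invented for illustration
    "address": ("general", "123 Main St."),
    "gun_ownership": ("firearms_analysts", "2 registered handguns"),
    "rally_attendance": ("political_analysts", "3 rallies attended"),
}

def query(record, analyst_groups):
    """Return only the fields this analyst's groups are cleared to see."""
    return {
        field: value
        for field, (tag, value) in record.items()
        if tag == "general" or tag in analyst_groups
    }

# One select group sees gun records; rally attendance stays hidden from it.
print(query(RECORD, {"firearms_analysts"}))
# -> {'address': '123 Main St.', 'gun_ownership': '2 registered handguns'}
```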

Indeed, a second procedure that might reinforce such a regulatory structure, Kreimer argues, is the institution of automatic “audit trails of both queries and dissemination.”6 Such trails would, Kreimer hints, effectively make replication and dissemination into a linear process—and thus the seemingly mindless and dangerous nonlinear proliferation of copies would be transformed into a coherent, step-by-step narrative of sharing, replete with actors and actions leading back to an originating source. “‘Sticky’ data tags which lay down trails as the data is disseminated,” in particular, would transform what were identical copies into not-quite-copies—bits of information that, again, tell a specific story of movement, of human communication, and of message transmission.7 But this regulation would, Kreimer argues, be infinitely more effective in maintaining the privacy of liberal citizens than an attempt to halt the collection of data in the first place. As he writes near the end of the article, “In today’s environment, ex ante judicial control of surveillance is unlikely. One response lies in strengthening legal doctrines that exert ex post control against abuse of information obtained by surveillance.”8
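Again, Kreimer specifies no mechanism, but the joint logic of audit trails and "sticky" tags can be sketched: each act of dissemination yields a copy that carries an extended record of its own movement, so that formally identical copies become the not-quite-copies described above. The class and field names in this Python sketch are illustrative assumptions, not Kreimer's specification.

```python
import copy
from datetime import datetime, timezone

class TaggedDatum:
    """A datum whose provenance trail 'sticks to' every copy made of it."""

    def __init__(self, value):
        self.value = value
        self.trail = []  # ordered log of (sender, receiver, timestamp) hops

    def disseminate(self, sender, receiver):
        """Sharing means copying, and each copy extends its own trail."""
        duplicate = copy.deepcopy(self)
        duplicate.trail.append((sender, receiver, datetime.now(timezone.utc)))
        return duplicate

original = TaggedDatum("subscriber phone log, March")
first_copy = original.disseminate("telecom_db", "analyst_a")
second_copy = first_copy.disseminate("analyst_a", "analyst_b")
# second_copy.trail now narrates a linear story of transmission:
# telecom_db -> analyst_a -> analyst_b, each hop timestamped for audit.
```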

Three years later, Christopher Slobogin explored the same seeming weaknesses of Fourth Amendment protections, now, though, in the context of a relatively lively scholarly conversation about how to address these problems. In a paper delivered at a conference on surveillance at the University of Chicago Law School in June 2007, Slobogin first described the gradual extension of data-mining operations, especially those aimed at American citizens, during the early twenty-first century.9 From there, he provided examples of the types of information collected by various government programs—information about Internet searches, information about political activity or activism, information derived from the intersection of government and private sector databases, and information in the form of metadata, among others.10 Finally, Slobogin framed his argument around the same broad question that motivated Kreimer’s work—namely, whether the Fourth Amendment can and should be brought into play to regulate or curb data mining.

Slobogin addresses this question from a slightly shifted perspective, however. Rather than assuming a uniform quality to all data or information—and then recommending various techniques to manage this monolithic pool—Slobogin instead argues that the primary work of Fourth Amendment legislation should be to distinguish among the different types of information that might be collected or distributed. “A careful look at data mining,” he writes, “suggests that many versions of it should be subject only to minimal regulation, while other versions ought to be subject to significant constitutionally-based restrictions, whether controlled solely by the government or reliant on private entities for information.”11 More specifically, he recommends that agencies be required to provide “the highest degree of justification” when the data they collect are “private in nature and sought in connection with investigation of a particular target.”12 “Impersonal or anonymized records” or information “sought in an effort to identify a perpetrator of a past or future event,” contrarily, would not require such a stringent policy of prior permission.13
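Slobogin's taxonomy amounts, in effect, to a lookup from the character of the data sought to the justification an agency must supply. The Python sketch below paraphrases the categories quoted above; the exact pairings are an illustrative reconstruction rather than a table of Slobogin's own.

```python
# A schematic rendering of Slobogin's proposed taxonomy: the showing an
# agency must make depends on the nature of the data and the purpose of
# the search. The mapping paraphrases the quoted distinctions and is an
# illustrative reconstruction, not Slobogin's own table.

JUSTIFICATION_REQUIRED = {
    ("private", "target-driven"): "highest degree of justification",
    ("private", "event-driven"): "reasonable suspicion (possibly group-wide)",
    ("anonymized", "target-driven"): "minimal regulation",
    ("anonymized", "event-driven"): "minimal regulation",
}

def required_showing(nature, purpose):
    """Look up the evidentiary threshold for a proposed data-mining query."""
    return JUSTIFICATION_REQUIRED[(nature, purpose)]

print(required_showing("private", "event-driven"))
# -> 'reasonable suspicion (possibly group-wide)'
```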

The collection of metadata—in this case, the NSA’s accumulation of phone records, but not the content of phone calls, to “conduct ‘link analysis’”14—is thus both a particular threat to current Fourth Amendment protections and, according to Slobogin, in particular need of this data-cognizant approach. Drawing on the responses of participants to survey questions, Slobogin writes that citizens are “much more leery of this type of data mining, ranking it as more intrusive than an ID check, whether aimed at multiple record sets . . . or only one,” and that if “this finding accurately represents societal views . . . event-driven data mining of private records should occur only if reasonable suspicion exists.”15 Slobogin continues that the problem with regulating the collection of metadata in this way is that the huge scale of such data collection would make “‘individualized’ reasonable suspicion” difficult to demonstrate.16 The government might consequently, he argues, be required to “demonstrate ‘generalized’ or group-wide suspicion” when seeking to engage in this sort of analysis.17

Whereas Kreimer seeks to protect privacy by managing the replication, storage, and transmission of a uniform, preexisting pool of data, Slobogin, drawing on the intuitive responses of individual survey participants, seeks to protect this same privacy by managing the definition and quality of not-yet-collected data. Slobogin believes that a preexisting taxonomy of information—some “private in nature,” some “impersonal or anonymized,” some relevant to the “individual” and some relevant to “groups”—will help to regulate the capacity of government agencies to access information.18 Although the large-scale pattern recognition that the mining of metadata is supposed to facilitate frightens survey participants and evades any recourse to classic interpretations of “reasonable suspicion,” Slobogin maintains that his proposal is nonetheless valid. Indeed, by defining metadata as simultaneously “private in nature” rather than “anonymized,” and “generalized” rather than “individualized”—by defining it, in other words, as specifically subject to a theory of privacy in the absence of the individual—agencies can still be held accountable for its collection.

Jack Balkin has also written extensively on the constitutional issues raised by surveillance in general and data mining in particular. Two of his many essays, however—the first published in 2008 and the second in 2011—can help in tracing the evolution of his thinking. In the first, “The Constitution in the National Surveillance State,” Balkin situates data mining within a more general history of the modern mass democratic state. What he calls “the National Surveillance State,”19 he writes, is a natural extension of the “Welfare State,” which “created a huge demand for data processing technologies to identify individuals.”20 Given this relationship between the information processing that, many would argue, benefits citizens and the information processing that, many are arguing today, threatens citizens, Balkin implies that perhaps it is not intellectually all that useful to address data mining as an unquestioned menace to be managed, regulated, or blocked. Rather, he argues, political theorists need to ask what sort of policies—and, more broadly, what type of states—are making use of this information.

Whereas Kreimer focuses on the systems that might limit the transmission and copying of preexisting data pools, therefore, and whereas Slobogin focuses on the character of the data that will be collected, Balkin focuses on the type of state that will benefit from data that are not only already collected but also already in transmission. Contrasting what he describes as democratic and nondemocratic states, Balkin argues that whatever a democratic state does with data is legitimate—and that, in fact, a state’s illegitimate use of data is an excellent indicator that it is not democratic. There is, therefore, not a great deal that lawmakers can do in Balkin’s analysis to manage or regulate the work of data or the operation of information systems—but analyzing this work and operation can, importantly, help lawmakers to identify those aspects of governance that may not reflect the democratic ideals of whatever state is benefiting from them.

This seemingly circular argument is not as paralyzing as it might appear to be. Although it does make “a traditional system of warrants” less likely to be effective, for example, it also allows for simultaneous “prior” and “subsequent” analysis of data collection activities—and even potential, if incomplete, management of information.21 This ability to analyze the potential threats posed by data mining in a more fluid and holistic—if less doctrinally satisfying—way is indeed vital, Balkin suggests, “as surveillance practices shift from operations targeted at individual suspected persons to surveillance programs that do not begin with identified individuals and focus on matching and discovering patterns based on the analysis of large amounts of data and contact information.”22

Perhaps even more useful to theorizing the relationship between data mining and democracy writ large, though—and also an entry point into Balkin’s later, explicitly data-centric interpretation of surveillance practices—is the recognition that determining the legitimacy of data collection with reference to the type of government it serves can help to pinpoint key characteristics of highly technological democracies (and nondemocracies). It can help commentators to understand how contemporary democracy operates in an empirical and dynamic way. In a telling metaphor, for example, Balkin describes the nondemocratic users of data as “gluttons” and the democratic users of data as “gourmets.” He writes that

democratic information states are information gourmets and information philanthropists. Like gourmets they collect and collate only the information they need to ensure efficient government and national security. They do not keep tabs on citizens without justifiable reasons; they create a regular system of checks and procedures to avoid abuse. They stop collecting information when it is no longer needed and they discard information at regular intervals to protect privacy. When it is impossible or impractical to destroy information—for example, because it is stored redundantly in many different locations—democratic information states strictly regulate its subsequent use. If the information state is unable to forget, it is imperative that it be able to forgive.23

The eating, storage, and waste metaphors on which Balkin relies in order to make his point here are revealing, evocative not just of Parisi’s scholarship, but also of Leuckart’s laboratory science. Although it is more polite, for example, to describe the eventual disposal or distribution of the information that has been eaten as “philanthropic,” it is perhaps more accurate, physiologically at least, to describe the traditional results of eating as “waste” or “defecation.” Regardless of whether eating takes the form of gluttonish or gourmet sampling—regardless of whether its ingestion is democratic or not—that is to say, the by-product (or product) is the same, and it ends up, most likely, in, say, Şakir’s highly productive sewer. This point is indeed brought home when Balkin writes that even democratic states may not be able to forget information, that sampling cannot be undone, and that, implicitly, disposal is thus simply another type of flow or storage.

Or, put differently, Balkin’s eating metaphor leads to an interpretation of data mining and democracy that very much highlights not just the redundancy of individual citizens to contemporary theories of privacy, but also the redundancy of bodies to the “bodily” function that is frequently at the basis of privacy doctrine. Whereas Kreimer’s work posits that privacy might best be protected by focusing on data rather than on people, and whereas Slobogin’s work suggests that privacy can easily operate without reference to individuals, Balkin’s early work on surveillance hints that when data mining becomes a means of determining the democratic quality (or lack thereof) of a given technological state, privacy is best understood as an issue of bodily function—eating, waste, reproduction—without bodies. That the normative category of the human might also disappear from privacy doctrine goes without saying—and it is the starting point of Balkin’s later work on data mining.

In his 2011 essay “Information Power,” for example, Balkin writes that it is not only useful, but imperative, to recognize the nonhuman operation of “globalized information networks”—networks that do incorporate people but that “are controlled by no one in particular.”24 Framing his discussion within three influential posthuman categories of analysis—drawing on what he calls “the memetic model, the Gaia model, and the proliferation of power model”25—Balkin argues that if scholars address problems such as privacy and data mining from the “point of view” of nonhuman actants such as memes, the whole earth, or disciplinary power networks, they will appreciate not only the complexity of such problems but also why they continue to elude classic legal remedies predicated on the protection of individual (human) rights.26

Memes, for example—defined here as “bits of information that replicate themselves in human minds and in human created methods of information storage and retrieval”—upset liberal, individual rights-based methods of interpretation, Balkin writes, in large part because their existence not only allows for, but demands, the instrumentalization of humans.27 Memes—“like genes,” if only “metaphorical[ly]” so—survive via replication and reproduction, and humans are the “host,” “platform,” or “means to [that] end.”28 Moreover, memes even reconfigure the apparently fundamental liberal right to speech—that right which seems so unique to the rational, embodied human subject—into a simple mode of reproduction. As Balkin puts it,

All communication on the Internet occurs through copying, which is how memes reproduce. If cultural reproduction is a meme’s version of sex, then the Internet is just one big orgy, an endless informational bacchanal. The Internet copies information from everywhere and then transmits it in redundant copies to millions of places around the world. From a meme’s perspective, the Internet is not a great achievement of human liberty. It is the most powerful technology yet devised for memes to reproduce themselves in perpetuity. The glut of information produced by the Internet leads to increasingly powerful technologies of search and retrieval—like search engines—that become central to the network because they lower the costs of finding information. These new search and retrieval technologies, in turn, produce and propagate vast amounts of metadata—information about information—thus spewing ever more memes into the global information environment.29

Balkin, in other words, is now supplementing his eating metaphor from the 2008 article with a reproduction—and biological or genetic reproduction—metaphor. In each case, the point to be made is that the proliferation—and then mining, replication, or storage—of data is a legal problem or a problem of privacy that can be addressed only by recognizing its strange relevance to material, yet disembodied, “bodily” function, its simultaneity with eating and reproduction, or with eating as reproduction. Indeed, although Balkin insists on the metaphorical quality of this mode of interpretation, his reason for doing so is not that information operates separately from organic or inorganic life. Rather, it is that he does not want his readers to assume that memes are rational or goal oriented (like genes, presumably)—he does not want them to conclude that memes consciously seek to use humans as a means to an end. Instead, memes reproduce in this way—and the category of privacy is thereby reconfigured in this way—because memes are part of a living, if not conscious, system.30

That this system is also thoughtful—and, in fact, that reproductive life is thought according to Balkin’s analysis—becomes overt in the second framing device on which Balkin relies. The “Gaia model,” he writes, also assumes that humans are (rightly) tools rather than an end in themselves.31 Here, though, humans are not a platform for reproductive activity; rather, they are “information processing nodes in a developing nervous system”32 that might lead “ultimately to a ‘global brain.’”33 Indeed, the proliferation of data throughout these global networks is continually moving the world from a system of “relatively primitive forms of ecological feedback and information exchange to an ever more complex and sophisticated system of information flows and information potentials.”34 With specific reference to the problem of surveillance or, more pointedly, data mining, therefore, Balkin writes that it is a perhaps natural—if not beneficial—outgrowth of an “emerging world” in which “we are not necessarily the central characters.”35 We might interpret a situation in which all humans “are continually tracked, traced, and monitored” as a situation in which “the world,” as the key thinking actor, “is becoming increasingly ‘aware’ of what is happening within it.” As both things and data replicate, reproduce, get stored, and get moved, the global system increasingly lives as thinking.36

Balkin’s 2011 essay, then, is a refreshing take on the problem that data mining poses (or does not pose) to privacy. By pushing Kreimer’s and Slobogin’s emphasis on data, rather than people, to a logical conclusion, Balkin makes a compelling case that surveillance and data mining are not only badly served by purely human-centered, individual rights-based frameworks of inquiry, and not only infinitely more ethically complex than the simplistic work bemoaning the end of privacy makes them out to be, but also quite open to materialist analyses predicated on the coming together of disembodied, systemic life, reproduction, and thought.37 The essay is an effective antidote to the humanist scholarship that is both empirically suspect and, it seems, missing the point of data collection.

At the same time, however, it ought to be emphasized that, as “antihumanist”—in the sense that it takes the instrumentalization of the human as a situation to be discussed and analyzed rather than immediately condemned—as Balkin’s essay is, it is also relentlessly human centered. Balkin’s concern in the essay is still how issues such as data mining will affect humans—especially when humans are resituated in these new antihumanist frameworks. Moreover, even as Balkin strips the negative moral value away from the instrumentalization of the human, he retains and implicitly celebrates the human as the most important tool in these data-centric systems. It is “our” thought, speech, and communication that facilitate the replication of the meme, it is “our” role as nodes that drives the world from a “primitive” ecological system toward a developed informational system, and it is “our” subject positions that are formed via disciplinary power relations.

Indeed, the central role played by the human in this reframing becomes particularly clear when examining the vocabulary—taking human thought as its touchstone—that Balkin mobilizes in describing his three posthuman models. Even as he reminds his readers not to assume that memes are, say, goal oriented, for example, he nonetheless asks us to accept that memes have a “point of view”—that there is some specific focus or way of thinking analogous to human focus or thinking. Likewise, even as he wants readers to consider the world as a thinking system, he repeatedly describes this thought as aware and self-conscious—the product, in fact, of a humanesque “brain.” And finally, the repeated references to “feedback”—that process of input whose analytical value Parisi has questioned in a different context as a necessarily human, rational process38—demonstrate the extent to which Balkin’s antihumanist writing on data and information is nonetheless still a product of human-centered logic.

The purpose of highlighting the perhaps unexpectedly human-centered quality of Balkin’s explicitly antihumanist take on information is not to insinuate that his work is therefore weak. Balkin’s primary interlocutors are law scholars, and it makes sense that he would adhere to the human-centered conventions of legal scholarship even while emphasizing the potential value of asking antihumanist questions. As noted before, Balkin’s work—operating in the same vein as Kreimer’s and Slobogin’s, even while pushing this writing to one of its logical conclusions—demonstrates the elegance with which data mining, surveillance, and privacy might be treated as problems of systems, networks, and environments rather than as problems of embodied, rational individuals. It makes clear that there is a place for nonhuman politics—not to mention nonhuman life, thought, and reproduction—even in the most classic of scholarly conversations.

Balkin’s refusal to bracket the human—even if momentarily—as the central figure, if not the central agent, master, or actor, in his discussion of data in general, and data mining more specifically, however, also leads him to set aside some important implications of shifting the conversation about information and privacy in this way. Just as Kreimer wants, via sticky data tags and audits, to transform field-wide informational replication or reproduction into linear, humanesque reproduction resting on the discrete, unitary transmission of messages, and just as Slobogin wants to domesticate data into a story of “reasonable suspicion,” Balkin wants to transform the thought of these informational, material, and environmental systems and fields into self-aware, conscious, rational human thought. Even as all make quite clear that data mining is a problem of data, and even as each in different ways highlights the centrality of reproduction and thought, therefore, each also brackets the potential solutions to the problem, such as it is, that data mining poses to democracy. Even as all contribute to a fascinating new theory of privacy without humans, without individual subjects, and without bodies—a theory of privacy as reproduction without people—each also chooses not to contextualize this theory within a reconfiguration of the empirical case study that gave rise to it.

After all, even though all of these analyses note that classic Fourth Amendment protections are insufficient in an era of data mining because the Fourth Amendment cannot cope with the movement of informational trash (the movement, for example, of data provided to third parties—which is thus, as Fuller and Goffey write, both thrown away and vital), none proposes an alternative approach to such waste. Kreimer and Slobogin continue to insist that it is possible to protect vital information from becoming waste either by managing its dissemination (in the case of Kreimer) or by regulating its collection (in the case of Slobogin). Balkin replaces what is perhaps most properly called “waste” with “philanthropy” or “awareness.” No one, it seems, is willing to address reproductive, or replicated, waste as, in fact, garbage.

If, though, the human does disappear from this work that, again, already highlights the nonhuman, disembodied, and environmental reproduction, thought, and life at the center of data mining, the democratic potential of the trash produced by it becomes suddenly apparent. Indeed, each of these discussions serves as an excellent departure point for moving away from solutions that simply rehash—albeit in radical ways—preexisting, human-centered policies that seek to manage the collection and transmission of messages (a solution that, in any case, leaves aside contentless metadata altogether). Each makes it possible to look for solutions that take the operation of data seriously in its own right.

Moreover, and perhaps most pointedly, each also provides a framework for addressing the gender operations that underlie data collection—a framework that can help to transform this seemingly devastating assault on privacy rights and human-centered citizenship into a foundation for functional democratic engagement. The next section of this chapter draws directly on NSA documents concerning data mining in order to test the validity of these alternative approaches. After a detailed history of a number of key NSA data collection practices developed since the early 2000s, the section rereads these practices in light of the nonhuman gender operating throughout nonhuman, thought- and life-based democracy. In doing so, it provides a new and productive interpretation of data, surveillance, gender, and democracy.

The Resolution

When the NSA’s surveillance practices entered popular, policy, and media conversations in mid-2013, an issue that particularly concerned reporters and commentators was the collection of metadata—information about information—on a mass scale, ostensibly in aid of security-supporting pattern recognition. That the accumulation of metadata was especially suspect is intriguing given that such information has little to do with communication per se. When analyzing metadata, the content of telephonic or electronic messages is ignored in favor of email addresses, telephone numbers, contacts, and IP addresses. Nonetheless, as Slobogin writes, this type of undifferentiated, mass collection of contentless or message-less data frightens many observers even more than targeted surveillance of electronic or telephonic conversations does. It might be worth asking why.

One reason—as Slobogin also hints—that the mass collection of metadata is a source of worry in a way that other surveillance practices are less so is, once again, that it seems to elude Fourth Amendment privacy protections. Perhaps more pressing, though, a second reason that it is a source of concern is that the mining of metadata is necessarily without limits. Anything and everything—especially to the extent that it has no clear content—is potentially relevant to the patterns that emerge from the analysis of the links and contacts that facilitate systems. Even aside from these issues, however, there is arguably something else about the collection of metadata that has made it so worrisome to liberal democratic commentators. The mining of metadata conjures up the existence of a nonhuman, nonliberal mass democracy that threatens to overwhelm classic, human-centered democracy. This is a nonhuman democracy that is not only not threatened by data mining, but that functions through such informational operations—a democracy that appreciates the reproductive value of data and that, therefore, recognizes the centrality of gender to political engagement writ large.

A review of many of the NSA’s reports and memos concerning the value and potential pitfalls of working with metadata in fact suggests that what makes metadata attractive to security agencies is specifically these data’s gender—even as it is likewise the gender of this stored, replicating waste that worries Slobogin’s liberal democratic survey participants (as much as it did Minot and his scientific contemporaries). Metadata are functional rather than communicative—their role is to move, transfer, copy, or store messages across systems, accumulations, or fields rather than to be messages in and of themselves. Although they are functional and do work, however, they are, once again, by definition waste or trash according to classic interpretations of political engagement that understand democratic politics to happen as rational embodied subjects identify and recognize one another through meaningful dialogue.

Moreover, not only are metadata trash by virtue of lacking content, but by the time an NSA algorithm encounters them, these metadata have apparently finished their work: they have sent the message—and they are thus waste even in the operational sense of the term. The NSA algorithm, in short, is collecting metadata as, once again, specifically, trash—trash, however, that will then live, think, and reproduce when it is set loose in a new informational environment. Or, as Fuller and Goffey put it, “The seduction of data mining is that of finding exploitable patterns in vast quantities of data, modelling probabilities, predicting trends, anticipating next moves, extracting ‘truth from trash.’”39 What the documents addressing metadata make clear, therefore, is that as this revitalized trash is copied, replicated, and stored, it is, according to the NSA’s own understanding of its labor, engaging explicitly in the same series of gender operations that animated Buffon’s, Minot’s, and indeed the state of Louisiana’s biopolitical worlds.

In a 2007 memo requesting permission to extend the collection of metadata to email and telephone connections in the United States, for example, the NSA provides a number of clues as to the gendered work that this information performs. Noting that the “communications metadata” that had already been collected—“pursuant to the Foreign Intelligence Surveillance Act (FISA)”—now existed in “NSA databases,” the memo’s drafters argue that they are hampered in using the information as effectively as they might because their “present practice [is] to ‘stop’ when a chain hits a telephone number or address believed to be used by a United States person.”40 Unable, by law and custom, to modify their “contact chaining” algorithm to “chain through all telephone numbers and addresses, including those reasonably believed to be used by a United States person,” they write, they are thus unable to draw from their data “valuable foreign intelligence information primarily concerning non–United States persons outside the United States.”41 The drafters of the memo thus request that this obstacle to their analysis be removed.
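The memo does not disclose the algorithm itself, but "contact chaining" under the pre-2007 stop rule can be sketched as a breadth-first traversal of a contact graph that refuses to pass through any identifier believed to belong to a United States person. Everything in the following Python sketch (the graph, the identifiers, the hop limit) is invented for illustration.

```python
from collections import deque

CONTACTS = {  # invented contact graph: identifier -> observed contacts
    "seed@example.org": ["a@example.org", "b@example.com"],
    "a@example.org": ["c@example.net"],
    "b@example.com": ["d@example.org"],
    "c@example.net": [],
    "d@example.org": [],
}
US_PERSON = {"b@example.com"}  # the chain must "stop" at these identifiers

def contact_chain(seed, max_hops):
    """Breadth-first chaining that halts at US-person identifiers."""
    chained, frontier = {seed}, deque([(seed, 0)])
    while frontier:
        identifier, hops = frontier.popleft()
        if hops == max_hops or identifier in US_PERSON:
            continue  # the chain stops here; do not chain *through* this node
        for contact in CONTACTS.get(identifier, []):
            if contact not in chained:
                chained.add(contact)
                frontier.append((contact, hops + 1))
    return chained

# d@example.org is never reached, because the only route to it runs through
# a US-person identifier: precisely the restriction the memo asked to remove.
print(contact_chain("seed@example.org", max_hops=3))
```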

To bolster their argument that the benefit of chaining through United States contacts outweighs the costs—and is in any case not a threat to the right to privacy42—the drafters continue that courts have “considered e-mails to be analogous to telephone calls and to letters sent through the postal system.”43 What this means, they posit, is that “the Fourth Amendment is not implicated when the Government gathers information that appears on mail covers” and thus in the contact sections of email messages.44 Or, more pointedly, “contact chaining and other metadata analysis” are not, they state, identical to the privacy-damaging “‘interception’ or ‘selection’ of communications” that the Fourth Amendment is supposed to regulate.45 As of 2007, therefore, NSA agents had permission to allow their algorithms to work through data collected inside the United States.

In a report titled “Bulk Collection Programs” sent to the Department of Justice two years later, in 2009,46 the NSA expanded on the nature and work of (as well as the ongoing challenges to) its metadata analysis. The drafters of this report described, once again, how they believed that the analysis of metadata remained in compliance with both the Fourth Amendment and FISA,47 but this time they also addressed weaknesses in their data collection programs that might (erroneously, they argued) appear to be violating the rights of U.S. citizens. Repeating that bulk data collection programs were not authorized to accumulate “the content of the calls or e-mail messages” that they targeted, the report goes on to state that the programs are also “subject to an extensive regime of internal checks.”48 Moreover, the drafters of the report continue, “Although the programs collect a large amount of information, the vast majority of that information is never reviewed by anyone in the government, because the information is not responsive to the limited queries that are authorized for intelligence purposes.”49

The problem, however, according to this report (and its 2011 reissue), was that “Department of Justice reviews” and “internal NSA oversight” had nonetheless discovered “a number of technical compliance problems and human implementation errors” in the execution of these programs.50 In particular, “the automated tools” that perform the majority of the analysis sometimes “operat[ed] in a manner that was not completely consistent with the specific terms of the Court’s orders.”51 The NSA thus created “a new position, the Director of Compliance, to help ensure the integrity of future collection.”52 Once again, the collection of metadata on a large scale continued without internal government opposition.

At the same time, a second 2009 statement that the NSA made to the Foreign Intelligence Surveillance Court (FISC) went into more detail about the procedures implemented to prevent misuse of the system—and in particular to prevent “automated processes and tools from querying the BR [bulk records] metadata inappropriately.”53 One of the most effective obstacles to such inappropriate operation, the statement reads, was the introduction of “a software restrictive measure” called “Emphatic Access Restriction (EAR)” that blocks tools from accessing metadata “with anything but a RAS [reasonable articulable suspicion]-approved identifier.”54 Moreover, whereas “the beta version and prior versions” of EAR “contained [a] feature that gave analysts contact information that normally is available only on an unauthorized fourth hop [i.e., a fourth link in a contact network/chain that can produce hundreds of thousands of new links]55 from a RAS-approved identifier,” the 2009 version was “corrected to disable the feature for last-hop identifiers.”56 Here, then, although there is no contraction of the metadata collection programs, the algorithms that operate through them are becoming more sophisticated and more seemingly compliant with intuitive human-centered concerns about privacy.
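The statement describes EAR only in outline, but a "software restrictive measure" of this kind presumably amounts to a gate in front of the query interface: it refuses seed identifiers that lack RAS approval, caps hop depth, and withholds contact detail on the final authorized hop. The Python sketch below is a guess at that shape under those assumptions; the names, thresholds, and return format are all hypothetical.

```python
RAS_APPROVED = {"+15550100", "+15550101"}  # invented identifiers
MAX_HOPS = 3  # the "fourth hop" is the unauthorized one in the statement

def ear_gate(seed, requested_hops):
    """Hypothetical sketch of an EAR-style restriction on metadata queries."""
    if seed not in RAS_APPROVED:
        raise PermissionError(f"{seed} is not a RAS-approved identifier")
    if requested_hops > MAX_HOPS:
        raise PermissionError(f"{requested_hops} hops exceeds the {MAX_HOPS}-hop limit")
    return {
        "seed": seed,
        "hops": requested_hops,
        # mirror the corrected beta behavior: no contact detail is
        # returned for identifiers reached on the last authorized hop
        "last_hop_contact_detail": False,
    }

print(ear_gate("+15550100", 3))  # permitted
# ear_gate("+15550199", 2) would raise PermissionError: not RAS approved
```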

Indeed, in a declaration included in the statement, Keith B. Alexander, the NSA director, stated further that although there have been several incidents of “non-compliance” or error, the NSA has worked to address each incident that has come to its attention. When it became clear, for example, that “all of the telephone identifiers” that had been added to “the alert list” were not “supported by facts giving rise [to] a reasonable articulable suspicion”—and that, indeed, “the majority of telephone identifiers included on the alert list had not been RAS approved”—the “Telephony Activity Detection Process was turned off,” and then only “restarted . . . without the use of metadata [thus incorrectly] obtained.”57 Likewise, whenever (human) analysts “inadvertently selected an incorrect option which put [a] domestic identifier in the large list of foreign identifiers,” they were subjected to “additional guidance and training” by oversight committees.58

Finally, system-relevant errors—for example, the gradual incorporation of “non-user specific numbers that [we]re deemed to be of little analytic value and that strain[ed] the system’s capacity and decrease[d] its performance”59—were dealt with via additional programming. NSA “engineers,” in particular, “developed a ‘defeat list’” that would remove such numbers from the database and that would serve as a receptacle for data incorrectly collected.60 A combination of algorithmic, human, and systemic malfunctions, then—all of which had to do with excessive information input of one type or another—became the platform for even more sophisticated programming and training.
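On this description, a "defeat list" is essentially an ingest filter: numbers on the list are diverted into a receptacle rather than the analytic database. A minimal Python sketch, with invented entries and an invented stream:

```python
# Numbers deemed non-user-specific and capacity-straining are routed to a
# receptacle instead of the database. List entries and stream are invented.
DEFEAT_LIST = {"0000000000", "8005551212"}

def ingest(stream, database, receptacle):
    """Divert defeat-listed numbers; store everything else for analysis."""
    for number in stream:
        (receptacle if number in DEFEAT_LIST else database).append(number)

database, receptacle = [], []
ingest(["2125550147", "0000000000", "8005551212", "3125550190"],
       database, receptacle)
print(database)    # ['2125550147', '3125550190']
print(receptacle)  # ['0000000000', '8005551212']
```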

Three years later, in 2012, an internal memo described the potential value of obtaining a different sort of metadata—namely, information collected from mobile phones that appear to be traveling alongside other mobile phones already targeted for investigation. The drafters of the memo limit their evaluation of “co-traveler” algorithms “to two or more locations within an analyst-specified time and space window.”61 They also note, however, that even within this limited framework of analysis, there is something of a disjuncture between data on movement and data on location. Or, as they put it, “Analytics that detect co-location may be different in nature from those that detect co-travel,” and therefore, “The specific analytic need will define which of these approaches is more appropriate and efficient.”62 Having determined hypothetically that analyzing cotravel is the best option, however, an analyst might benefit from the cotraveler algorithms that the memo is evaluating. In their most basic manifestation, these algorithms “compute the date, time, and network location of a mobile phone over a given time period,” and then they “look for other mobile phones that were seen in the same network locations around a one hour time window.”63

Algorithms of this sort, the memo continues, are already able to “chain ‘from,’ ‘through,’ or ‘to’ communications metadata fields without regard to the nationality or location of the communicants.”64 But, the drafters of the memo also note, there are still some obstacles to their work, and it could easily become more efficient. In particular, the memo’s drafters write, these algorithms would benefit from having access to “an index containing selectors whose tracks are near each other in space,” and they would also become more effective if they operated alongside a “GEOAddress hashing algorithm” that describes movement and placement via “LAT/LONG information.”65 Indeed, a running theme throughout the memo is the potential benefits that might accrue from mapping the patterns that emerge via the analysis of communications metadata onto the patterns that emerge via the analysis of spatial or locational metadata—and then from distilling these patterns to a small collection of data points. Even sophisticated algorithms that identify cotravelers and work with “spatial chaining software [that] aggregates and presents the meeting data,”66 the drafters insist, could operate more effectively in an environment that does not differentiate between space and contact.67 Or, as the memo concludes, analysis of metadata of this sort demands an awareness of information that is not necessarily unique to “signals intelligence” collections—such as “the locations of highways and roads”—operating “on a variety of different source data formats,” and “exploit[ing] divergent data sources to develop more complete pictures of target travel behavior.”68
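The "most basic manifestation" of the analytic can be sketched directly from the memo's description: bucket each phone's sightings by network location and one-hour window, then count how many windows another phone shares with the target. In the Python sketch below the sightings and identifiers are invented; only the one-hour window and the "two or more locations" threshold follow the memo.

```python
from collections import defaultdict

SIGHTINGS = [  # invented records: (phone, cell_id, hour-granularity timestamp)
    ("target", "cell_17", 400), ("target", "cell_08", 401),
    ("phone_a", "cell_17", 400), ("phone_a", "cell_08", 401),
    ("phone_b", "cell_17", 400), ("phone_c", "cell_99", 400),
]

def cotravelers(target, sightings, min_shared=2):
    """Phones seen in min_shared or more of the target's (cell, hour) windows."""
    windows = defaultdict(set)  # (cell, hour) -> set of phones seen there
    for phone, cell, hour in sightings:
        windows[(cell, hour)].add(phone)
    shared = defaultdict(int)
    for phones in windows.values():
        if target in phones:
            for phone in phones - {target}:
                shared[phone] += 1
    return {phone for phone, count in shared.items() if count >= min_shared}

# phone_a co-occurs with the target in two windows; phone_b in only one.
print(cotravelers("target", SIGHTINGS))  # -> {'phone_a'}
```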

A final document, of a different genre, will round out this impressionistic survey of the texts surrounding the NSA’s data-mining practices in the early twenty-first century. In late 2013, following the publicizing of many of the NSA’s data collection and data storage techniques, U.S. president Barack Obama created a Review Group on Intelligence and Communications Technologies.69 This committee produced a document titled “Liberty and Security in a Changing World,” which, although not specifically aimed at a public audience, was nonetheless more media-friendly than the internal memos and reports that had been made public the previous summer. In addition to contextualizing the NSA’s data collection activities within a broader history of surveillance and privacy in the United States from the mid-1970s to 2013, the committee also suggested to the president ways of integrating ongoing data-collection techniques into what they argued were more appropriate interpretations of privacy, “liberty,” and “security.”70

Among the most pressing of these recommendations were reforms to the collection and storage of “bulk meta-data.”71 Metadata, the committee suggests, should no longer be stored by the government but instead be “held privately for the government to query when necessary for national security purposes.”72 Private storage, the report continues, would allow the data to remain available should analysis become politically necessary, but it would also force government agencies to demonstrate need before they accessed data. At the same time, the government would “not be permitted [any longer] to collect and store mass, undigested, non-public personal information about US persons for the purpose of enabling future queries and data-mining for foreign intelligence purposes.”73 The committee explains the logic of this recommendation by noting that after five years, “bulk telephony meta-data” is already “purged automatically from the NSA’s systems on a rolling basis,” and “in 201174 NSA abandoned a similar meta-data program for Internet communications.”75
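The rolling purge the report invokes is simple to sketch: records older than the retention window are deleted automatically as time advances. In the Python sketch below the record format and purge cadence are assumptions; only the five-year window comes from the report.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=5 * 365)  # the report's five-year window

def rolling_purge(records, now=None):
    """Keep only records collected within the retention window."""
    now = now or datetime.now(timezone.utc)
    return [r for r in records if now - r["collected"] <= RETENTION]

records = [
    {"id": 1, "collected": datetime(2008, 3, 1, tzinfo=timezone.utc)},
    {"id": 2, "collected": datetime(2013, 6, 1, tzinfo=timezone.utc)},
]
# As of December 2013, record 1 has aged past the window and is purged.
print(rolling_purge(records, now=datetime(2013, 12, 1, tzinfo=timezone.utc)))
```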

Finally, the committee recommends significantly limiting human contact with metadata,76 even while allowing their continued storage and processing—and while also suggesting that security agencies focus on collecting the content, via traditional warrants, of communications rather than on identifying emergent patterns in massive accumulations of anonymous metadata.77 Or, put differently, the report argues that the distinction between metadata and other information (valued for its content) is not significant enough to allow for less regulation of the former than the latter. Indeed, it may be necessary to discard the separate categories altogether, the report insists, to recognize that the analysis of metadata threatens privacy rights as much as—if not more than—the analysis of traditional information, and to ensure that government agents meet the same evidentiary requirements in securing metadata as they do when they request access to traditional information.78

This final, relatively public document seems very much at odds with the internal documents produced by NSA bureaucrats. Whereas the goal of the internal documents seemed for the most part to be to extend the scope of data collection and analysis, the goal of this final report is to limit its scope. Moreover, in order to make the case for wide-ranging pattern recognition across mass quantities of metadata, the NSA insisted that there was an overt distinction between such metadata and the content of communications—the former immune to Fourth Amendment protections and the latter very much subject to them. Contrarily, the president’s committee asks whether it might be more useful to set aside the traditional distinction between information that does work (for example the information in the address line of an email or on an envelope) and information that communicates messages or meaning. In doing so, the apparently specious immunity from Fourth Amendment protections that metadata seem to enjoy might be eliminated. And finally, of course, whereas the internal NSA documents find the analysis of massive amounts of metadata beneficial, the president’s committee fails to see its benefit, arguing that traditional, targeted surveillance of message content, drawing on a classic system of warrants, is more supportive of what security agencies are trying to do.

Despite these apparent differences, however, the NSA’s internal documents and the report of the president’s committee are identical in one important aspect. Both seek to limit the interaction between human agents, on the one hand, and the growing, replicating, reproducing fields of metadata that underlie government practice, on the other. Both are very much in favor of the continued storage of such data, both to some extent assume that the data will continue to be collected and analyzed regardless of any attempt to halt this collection, but both see their problems—albeit quite different problems—as arising from human input. Although the NSA documents mention both the automated tool gone out of control and the badly trained agent as obstacles to analysis within the rule of law, their proposed means of overcoming such obstacles are variations on curtailing the activities of the human agent while increasing the scope of the algorithm. Indeed, the NSA documents, taken together, all suggest in various ways that as long as the algorithm is sophisticated and extensive enough, and as long as the fields of data are effectively mapped onto one another, human input and human error might be eliminated altogether.

Once again, this interpretation of the problems and solutions inherent in data mining is nearly identical to the interpretation floated by the president’s committee. According to the committee, metadata should continue to exist, “undigested” (and hence, presumably, immune from accidental defecation), in inaccessible, operational, yet purely informational environments, while human NSA agents should limit their analysis to the human communications of specific human targets. Like the ever-growing nonhuman facilities that are simultaneously storage spaces for, and guardians of, Louisiana’s embryos, that is to say, the nonhuman environments of metadata envisioned by the president’s committee are environments that both maintain and protect, that both foster data and help them to flourish. The ideal of both the NSA’s internal documents and the report of the president’s committee, in short, is the continued maintenance of growing, replicating fields of data that are closed to human contact. In the report of the president’s committee this situation will, moreover, specifically bolster “liberty.”

But what sort of liberty derives from fields of replicating data untouched by human input? Given its centrality to what remains a relentlessly biopolitical system, it is a liberty that is democratic, nonhuman, and the product, once more, of gender operations. Consider, after all, the assumptions and the logic that structure both the NSA documents and the report of the president’s committee. In each, the givens are that the metadata are already there, that they will always be there, and that they will always grow. In each, the ideal situation, in turn, is one in which these data will remain without human contact. In each, the solution to the problem is to produce an environment in which an algorithm might work through these data specifically in the name of liberty. In each set of documents, in short, the algorithm becomes the key democratic actor—the process that makes liberty happen. Each assumes that liberty and democracy are a product, solely, of algorithmic function, of gendered replication, processing, and waste, rather than of human speech.

But each set of documents also identifies an obstacle that stands in the way of this algorithmic production of liberty and democracy. Namely, the algorithm is always on the verge of processing too much information. In some cases, the tool simply goes “out of control.” In others, the problem of excessive information takes the form of inappropriate data—data, for example, that are not RAS approved, data that are associated with U.S. citizens, or data that are as unwieldy as they are unimportant and that might overwhelm the system. In every case, in other words, what seems to halt the algorithm’s path toward liberty is its encounter with overlarge data fields. In every case, the obstacle is growth, excess, and waste.

And what is the solution to this problem posed by excessive data? In short, it is threefold: first, software such as EAR can block an algorithm from accessing data that are tagged in a particular way (as not RAS approved, as domestic, or as otherwise inappropriate); second, systems can incorporate defeat lists that stop an algorithm from processing altogether when it encounters particular fields of data; and third—if counterintuitively, given the apparently limiting quality of the first two—the algorithm might be allowed to play out across seemingly unrelated fields of information (fields of spatial as well as communications data, for example) in order to make its output more coherent. These three solutions, the documents state, will preserve the integrity of future data collection. If they are taken seriously, data collection will not disintegrate.
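For readers who want the mechanics of these three solutions in concrete form, the sketch below renders them as constraints on a single graph traversal. It is purely illustrative: the actual systems are classified, and the data structures, tag names, and function here are assumptions chosen for exposition, not reconstructions of EAR, of any actual defeat list, or of any NSA algorithm.

```python
from collections import deque

def chain(graph, tags, seed, max_hops, blocked_tags, defeat_list):
    """Hypothetical contact-chaining ('hop') expansion from a seed identifier.

    graph        -- dict mapping each identifier to the identifiers linked to
                    it; communications links and, say, spatial co-occurrence
                    links can be merged into one field (the third solution)
    tags         -- dict mapping identifiers to tag sets, e.g.
                    {'not_ras_approved', 'domestic'}
    blocked_tags -- tags whose bearers the algorithm may not access
                    (the first solution: tag-based blocking)
    defeat_list  -- identifiers at which processing halts outright
                    (the second solution)
    """
    seen, frontier, reached = {seed}, deque([(seed, 0)]), []
    while frontier:
        node, hops = frontier.popleft()
        reached.append(node)
        if hops == max_hops:                 # stop expanding at the hop limit
            continue
        for contact in graph.get(node, ()):
            if contact in seen or contact in defeat_list:
                continue                     # defeat list: halt at this node
            if tags.get(contact, set()) & blocked_tags:
                continue                     # tag-based block: a dead end
            seen.add(contact)
            frontier.append((contact, hops + 1))
    return reached
```

The point of the sketch is only that all three solutions are expressible within one traversal: the first two prune particular routes, while the third widens the graph that is being traversed.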

These three solutions taken together, however, by no means limit the algorithm’s processing—and indeed, the directions in which they prompt the algorithm to move, or the alternative routes they prompt the algorithm to consider, suggest the deeper and more fundamentally thoughtful, mass democratic quality of algorithmic function. Blocking specifically tagged items and creating defeat lists, for example, are limiting only if the algorithm is purely linear. And they are useful in corralling the work of the algorithm only given a coherent, finite field of information. If, though, as each set of documents assumes, the metadata are effectively limitless—and the field (created by, say, a second or third “hop”) is linked but by no means linear—then creating a few finite dead ends simply prompts the algorithm to diversify and move around them. These solutions thus produce—specifically as they eliminate coherent human input, as they advocate linear programming to block routes taken by nonlinear algorithms, and as they aim at liberty and integrity—environmental reproduction alongside informational disintegration. And, more to the point, these solutions advocate an environmental reproduction that explicitly makes liberty—alongside an informational disintegration that makes integrity.
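The claim that a few finite dead ends divert rather than limit a nonlinear traversal can be illustrated with the same hypothetical sketch. In the toy experiment below (the sizes and structure are, again, assumptions chosen only for illustration), placing fifty identifiers from a densely linked field of one thousand on the defeat list barely shrinks what a three-hop expansion reaches, because the traversal simply finds other routes.

```python
import random

# Reuses the chain() function from the preceding sketch.
random.seed(0)
nodes = range(1000)
# A dense, nonlinear 'field': each identifier links to five random others.
graph = {n: random.sample(nodes, 5) for n in nodes}

# Place fifty identifiers on the defeat list.
blocked = set(random.sample(nodes, 50))

open_reach = chain(graph, {}, seed=0, max_hops=3,
                   blocked_tags=set(), defeat_list=set())
walled_reach = chain(graph, {}, seed=0, max_hops=3,
                     blocked_tags=set(), defeat_list=blocked)

# The second count falls only modestly relative to the first: the
# traversal routes around each dead end rather than halting.
print(len(open_reach), len(walled_reach))
```

The defeat list eliminates fifty specific routes; it does not, and cannot, bound the field.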

These two sets of documents—ethically at odds as they may appear to be—therefore theorize gender, reproduction, and democracy in nearly identical terms. For both sets, the thoughtful, reproductive algorithm is at the heart of democratic engagement. The algorithm is capable of this political engagement, however, only because liberty and integrity arise from the collection and storage of likewise reproductive and thoughtful informational trash—because liberty and integrity cannot happen in the absence of information and material that has been, first, gendered and, second, turned into waste. Moreover, each set of documents makes trashed information the centerpiece of its democratic theory in this way not because of what these wasted data might say about humans, but because these data perform a particular set of nonhuman, asexual, replicating, reproductive gender operations.

This system through which the NSA’s algorithms operate, that is to say, is a “feminine” system in the same way that asexually reproductive organic systems in both historical and ongoing scientific, political, and policy literature—on everything from paramecia to clones—are feminine systems. It is a system that produces integrity via disintegration and whose reproduction is a type of nonlinear flourishing or growth of thought and matter. It is a system that cannot distinguish between reproduction and growth or between thought and life. Moreover, as matter and information are incorporated into, or eaten by, this system (if always “undigested”), they become, explicitly, the stuff of liberty, security, and integrity. It is only because data collection rests on a series of specifically feminine and reproductive operations that it can be democratic in the way that it is. The fact that it is feminine in this way does not, however, mean that the rhetoric surrounding it is misogynist; within the theory of nonhuman mass democracy described over the previous chapters, it is indeed a system that eludes the classic, liberal democratic threat of the unruly, feminine imagination altogether.

Conclusion

The NSA’s data collection programs do not ordinarily find themselves the subject of gender analysis—even though these programs do seem to threaten one of the most gender-relevant sets of rights (that is, “privacy” rights) that exist in contemporary democracies. The hint, though, of the operation of gender that emerges from the strange centrality of reproduction and replication to both conventional, liberal, human-centered privacy doctrine and the rhetoric surrounding highly technological, seemingly antidemocratic surveillance practices becomes more than a hint when the scholarship and the documents are read together. Indeed, it becomes clear that excluding gender from conversations about surveillance and data mining is at best an irresponsible move. Data mining, after all—like cloning, like the disposal of reproductive trash, and like biological reproduction broadly defined—produces outright panic among the protectors of conventional liberal democratic engagement.79 It is one of the few threats to such engagement that seems, again like human cloning, to draw uniformly violent responses.

The bulk collection of metadata is also, however—and also like cloning—one of the few threats to democracy that seems embedded in a type of reproduction or replication that has little to do with embodied, rational human subjects. And it is thus worth asking why this specific practice that has become the representative example of surveillance gone out of control—the mass collection of metadata—is also one of the few surveillance practices that ignores actual communication or dialogue. Why is the most pernicious threat to liberal democratic engagement in the realm of surveillance, like the most pernicious threat to liberal democratic engagement in the realm of biological reproduction, the threat that ignores bodies communicating with other bodies? Why is the threat that does not actually relate to human thought or human speech the most terrifying one?

One answer to these questions, again, is that data mining actually threatens to end only human-centered democracy. It suggests the triviality of human political engagement and, simultaneously, the vitality of a centuries-old nonhuman politics. Moreover, it threatens to normalize a mass democracy of life, reproduction, and thought that makes gender analysis a centerpiece. In the assumptions underlying, and in the logic framing, both internal and public documents concerning data mining, in fact, gender—or the operation of gender—is fundamental to nonhuman democratic engagement of this sort. It is only as algorithms transform the simultaneously political, informational, and material systems across which they work into feminine systems that “liberty” becomes possible. It is only as these systems reproduce asexually—as they create life alongside death, both product and by-product, both integrated linear chain and disintegrated reproductive environment—that democracy happens. In many ways, this reconfiguration of mass democracy is a vindication not just of recent feminist theories of nonhuman politics but, more fundamentally, of Carole Pateman’s earlier, groundbreaking critique of the supposedly liberal social contract.

To conclude with a look at the scholarship that introduced this chapter: Kreimer, Slobogin, and Balkin all, in different ways, promote a theory of privacy in the absence of humans and bodies. A reading of the documents that deal with the surveillance practices that prompted this scholarship, however, makes clear that this nonhuman privacy is incoherent outside the framework of gender analysis. The privacy doctrine that data hoarding evokes—just like the broader democratic theory that it elaborates—is a privacy doctrine that protects not just thinking life but also the gender operations that make this thinking life political and democratic.