Behind the Internet
This book represents the culmination of eight years of research into the work of commercial content moderation of the internet, the workers who do it, and the reasons their work is both essential and, seemingly paradoxically, invisible. Commercial content moderators are professionals paid to screen content uploaded to the internet’s social media sites on behalf of the firms that solicit user participation. Their job is to evaluate and adjudicate online content generated by users and decide whether it can stay up or must be deleted. They act quickly, often screening thousands of images, videos, or text postings a day. And, unlike the virtual community moderators of an earlier internet and of some prominent sites today, they typically have no special or visible status within the internet platform they moderate.1 Instead, a key to their activity is often to remain as discreet and undetectable as possible.
Content moderation of online social and information spaces is not new; people have been creating and enforcing rules of engagement in online social spaces since the inception of those spaces and throughout the past four decades. What is new, however, is the industrial-scale, organized content moderation activity of professionals who are paid for their evaluative gatekeeping services and who undertake that work on behalf of large-scale commercial entities: social media firms, news outlets, companies that have an online presence they would like to have managed, apps and dating tools, and so on. It is a phenomenon that has grown up at scale alongside the proliferation of social media, digital information seeking, and connected social activity of all kinds as a part of everyday life.
As a result of the incredible global scale, reach, and impact of mainstream social media platforms, these companies demand a workforce dispersed around the world, responding to their need for monitoring and brand protection around the clock, every single day. Once the scope of this activity became clear to me, along with the realization that talking about it would require coining a descriptive name for this work and the people who do it, I settled on the term “commercial content moderation” to reflect the new reality. I also use some other terms interchangeably to stand in for commercial content moderators, such as “moderators” or “mods,” “screeners,” or other, more generic terms; unless otherwise specifically noted, I am always talking about the professional people who do this work as a job and for a source of income. There are many terms to describe this type of work, and employers, as well as the moderators themselves, may use any one of them or others even more abstract.
Of course, commercial content moderators are not literally invisible; indeed, if anyone should seek them out, they will be there—in plush Silicon Valley tech headquarters, in sparse cube farms in warehouses or skyscrapers, in rural America or hyperurban Manila, working from home on a laptop in the Pacific Northwest while caring for kids—in places around the world. But the work they do, the conditions under which they do it, and who benefits from it are all largely imperceptible to the users of the platforms that pay for and rely upon this labor. In fact, this invisibility is by design.
The goal of this book is therefore to counter that invisibility and to put these workers and the work they do front of mind: to raise awareness about the fraught and difficult nature of such front-line online screening work, but also to give the rest of us the information we need to engage with more detail, nuance, and complexity in conversations about the impact of social media in our interpersonal, civic, and political lives. We cannot do the latter effectively if we do not know, as they say, how the sausage gets made.
The process of identifying and researching the phenomenon of commercial content moderation has connected me with numerous people in a variety of stages of life, from different socioeconomic classes, cultural backgrounds, and life experiences. It has necessitated travel to parts of the world previously unfamiliar to me, and has led me to study the work of scholars of the history, politics, and people of places like the Philippines, while also considering the daily realities of people located in rural Iowa. It has made connections for me between Silicon Valley and India, between Canada and Mexico, and among workers who may not have even recognized themselves as peers. It has also necessitated the formulation of theoretical framings and understandings that I use as a navigation tool. I hope to make those connections across time and space for them, and for all of us.
In the United States, using the internet as a means of communication and social connection can be traced to some of its earliest moments, such as when researchers at UCLA in 1969 attempted to transmit a message from one computer node to another connected through ARPANET—the internet’s precursor, funded by the United States Department of Defense (the result was a system crash).2 As the ARPANET evolved into the internet over the next three decades, important and experimental new social spaces grew up as part of the technological innovation in computation and connectivity; these were text-based realms described using exotic, cryptic acronyms like MOO, MUD, and BBS.3 People used command-line programs, such as “talk,” on the Unix operating system to communicate in real time long before anyone had ever heard of texting. They sent messages to one another in a new form called email that, at one time, made up the majority of data transmissions crossing the internet’s networks. Others shared news, debated politics, discussed favorite music, and circulated pornography in Usenet news groups. All were virtual communities of sorts, connecting computer users to one another years before the birth of Facebook’s founders. Each site of communication developed its own protocols, its own widely accepted practices, its own particular flavor, social norms, and culture.
Because internet-connected computers were not commonplace in the first decades of the network’s existence (personal computers had not yet become a fixture in most homes), access to this nascent internet was largely the province of people affiliated with universities or research and development institutes, mostly in the United States but also in Great Britain and northern Europe.4 Despite the seemingly homogeneous backgrounds of these early users, people found plenty about which to disagree. Political, religious, and social debates, lengthy arguments, insults, trolling, and flame wars were all common in the early days—even as we continue to struggle with these issues online today.
To contend with these challenges, as well as to develop and enforce a sense of community identity in many early internet social spaces, the users themselves often created rules, participation guidelines, behavioral norms, and other forms of self-governance and control, and anointed themselves, or other users, with superuser status that would allow them to enforce these norms from both a social and technological standpoint. In short, these spaces moderated users, behavior, and the material on them. Citing research by Alexander R. Galloway and Fred Turner, I described the early social internet in an encyclopedia entry as follows:
The internet and its many underlying technologies are highly codified and protocol-reliant spaces with regard to how data are transmitted within it, yet the subject matter and nature of content itself has historically enjoyed a much greater freedom. Indeed, a central claim to the early promise of the internet as espoused by many of its proponents was that it was highly resistant, as a foundational part of both its architecture and ethos, to censorship of any kind. Nevertheless, various forms of content moderation occurred in early online communities. Such content moderation was frequently undertaken by volunteers and was typically based on the enforcement of local rules of engagement around community norms and user behavior. Moderation practices and style therefore developed locally among communities and their participants and could inform the flavor of a given community, from the highly rule-bound to the anarchic: The Bay Area–based online community the WELL famously banned only three users in its first 6 years of existence, and then only temporarily.5
With this background in mind, the reader should view the commercial content moderators introduced throughout this book in the context of an internet that is now more than ever fundamentally a site of control, surveillance, intervention, and circulation of information as a commodity. Content moderation activity and practices have grown up, expanded, and become mission-critical alongside the growth of the internet into the global commercial and economic juggernaut that it is today. I have experienced this transformation taking place over the past two decades firsthand, and it has played a central role in my own life, which I will revisit briefly here to provide context for the internet as it exists today, versus what it used to be.
Summer 1994, Madison, Wisconsin
In the summer of 1994, I was an undergraduate at the University of Wisconsin–Madison, pursuing a double major in French and Spanish language and literature. Despite the proclivity for the humanities that I displayed in my choice of academic pursuits, a longtime fascination with computers, coupled with my newfound interest in internet Bulletin Board Systems (BBS), or text-based online communities (accessed from my dorm room over a 14.4 kilobits-per-second modem that constantly tied up the phone line, much to my roommate’s annoyance), meant that I possessed just enough computer skills to retire from being a dishwasher in my dorm’s basement cafeteria and move into the relatively cushy job of computer lab specialist in the campus’s largest and busiest computer lab. Before laptops were affordable or even portable, and prior to ubiquitous Wi-Fi, the lab, on the ground floor of the university’s graduate research library, was an almost impossibly busy place. Many of the university’s forty thousand undergrads were familiar to me because they had likely, at some point, camped out in front of a workstation in our lab.
One day, I reported for work and caught up with my colleague Roger as he strolled the floor of the lab. We stopped to contemplate a row of Macintosh Quadras (of the famous “pizza box” form factor) as they churned and labored, mostly unsuccessfully, to load something on their screens. The program’s interface had a gray background and some kind of icon in the upper corner to indicate loading was in progress, but nothing came up. (In those days, this was more likely caused by a buggy site or a choked network than anything inherent in the computer program that was running.) After a few moments of watching this computational act of futility, I turned to Roger, a computer science major, and asked, “What is that?”
“That,” he replied, gesturing at the gray screens, “is NCSA Mosaic. It’s a World Wide Web browser.”
Seeing my blank look, he explained with impatient emphasis, “It’s the graphical internet!”
My disdainful response was as instantaneous as a reflex. “Well,” I pronounced, with a dismissive hand wave, “that’ll never take off. Everyone knows the internet is a purely text-based medium.”
With that, I sealed my fate as being wrong about the future of the internet, perhaps more so than anyone had ever been before. Shortly thereafter, the graphical internet, in the form of the World Wide Web, or simply the Web, as it came to be known, did take off—to say the least. This changed the experience and culture of personal computing irrevocably. Internet connectivity went from being a niche experience available to a select few cloistered in universities and international research-and-development facilities, most of them computer and tech geeks, engineers, and computer science majors, to an all-encompassing medium of commerce, communication, finance, employment, entertainment, and social engagement. Although the internet-fueled tech sector crested and fell through boom-and-bust cycles over the next two decades, the internet, and the platforms that emerged to exist on it, became a part of everyday life. Internet access expanded, going commercial, mobile, and wireless. The American economy became tied to its successes, its excesses, and its crashes.
On April 23, 2013, the IEEE Computer Society celebrated the twentieth anniversary of NCSA Mosaic. This web browser, developed at the University of Illinois National Center for Supercomputing Applications (NCSA), was distributed free of charge, and its graphical user interface, or GUI, and focus on graphical display of information were largely credited with sparking widespread interest in and adoption of the World Wide Web.
Meanwhile, my own experience of online life was transforming from something I was loath to talk about in mixed company, owing to the explanations and excuses I had to make for time spent logged in to something so esoteric and unknown, into something far easier to share. Alongside the rise of local internet service providers (ISPs) and the ubiquity of America Online starter kits (first on floppy disks and then on CD-ROMs), use of the internet as a social and information tool was becoming more commonplace and understood. In the years that followed, Amazon, Friendster, MySpace, Facebook, and Google brought the internet out of the province of the nerd elite and into everyday life.
I have told the story of my epic prognostication failure numerous times over the years, to colleagues and students alike, using it as a pointed illustration of the dangers of becoming so embedded in one’s own experience of a technology that imagining other iterations, permutations, or ramifications of it becomes impossible. Such a lack of perspective is dangerous for a technologist, student, or scholar engaged in the study of digital technologies, all identities that I have inhabited in the twenty-five years following my initial encounter with NCSA Mosaic. Over the years, I have therefore often avoided making predictions about the ways in which technological developments might or might not transpire or take hold. Yet I return to this story now with a new perspective—one that is somewhat more gracious about the shortcomings of my observations twenty-five years ago. Perhaps what I sensed that summer day in the computer lab was discomfort at what I felt would be a massive change in how the internet was used and culturally understood.
In 1994, the internet still held, for me, the great promise and potential of a nascent communication form and a powerful information-sharing platform. I found solace in its disembodied nature. It was a place where one could try on different identities, points of view, and political stances. In the social spaces I inhabited online, participants were judged not on how they looked or by their access to material resources, but by how well they constructed their arguments or how persuasively they made a case for the position they advocated. This allowed me, for example, to experiment with identifying as gay well before I was able to do so “IRL” (in real life, in internet parlance). Having had the opportunity to partake of that identity textually made its real-world embodiment much easier than it might otherwise have been. What, therefore, might be lost in an internet that was no longer characterized by text, but by image? Even when the public Web was still embryonic, I feared that the change would lead to commercialization and, with it, to digital enclosure and spaces of control. Indeed, unlike my wildly erroneous prediction about the unlikelihood of the World Wide Web’s adoption, that fear has largely come true.
Despite my own early privileged and mostly positive engagements online, and what I and others viewed as the internet’s potential as a space for new ways of thinking and doing, all was not ideal in the pre-commercialization explosion of the Web. Although champions of “cyberspace” (as we poetically referred to it at the time, inspired by William Gibson’s cyberpunk fiction) often suggested limitless possibilities for burgeoning internet social communities, their rhetoric frequently evidenced the jingoism of a new techno-tribalism, invoking problematic metaphors of techno–Manifest Destiny: pioneering, homesteading, and the electronic frontier.6
Other scholars, too, identified a whole host of well-worn real-world “-isms” that appeared in the cyberworld, just as endemic to those spaces, it would seem, as they were in physical space. Lisa Nakamura identified the “hostile performance[s]” of race and gender passing in online textual spaces in her 1995 article “Race In/for Cyberspace: Identity Tourism and Racial Passing on the Internet.” Legal scholar Jerry Kang and sociologist Jessie Daniels have also contributed key theoretical takes on the deployment of racism in the context of online life, despite its being heralded by many proponents as color-blind.7 By 1998, Julian Dibbell had recounted the bizarre and disturbing tale of anonymity-enabled sexual harassment on LambdaMOO in “A Rape in Cyberspace,” the first chapter of his book on emergent internet social experiences, My Tiny Life, and Usenet news groups were characterized either by arcane, extensive rules for civil participation and self-governance or by the opposite: hostile flames and disturbing content serving as the raison d’être of a group’s existence.8 By 1999, Janet Abbate was helping us to understand the complexity of the formation of the internet by computer scientists, the U.S. military, and academics, in partnership with industry.9 Gabriella Coleman has illuminated the important role of hackers and others in shaping the internet outside the circumscribed boundaries of easily understood legal and social norms.10
In 1999, legal scholar Lawrence Lessig burst onto the burgeoning internet studies and cyberlaw scene with his accessible best-selling monograph Code and Other Laws of Cyberspace.11 In it, he tackled issues of content ownership, copyright, and digital rights as they related to the internet, from a decidedly pro-user, open-access perspective. As the open-source movement took on greater importance, with successes such as Linux moving from a fringe hobbyists’ operating system to a business-ready enterprise system (as when Red Hat went public), and in the wake of the rise and dramatic fall of Napster, consternation was growing among internet users about the legal perils and liabilities of file sharing and other kinds of internet use. In his text, Lessig addressed these issues head-on, arguing for greater openness and access to information as a potential source of creativity and innovation. He warned against the dangers of the continued encroachment of digital rights management (DRM), of media conglomeration (like the then-recent AOL–Time Warner merger), and of a variety of other moves that he saw as threats to the free circulation of information on the internet. Others, such as legal scholar James Boyle, discussed the need for expansion and protection of the digital commons, a historical allusion to and metaphor based on the closing of the physical commons of sixteenth-century England.12
This fledgling social internet was therefore not without problems, not least because it was a rarefied and privileged space. Yet it was during this early period that everyday use of the internet by commercial entities, government agencies, students, and lay users also began to grow massively. The Pew Research Center’s Internet & American Life Project characterized the growth as facilitated by access to three interrelated technology sectors: broadband, mobile internet-enabled devices, and social media platforms. Scholars, including Lessig and Boyle, internet activists, and organizations (such as the Electronic Frontier Foundation, or EFF) focused on concerns over the potential for increased surveillance and control by corporate and government entities, enabled by the very same technologies that allowed Americans to get online and stay online—for work and for leisure—in unprecedented numbers.
Historically, legal jurisdiction had functioned in direct relationship to the geographic and political borders that defined a particular region or state. These borders were commonly understood and recognized by those subject to their laws, allowing for the consent of the governed that is necessary for the enforcement of laws. The development of national and international media of various types (such as newspapers and radio) certainly challenged the notion of clear-cut borders, but not nearly to the extent, or with as much fanfare (or threat), as the internet did in its early transition to a major consumer, commercial, and social media site of engagement.
For many of its proponents, the promise of the early internet was that it knew no geographic boundaries; it seemed to transcend international borders and to exist in a space that was both geographically territory-less and its own distinct location simultaneously. The internet was paradoxically nowhere and everywhere, constituting a brave new borderless world and suggesting to many, among other things, an untapped and exciting potential for access to ideas and speech that, in some areas, was otherwise precluded by the state. Early cyberlibertarian and technologist John Gilmore, for example, was famously quoted as saying that the very architecture of the internet was structurally immune (or at least highly resistant) to any censorship of information, accidental or otherwise, that traveled through its interconnected nodes. Another early internet luminary, John Perry Barlow, famously issued a Declaration of the Independence of Cyberspace that actively challenged and rejected traditional government control, legislation, and jurisdiction over the internet.13 Large internet companies even claimed that restricting access or content based on users’ geographic location (and, hence, their legal jurisdiction) was not only impractical but, in essence, technologically infeasible—a claim later famously disproven in a court case that led to geolocation and content limiting based on IP address.14
Yet today the vast majority of what most people consider “the internet” is, in fact, the province of private corporations over which users can exercise virtually no control. These companies are often major transnational conglomerates that enjoy close relationships with the governments of their countries of origin. This privatization occurs across all levels of connectivity and access, from the backbones that connect the world’s networked computers (of which there are only five major ones, with second- and third-tier backbones largely in the hands of a few transnational media and communications conglomerates) to the content delivered within privately held platforms.15
Commercial content moderation is a powerful mechanism of control that has grown up alongside, and in service to, these private concerns that are now synonymous with “the internet.” It is part and parcel of the highly regulated, mediated, and commercialized internet of platforms and services owned by private corporations. The practice is typically hidden and imperceptible. For the most part, users cannot significantly influence or engage with it, and typically do not know that it is even taking place; these are themes raised throughout the interviews with current and former commercial content moderators contained in this book. The context provided in this chapter therefore serves as the backdrop to the environment in which all of the workers I spoke to operate, and as a foundation for fully understanding their insights.
The narrative contained in this book represents the first eight years of a scholarly endeavor that I expect to be the research agenda of a lifetime. Who I am in terms of my own positionality—identities, life experiences, perceptions, and other facets of self—is a key part of the story of what I have uncovered. My own experiences online, which have mirrored the development and adoption of the commercial internet as a part of everyday life, and my subsequent work as an information technologist have driven my interest in the phenomenon of commercial content moderation and the lives of its workers.
My first experiences with nascent online communities—characterized by tedious and often contentious self-governance, voluntary participation, veneration of status-holding and power-wielding community leaders, a predisposition to the primacy of computer geek culture, and DIY sensibilities (it was not uncommon for a system to be hosted, for example, on a cast-off mainframe computer in someone’s closet)—have contextualized and framed my approach to my own life online. Years later, those early experiences were thrown into sharp relief as I read, for the first time, about a group of Iowa-based content screeners one state over—people who probably looked very much like me, and whose lives, like mine, revolved around the internet. But in the twenty-five years since I first logged on, the landscape had changed dramatically. Online work had gone from being niche employment for a select Bay Area few to the norm for millions, and the promise of the digital economy was at the center of federal technology and employment policy, as well as an aspiration for many. Commercial content moderation is a job, a function, and an industrial practice that exists only in this context and could only ever exist in it.
Today, important insights into how we perceive the contemporary internet have been offered by digital sociologists Jessie Daniels, Karen Gregory, and Tressie McMillan Cottom, as well as by researchers from legal, communications, and information studies traditions such as Danielle Citron, Joan Donovan, Safiya U. Noble, Sarah Myers West, danah boyd, Siva Vaidhyanathan, Zeynep Tufekci, and Whitney Phillips, among others, who study the impact of online hate and hostility and the role of social media platforms in fostering ill effects on individuals, societies, and democracies. It is my hope that this book will complement and add to this important dialogue and will serve to both enrich and complicate our understanding of online life today.16
Chapter 1 is the tale of when the work of professional internet moderators first significantly came to light beyond the confines of industry, thanks to a key article in the New York Times, and my own introduction to it. I recount this moment in 2010 and draw links to other areas of obfuscated human labor and intervention in the internet ecosystem in order to describe the scope and stakes, then and now, for commercial content moderation and its impact.
The concept of commercial content moderation, and the contexts in which people undertake this work, are described in detail in Chapter 2. This chapter maps content moderation contextually and theoretically and develops a taxonomy of the practice, introducing cases that exemplify the practices and conditions of commercial content moderation. It situates the practice within historical and contemporary discussions of digital labor and the digital economy writ large, providing examples of recent high-profile cases with concomitant analysis.
Chapter 3 introduces three workers employed as contractors at a major Silicon Valley internet giant referred to pseudonymously as MegaTech. Featuring the workers largely in their own words, the chapter describes the workplace culture and daily experiences of contract workers in the Valley environment. The workers speak about the stress of their jobs and its negative effects on their lives at work and at home. I argue that the workers’ insights into the nature of content moderation labor reveal a complex and problematic underside to the social media economy, and to broader notions of internet culture and policy. The workers are remarkably self-aware and perceptive, and the chapter captures the richness of their voices and experiences by including many powerful excerpts from interviews with them, along with my own analysis.
Earlier studies of similar low-wage, low-status fields (for example, call centers) or of work involving screening tasks exist and have proven instructive in framing the analysis for this book. In her study of airport security screeners, Lisa Parks cites a congressional hearing in which work involving relentless searching and screening via video was described as a “repetitive, monotonous and stressful task that requires constant vigilance.”17 In the case of content moderators, not only is their work likely to be monotonous, but it also frequently exposes them to disturbing images whose hazards go unnoticed because they are not necessarily physically apparent, immediate, or understood.
Following the focus on contractors at Silicon Valley’s MegaTech, Chapter 4 turns to the work lives of moderators in two additional and distinct contexts: one working in, and managing, a boutique social media specialty firm and another supplying contract labor for an online digital news site. Workers featured in this chapter contribute important insights into their specific contexts, elucidating the divisions in commercial content moderation as it is undertaken in distinct organizational settings while also drawing links among the experiences and observations of people who do this work worldwide.
In Chapter 5, I focus on the work and lives of a group of moderators in Manila, the Philippines. In 2013, the Philippines, with a fraction of India’s population, surpassed it as the call center capital of the world. Filipino workers, much like others who work in call center environments globally, must perform cultural and linguistic competencies every day as they vet content originating in, and destined for, parts of the world very different from where they are located. This chapter offers the comparative case of commercial content moderation in the Philippines to argue that this phenomenon, as outsourced to the Global South, is a practice predicated on long-standing relationships of Western cultural, military, and economic domination that social media platforms exploit for inexpensive, abundant, and culturally competent labor. It describes the experiences of five Filipino content screeners, in their own words, against a historical and contemporary backdrop that familiarizes the reader with their work life in modern Manila.
In Chapter 6, the book concludes with an informed yet speculative look toward the future of content moderation and of digital work in general. It discusses where commercial content moderation practices may be heading, in terms of regulation and other pressures on social media firms for greater accountability and transparency. It also addresses platforms’ claims that artificial intelligence will be able to supersede human moderation. I argue that while social media firms may no longer be able to conceal the active intervention of content moderation workers, it is not clear that merely bringing their activities into the light will result in a better workplace environment or improved status. I suggest that this hidden cost of social media use may well be endemic to the platforms, with a bill that will come due at some point in the future in the form of damaged workers and an even more frightening social media landscape. This chapter closes with an overview of the current state of commercial content moderation, including legal and policy developments in a number of European countries, such as Germany, Belgium, and Austria, and at the European Union level, that push back on major platforms’ unilateral management of content moderation; discussion of landmark lawsuits involving commercial content moderation workers at Microsoft and now at Facebook; and an analysis of the implications of the larger social uptake and public consciousness of commercial content moderation.
Behind the Screen is a long-form overview of the commercial content moderation phenomenon that cuts in at the level of the personal, focusing on the work lives of those who implement social media policy and who are on the front lines. It is long in its gestation and covers significant chronological and theoretical ground, but it is not a definitive or final statement. Rather, it enters into a dialogue already under way, in the hopes of complementing extant work by a variety of academics and advocates focused on commercial content moderation specifically and content moderation in general. That dialogue includes voices concerned with content moderation and legal perspectives, human rights and freedom of expression, platform governance and accountability, and the future of the internet, among many other points of view. Key works in this area have been published or are forthcoming by Kate Klonick, James Grimmelmann, Tarleton Gillespie, Sarah Myers West, Nikos Smyrnaios and Emmanuel Marty, Nora A. Draper, Claudia Lo, Karen Frost-Arnold, Hanna Bloch-Wehba, Kat Lo, and many others.18 My own thinking and conceptualizing have been greatly enriched by their work, and I encourage all interested readers to seek it out.