Behind the Screen
Summer 2010, Champaign, Illinois
During the hot and muggy summer of 2010, iced latte in one hand, I idly clicked my way through the New York Times online. It was a welcome, fleeting break from my doctoral studies at the Graduate School of Library and Information Science at the University of Illinois. A public land-grant institution, the university dominates the economy and geography of the small twin towns of Urbana and Champaign, which give way at their perimeter to endless cornfields on all sides. That part of Illinois is hard on anyone with seasonal allergies and holds little appeal for anyone put off by agricultural scenery.
I was spending the summer there, in oppressive heat and humidity, working as a teaching assistant and doing an independent study on digital media. Taking a break from grading and course prep, I dipped into my brief daily diversion, perusing the news stories. A small story relegated to the Tech section of the paper, “Concern for Those Who Screen the Web for Barbarity,” commanded my attention.1
The reporter, Brad Stone, set the stage: a call center company called Caleris, which provided business process outsourcing (BPO) services in a nearby and similarly agricultural Iowa town, had branched into a new kind of service.2 Content screening, or content moderation, was a practice wherein hourly-wage employees reviewed uploaded user-generated content (UGC)—the photos, videos, and text postings all of us create on social media platforms—on behalf of major internet sites. The New York Times piece focused on the fact that the Caleris workers, and those at a few other content moderation centers in the United States, were suffering work-related burnout triggered by the disturbing images and video they were reviewing. This material often included scenes of obscenity, hate speech, abuse of children and of animals, and raw, unedited war-zone footage.
Some other firms specializing in content screening, the article stated, had begun offering psychological counseling to employees who found themselves disturbed and depressed by the material they were viewing. For a mere eight dollars per hour, workers were subjected to whatever disturbing content users had uploaded to social media platforms or websites—material upsetting enough that many eventually pursued that counseling. As I read the article, it became clear that this new type of tech employment was a necessity for social media platforms, or for any company soliciting feedback or reviews of its products on the open Web. The need for brand protection alone was compelling: no company would want to solicit content from unknown and generally anonymous internet users without a means to intervene when objectionable, unflattering, or illegal material was posted. Yet, despite almost twenty years, at that point, of active participation online as a user, information technology worker, and ultimately internet researcher, until that moment in 2010 I had never heard of such workers, or even imagined that they existed in an organized, for-pay capacity such as this.
And yet, once I gave it significant thought, it seemed obvious that such a practice must exist on commercial platforms that rely on user-generated content to draw and sustain audiences. In 2014, users uploaded more than one hundred hours of video per minute to YouTube alone. Far surpassing the reach of any cable network, YouTube disseminated its content to billions around the globe. By 2015, YouTube uploads had increased to four hundred hours per minute, with one billion hours of content viewed daily as of 2017.3 By 2013, news outlets reported that 350 million images per day were being uploaded to Facebook.4 Over the past decade, the growth in the scope, reach, and economic weight of these platforms, and their concomitant labor and material impacts, have been traced in influential works by scholars such as Nick Dyer-Witheford, Jack Linchuan Qiu, Antonio Casilli, Miriam Posner, and many others.5 By 2018, simple math—taking the number of user-content-reliant social media platforms and other content-sharing mechanisms in the marketplace, such as Snapchat, Instagram, OneDrive, Dropbox, WhatsApp, and Slack, along with a rough estimate of their user bases or of the amount of content generated per day—made it clear that handling such an influx has to be a major and unending concern, requiring a massive global workforce to contend with it and a massive global supply chain of network cables, mineral mining, device production, sales, software development, data centers, and e-waste disposal to support it.
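To make that "simple math" concrete, here is a minimal back-of-envelope sketch in Python. It uses only the upload figures cited above; the flagging rate, per-image review time, and shift length are purely illustrative assumptions of mine, not reported values.

```python
# Back-of-envelope estimate of the human moderation workload implied by the
# upload figures cited above. The flag rate, per-image review time, and shift
# length below are illustrative assumptions, not reported values.

YOUTUBE_HOURS_PER_MINUTE = 400          # upload rate cited for 2015
FACEBOOK_IMAGES_PER_DAY = 350_000_000   # upload rate cited for 2013

FLAG_RATE = 0.01           # assume only 1% of uploads ever reaches human review
SECONDS_PER_IMAGE = 10     # assumed time to adjudicate a single image
SHIFT_HOURS = 8            # assumed length of one moderator's working day

video_hours_per_day = YOUTUBE_HOURS_PER_MINUTE * 60 * 24    # ~576,000 hours uploaded daily
flagged_video_hours = video_hours_per_day * FLAG_RATE        # ~5,760 hours to screen daily
image_review_hours = FACEBOOK_IMAGES_PER_DAY * FLAG_RATE * SECONDS_PER_IMAGE / 3600

moderators_needed = (flagged_video_hours + image_review_hours) / SHIFT_HOURS
print(f"Implied full-time moderators for two platforms alone: {moderators_needed:,.0f}")
```

Even under these deliberately modest assumptions, the arithmetic points to thousands of full-time reviewers for just two services, before accounting for text posts, livestreams, appeals, or any of the other platforms named above.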
When I read that article in 2010, neither YouTube nor Facebook had yet been so widely adopted by users. Nevertheless, even by that point those platforms and their mainstream online peers had already captured audiences of hundreds of millions worldwide, delivering ever-updated content to them in exchange for their eyeballs, clicks, and attention to the partner advertising running next to their hosted videos and images. It stood to reason that the YouTubes, Facebooks, Twitters, and Instagrams of the world would not simply let content flow unfettered over their branded properties without some form of gatekeeping. And, as I reflected at greater length, it seemed obvious that, notwithstanding the lack of public discussion about the platforms' filtering practices, decisions made by content screeners working in obscurity in office cubicles had the potential to markedly affect the experience of the users who created most of the content on these platforms.
Judgments related to social norms and cultural aesthetics, content curation, and compliance with internal site guidelines and external, overarching laws were being administered by some of the lowest-paid workers at the companies, who were, at the same time, being exposed to harm in that process. These workers existed behind the screen, anonymous and unknown. Who were they? Where did they work and under what conditions? What was their work life like? What decisions were they being charged with, and for whose benefit did they make them? And, most important to me, why were we not collectively talking about them, about the work they did and its impact on them, and on the internet so many of us consume and where so many of us spend our lives?
The pursuit of answers to those questions has sustained, driven, and frustrated me for eight years, and it has taken me on a trajectory around the world, from North America to Europe to the Philippine megalopolis of Manila. It has put me in contact with workers, management, advocates, artists, and attorneys; put me periodically at odds with powerful firms and entities, and at other times in the same room with them; and given me a voice with which to speak about the realities of commercial content moderation work to audiences small and large.
The Hidden Digital Labor of Commercial Content Moderation
After years of obscurity, commercial content moderation and the material realities endured by the employees who perform it have more recently made international headlines. In the wake of the 2016 American presidential election, the role of social media platforms and of the information they circulate online has been questioned by a public concerned, for the first time in significant numbers, about the way social media content is produced; the term "fake news" has entered the general discourse. In 2017, commercial content moderation became a hot topic after a series of highly publicized, violent, and tragic events were broadcast, in some cases live to the world, on Facebook and other social media platforms. These events raised questions in the public sphere about what material circulates online, how it does so, and who, if anyone, is doing the gatekeeping.
To the surprise of much of the public, as it was to me in 2010, we now know that much of the labor of these adjudication processes on platforms is undertaken not by sophisticated artificial intelligence and deep-learning algorithms but by poorly paid human beings who risk burnout, desensitization, and worse because of the nature of their work. Facebook, YouTube, and other huge tech platforms have been beset by leaks to major newspapers from disgruntled content moderators eager to make the public aware of their role and their working conditions, while the limitations and impact of social media and algorithms are increasingly studied by computer scientists, social scientists, and other researchers, such as Taina Bucher, Virginia Eubanks, Safiya Noble, and Meredith Broussard.6
After so many years in the shadows, it seems that the phenomenon of commercial content moderation—the intertwined systems of the outsourcing firms that supply the labor, the giant platforms that require it, and the people who perform the filtering tasks that keep so many vile images and messages away from users' screens—is having a moment. The topic has seized the public's consciousness across academic, journalistic, technological, and policy-making sectors largely in spite of continued opacity, obfuscation, and a general unwillingness to discuss it on the part of the social media firms that rely on these practices to do business.
To penetrate the tech corporations' stonewalling about the cloaked interventions they make that determine what you ultimately see on your screen, I realized that I would have to get in touch with the moderators themselves. Locating the moderators, and persuading them to speak about their work in light of the legal restrictions placed on them by their employers, has been challenging but crucially important. The process has been complicated by the job insecurity of many content screeners, by the nondisclosure agreements they are typically compelled to sign, and by the term-limited positions that cause them to cycle in and out of the workforce. A further challenge has been their dispersal across a worldwide network of outsourcing firms, one that sees flows of digital labor tasks circumnavigate the globe under many monikers, far removed from the platforms ultimately reliant upon them.
Throughout the interview process, I took pains not to exacerbate the difficult experiences of commercial content moderators by asking sensationalistic or voyeuristic questions; instead, I posed questions crafted with care and designed to allow the workers to express what they felt was most important about their work and lives. I have been eager to share the importance of their work, along with their own insights and self-awareness, with a larger audience. This empirical research with the content moderation workers themselves, and my subsequent analysis and reflections, have evolved into this book.
Over time, my research and its outcomes have shifted and evolved, as have the aims of the book, which are multiple. Aad Blok, in his introduction to Uncovering Labour in Information Revolutions, 1750–2000, notes that scholarly discussions of the evolutions and revolutions in information and communication technology have tended to ignore the concomitant developments in labor practices. He further notes that if, "in this respect, any attention is given to labour, it is focused mainly on the highly skilled 'knowledge work' of inventors, innovators, and system-builders."7 This book aims to avoid such an oversight in the case of commercial content moderation workers by instead foregrounding their contributions alongside other knowledge work that is typically better known and accorded higher status. Recent years have seen many important additions to the critical literature on the forms of labor necessitated by, or predicated upon, our increasing reliance on the digital, with more such studies on the horizon.
I subsequently make the case for what this type of moderation work can tell us about the current state of the internet, and about what business the internet is actually in. Indeed, if there is a central argument upon which this book hinges, it is my belief that any discussion of the nature of the contemporary internet is fundamentally incomplete if it does not address the processes by which certain content created by users is allowed to remain visible and other content is removed, who makes these decisions, how they are made, and whom they benefit. The content screeners I spoke with recognize the centrality of their work in creating the contemporary social media landscape that, as users, we typically take for granted, as if the content on it must be there by design, because it is "best," has been vetted, or has inherent value. The moderators themselves tell a different story. Their tale is often a paradoxical one, characterized by the difficult and demanding work they do to curb the flow of problematic, objectionable, and illegal material, all the while knowing that they are touching only a drop in the bucket of the billions of monetized videos, images, and streams connecting us to one another, and delivering us to advertisers, on social media platforms.
Ghosts in the Machine: On Human Traces in Digital Systems
A certain myopia characterizes collective engagement with social media platforms and the user-created content they solicit and disseminate. These platforms have long traded on an origin myth of unfettered possibility for democratic free expression, on the one hand, and, on the other, a newer promise of direct, unidirectional user-to-platform-to-dissemination media creation offered by "Web 2.0" and social media platforms (witness YouTube's on-again, off-again slogan, "Broadcast Yourself"). And yet the existence of commercial content moderation workers and the practices they engage in certainly challenge the end user's perceived relationship to the social media platform to which she or he is uploading content. For end users, that relationship is a simple one: they upload content and it appears to the world. In reality, the content is subject to an ecosystem of intermediary practices, policies, and people whose agendas, motivations, allegiances—and mere existence—are likely imperceptible and therefore not taken into account when that user clicks "upload."
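To illustrate the gap between the perceived and the actual path of an upload, here is a minimal, entirely hypothetical sketch in Python; the names, rules, and threshold are invented for illustration and do not describe any actual platform's pipeline.

```python
# Hypothetical sketch of what may sit between "upload" and "appears to the world."
# All names, rules, and thresholds are invented for illustration only.
from dataclasses import dataclass
from enum import Enum, auto

class Decision(Enum):
    PUBLISH = auto()
    REMOVE = auto()
    SEND_TO_HUMAN_QUEUE = auto()   # the intermediary step invisible to the uploader

@dataclass
class Upload:
    user_id: str
    content_type: str        # e.g., "video", "image", "text"
    classifier_score: float  # stand-in for an automated risk estimate, 0.0 to 1.0

def route(upload: Upload, review_threshold: float = 0.3) -> Decision:
    """Policy, tooling, and people intervene before content is published."""
    if upload.classifier_score >= review_threshold:
        # Content the uploader believes is already public may instead be waiting
        # in a human moderator's work queue, governed by internal guidelines.
        return Decision.SEND_TO_HUMAN_QUEUE
    return Decision.PUBLISH      # still subject to later user reports and re-review

print(route(Upload("user-123", "video", classifier_score=0.7)))
```

The point of the sketch is not its invented rules but the shape of the flow: a layer of automated triage, written policy, and human judgment occupies the space where the end user perceives a direct, unmediated connection.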
But once the existence of these intermediary agents becomes known, what else might be up for reevaluation? What other practices deserve another critical glance to identify the human values and actions embedded within them, and how does recognizing those values change our understanding of the practices themselves? In her book Algorithms of Oppression, Safiya U. Noble demonstrates how Google's search function structures harmful representations of gender, sexuality, and race.8 Miriam Sweeney asks questions about the nature of human intervention and embedded value systems in the creation of the computerized digital assistants known as anthropomorphized virtual agents, or AVAs.9 Rena Bivens documents, in a ten-year study, the impact of the gender binary in Facebook and its ramifications for gender expression and choice.10 Whose values do these platforms actually reflect? Whom do these tools and systems depict, how, and to what end?
Critical interventions intended to complicate or unveil the politics and humanity of digital technology are present at the nexus of art, activism, and scholarship. Andrew Norman Wilson, one such person working at that intersection, has sought to show the human traces within digital processes by revealing the literal handiwork of Google Books scanners. As a one-time Google-contracted videographer, he noticed a building on the Google campus that a group of workers streamed into and out of at odd hours of the day, segregated from the rest of the workforce on Google's sprawling property in Mountain View, California. He learned that these workers were contracted to produce the millions of page scans needed for Google's massive book digitization project—a project that, in its final production-ready iteration, betrays no sign of the human actors involved in its creation. When Wilson probed too deeply into the inferior working conditions and status of the scanners, he was fired, and Google attempted to confiscate the video recording he had made of the workers, which, in homage to the Lumière brothers' pioneering 1895 film Workers Leaving the Lumière Factory, he titled Workers Leaving the Googleplex.11 In this way, he linked the status and working conditions of the book scanners on Google's campus directly to those of factory workers—a much less glamorous proposition than work at Google is generally depicted as being.
It is the erasure of these human traces, both literally and in a more abstract sense, that is so fascinating, and we must constantly ask whom such erasures serve. As for the Google Books scanners, their human traces have been revealed, preserved, and celebrated as a sort of found art, through the work of Wilson as well as through a popular Tumblr blog created by an MFA student at the Rhode Island School of Design (and even lauded in that bastion of American highbrow culture, the New Yorker).12 The Tumblr blog presents a veneration of Google Books errata that is both entertaining and thought-provoking, revealing everything from fingers caught in the act of scanning to notes in the margins to scanning misfires that result in new permutations of texts.13
Wilson's work and the Tumblr blog, however, reveal little more than a hint of the human trace behind what most people assume is an automated process. Both still leave us in the dark about who made those traces and under what conditions.
In Closure
As individuals and, on a larger scale, as societies, we are ceding unprecedented amounts of control to private companies, which see no utility or benefit in providing transparent access to their technologies, architectures, practices, or finances. It becomes difficult, if not impossible, to truly know, and to hold accountable, those we engage to provide critical informational services. The full extent of their reach, their practices with the data and content we generate for them, and the people who engage with these materials are almost always unknown to and unknowable by us. By dint of this opaque inaccessibility, these digital sites of engagement—social media platforms, information systems, and so on—take on an oracle-like mysticism, becoming, as Alexander Halavais calls them, "object[s] of faith."14
Yet technologies are never neutral, and therefore are not "naturally" benign or without impact. On the contrary, they are, by their nature as sociotechnical constructions, at once reflective of their creators and created in the service of something or someone—whether as designed, or as reimagined and repurposed in acts of adaptation or resistance. This fact demands a thorough interrogation of the questions "Who benefits?" and "What are the ramifications of the deployment and adoption of these technologies for the accumulation and expansion of power, acculturation, marginalization, and capital?" We know, for example, that the histories of technological development have been marked by the structural exclusion of women, as historian Mar Hicks documents in her work on the purposeful exclusion of women from computer programming in the United Kingdom, an exclusion that subsequently eroded Britain's standing in the global rise of computing.15 Similarly, systematic racial and gender discrimination at AT&T in the United States adversely affected its burgeoning Black and Latina workforce, precluding these early tech workers from meaningful long-term careers in telephony, computing, and technology, as chronicled by scholars Venus Green and Melissa Villa-Nicholas.16
Social media platforms, digital protocols, and computer architecture are all human constructs and human pursuits, embedded with human choices and reflecting human values. Media scholar Lev Manovich poses this challenge: “As we work with software and use the operations embedded in it, these operations become part of how we understand ourselves, others, and the world. Strategies of working with computer data become our general cognitive strategies. At the same time, the design of software and the human-computer interface reflects a larger social logic, ideology, and imaginary of the contemporary society. So if we find particular operations dominating software programs, we may also expect to find them at work in the culture at large.”17
Many scholars and activists, such as those cited and discussed here, have dedicated great amounts of intellectual labor and written words to questioning the practices and the policies that render systems opaque and that turn our information into commodities within those systems. Less is known, though, about the human actors who labor as intermediaries in these systems. After all, activists, scholars, and users can address only what they can see and know, or at least imagine, and what they can engage with. This introduction to the broad issues surrounding commercial content moderation is, therefore, intended to contextualize some of the conditions of the contemporary internet: its histories, its affordances, and its absences. Commercial content moderators are intermediaries who negotiate an internet defined in the terms discussed here; they manipulate and referee (frequently in secret) user-generated content on social media platforms to the end of creating spaces that are more palatable, accessible, and inviting, and that elicit more user participation. They do so for money, and they do so in service of the goals and to the benefit of the companies that engage them. While a better user experience may be an outcome of the work they do, this is always, and ultimately, because that better experience benefits the company providing the space for participation online.