13   Fake News Fingerprints

Dan Faltesek

Is Chelsea Clinton a Satanist? What about Hillary Clinton, for that matter? As Susan Faludi recalled in an editorial advocating for Senator Clinton in the New York Times, the idea of the Clinton family, particularly Hillary Clinton, as being in league with the Devil is not new: it was among the many conspiracies floated around the 1996 Republican National Convention.1 In the days leading up to the 2016 election, accusations of Satanism against the Clintons appeared again. The satanic canard even makes an appearance in the Robert Mueller indictment.2 It is not surprising that thousands of tweets also made the accusation, and the pattern of those tweets is a point worthy of further investigation.

But why would someone go to the trouble of flooding an entire network with false content of moderate interest? The Satanism story has some shock value, but it is easily dismissed by all but the most ardent of Clinton haters. There is a strategic purpose to the deployment of this content, and the networks that make these floods possible are detectable. This would seem to be the heart of the fake news crisis: not the confusion that one might feel on seeing a really convincing article on ClickHole or The Borowitz Report, but strategic action that manipulates the formation of publics.3

In this chapter, I consider the dynamics of two distinct fake news stories and the networks that promoted them: the WikiLeaks release of the Podesta e-mails and the accusation of Satanism. These are two distinct data sets that are useful for different reasons. WikiLeaks intervened in the election at a key point just a week before the vote; the goods it brought to the table were the e-mails of Clinton campaign chairman John Podesta. From pasta sauce to strategy, the e-mails were his.4 Dumped en masse, the e-mails were an effort to intervene in the election, one repeated in France during the race between Emmanuel Macron and Marine Le Pen.5 As an example, this speaks to a standard coordinated playbook that relies on the dump itself as an argument, with the hope that actors sifting through the data will produce individual claims. The satanic data set is tiny by comparison, but useful, as the insights we can take from it are so clear and the story itself is so absurd.

To understand the fingerprints of a fake news network, we need to distinguish between the types of fake news organizations and the ways that they operate. The literature reviewed in this chapter ranges from early journalism research into the meaning of fake news to election strategy, information theory, and computational work in social media. A broad view is important, as this chapter considers the decision to deploy a fake news network and the ideas necessary in that moment. After the literature review, the computational methods used in this study are documented. Each case is then considered in an independent section, one for WikiLeaks, another for Satanism. Conclusions for the study of fake news follow.

The Fake Game

There are many ways of distinguishing among fake news, propaganda, and simple errors. Fake news is an extremely broad category. Edson C. Tandoc Jr., Zheng Wei Lim, and Richard Ling reviewed thirty-four articles, finding that the term was used to describe forms ranging from satire to advertising.6 Better and worse fake news items were then distinguished by the propensity for a source to be based on facts or to be intentionally deceptive. As much as the strength of the archive is to be appreciated, the typology starts one step too far into the definition: it accepts that all material studied under the sign fake news must be fake news. A rich, internally consistent satire is not fake news, nor is an advertisement for a sale on socks. This is to say not that a tipped news article should not be studied as fake, but that many different, platform-specific strategies are needed in this area.

Similarly, models of exposure effect prediction, as demonstrated by Hunt Allcott and Matthew Gentzkow, rely on the assumption that exposure to the feed produces reliable results.7 The model of effects supposed in this research depends on the alignment of messaging. This research is important, as it is ultimately the public that votes, but it is still too far downstream, as it depends on resonance with individual voters and their fragile memories. If we look at the texts shared, rather than user perceptions, the story becomes more complicated. Considering the diffusion of stories is also important. Soroush Vosoughi, Deb Roy, and Sinan Aral use large-scale observational methods to demonstrate that false information spreads more quickly than the truth and that botnets tend to amplify false and true stories at a similar rate.8 The implication is that humans have an important role in amplifying falsehood. Fake news as a category is too broad, the analysis of effects is too difficult, and the distinction between robots and people is overrated.

Rather than attempt to analyze material that would be coded as fake or non-fake, we should turn to the structure of the games that shape the maneuvers of the producers of fake news. After all, we cannot simply call the staff of the Concord Catering Company; we can only evaluate the structure in which they made critical decisions.9 What kinds of games do they think they are playing? How do you win? What are the rules? Are there scores or pieces? It would seem reasonable to object to this line of analysis as trivializing something as important as democracy itself. The metaphor at the heart of game theory is intended not to make light of important decisions but to reveal how people play strategic games in everyday life. It would be difficult to imagine a campaign manager not having a game framework for his or her decision making. In the next section, I consider how information games are played and how these plays dovetail with what we already know about social media strategy.

Information Theory and Election Strategy

Elections are a powerful example of a zero-sum game, as they produce a single winner. It is all or nothing; we can’t both win. Elections in the United States amplify these dynamics with a combination of candidate-specific ballot access, winner-take-all counting, regularly scheduled elections, and two parties. In these games, players work to minimize losses. This is different from a non-zero-sum game, where players work to maximize their payoff.10 Most important, any strategy that can prevent an adversary from winning becomes an opportunity for offense, a place to score and win the game.

To win an election, your candidate must receive the largest number of votes. This is a deceptively simple statement. It may not be entirely clear how to vote or how to interpret the documents or electronic impressions created by a voting system. There are two distinct plays: those that seek to maximize the number of votes for a candidate and those that seek to minimize votes for the other side; the result Donald Trump > Hillary Clinton is derived from the totals for each side in the Electoral College. Campaigns play both sides and often look for strategies that have the highest probability of effect. “Get out the vote” (GOTV) operations are expensive and labor-intensive ways to secure votes among those already with a propensity to vote.11 Much of the strategy in this area depends on delivering reliable votes at a high yield. General efforts to turn out new or low-propensity voters are not a core strategy. Voter suppression strategies have a very high probability of effect and may appear as a key strategy for those who would seek to decrease votes for the other side. Dynamics of suppression can unfold both through formal structures and through the imaginary collectivity we call the public.

Publics form ad hoc and often invisibly. Michael Warner’s insight is critical here: members of counterpublics are engaged by attention alone.12 The public sphere is not the collection of interest groups that would engage institutions, but the groups of people, or people who imagine themselves to be part of groups, considering how they might take action. It would be much easier to research publics if we limited our studies to formal interactions, but the loss would be in the complexity of information. This perspective has been important for communication research, as it emphasizes the ephemerality of public life and the emergence of counterpublics. It is conceivable, then, that a communication strategy could be formed that would preclude the formation of rival counterpublics, either by flooding the channel with information for one side or by making it impossible for a transient signal to appear. An overlooked dimension of this theory as it relates to Twitter is the emphasis on the synthetic temporality of now. Publics are magical because they are both very real and imaginary at the same time.

During media events, when publics become more literal, tweeters become retweeters: Yu-Ru Lin, Brian Keegan, Drew Margolin, and David Lazer find that during media events, the total production of new content by tweeters decreases.13 Swarms of retweeters amplify already powerful voices. In terms of theorizing diffusion, Yini Zhang, Chris Wells, Song Wang, and Karl Rohe demonstrate the power of clusters of voices to amplify communication within Trump’s network.14 Networks have a powerful role in intermedia agenda setting: clusters, waves, cascades, and hordes that appear in the networks are likely newsworthy. Circularity ensues when legacy coverage of an event drives social circulation. Democratization of the media seems within reach, as new voices, if attached to a cascade, could drive legacy coverage. Zhang and her colleagues describe research in this area as developing a new paradigm for the study of amplification, and this chapter could be read as a contribution to such a project. This study employs a similar approach to the analysis of clusters and social network contributions. Users increasingly act as swarms, and those swarms can be detected. Publics become bots.

Claude Shannon’s information theory is important for understanding these games.15 The signal/noise relationship at its core is the basis of digital communication; it is at times an oversimplification of communication, but a useful simplification for understanding channels in operation. For Shannon, the problem of communication involves the translation of all meaning into discrete units, rather than the continuous flows of human experience. Entropy is a measure of the disorder, or uncertainty, of a signal. All information sources have a dimension of entropy; only those that appear in toy games or examples are perfect. In Shannon’s example of an automatic text-generating machine, it became clear that a cybernetic process, driven by the seed of regular usage tables and a number generator, could produce meaningful yet meaningless strings of text. The randomness would naturally introduce some entropy, but there would still be enough signal. It is entirely possible that a receiver accessing this text could mistake the nonsense for a message. This depends on framing, as not all finite character groups assembled in this way are semantically meaningful. On a more basic level, Shannon demonstrates that the capacity of a channel depends on the clearing of noise. As message complexity increases, vulnerability to noise also increases.
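A minimal sketch in R makes the text machine concrete; the usage table and weights here are illustrative stand-ins, not Shannon’s own. Drawing words with probability proportional to their observed frequency yields strings that are plausible in shape but empty in meaning.

```r
# A toy version of Shannon's text machine: words drawn from a usage
# (frequency) table by a random number generator. Table is illustrative.
set.seed(1948)

usage <- c(the = 12, people = 5, vote = 7, media = 4, will = 6, never = 3)

# Sample eight words, weighted by frequency, and paste them together.
paste(sample(names(usage), 8, replace = TRUE, prob = usage),
      collapse = " ")
```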

For the purposes of this study, the channel is the material that is organized within a single hashtag. This presumes that those users deploying a hashtag are attempting to produce some kind of meaningful message with that hashtag. Users would be recirculating (and occasionally creating) tweets to lead to a vote for their particular candidate.

Retweeting may serve an adaptive role, though it would appear that an over-converged swarm would have lower entropy than a more personal message. Matthew Brashears and Eric Gladstone argue that error correction is a key capacity of social networks and that corrected errors increase the variety of messages diffusing across a network.16 Error correction and variation are features of real communication, not flaws. The need for error correction is mitigated by the diffusion of low-entropy messages. Decreased risk of error explains the seductive danger of over-convergence: if everyone tweets the same thing, the individual messenger is likely to be lost in the swarm, and that messenger is also less credible.17 The problems of extremely similar messages are well known in crisis and risk communication: over-convergence can swamp a good idea. Conversely, in the world of victim selection, the use of an utterly ridiculous, low-entropy classifier can quickly parse publics, as when scammers announce that they are from Nigeria precisely to filter for the most credulous targets.18 Game and information metaphors unlock new possibilities for the study of the public sphere.
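The entropy claim can be checked with a short sketch; the two toy corpora below are assumptions for illustration, not tweets from the data set.

```r
# Word-level Shannon entropy (in bits) of a corpus.
entropy_bits <- function(texts) {
  words <- unlist(strsplit(tolower(texts), "\\s+"))
  p <- table(words) / length(words)   # empirical word probabilities
  -sum(p * log2(p))
}

# An over-converged swarm versus a handful of personal messages.
swarm  <- rep("RT @wikileaks: RELEASE: The Podesta Emails Part 24", 100)
varied <- c("Early voting lines were short in my precinct today",
            "Phone banking tonight, who wants to join",
            "Read the story before you retweet it, please")

entropy_bits(swarm)    # lower: one repeated template bounds the vocabulary
entropy_bits(varied)   # higher: more distinct terms, more uncertainty
```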

Countdown to Election Day

This data set includes 1,210,031 directed links scraped nightly via the Twitter application programming interface (API) with the text string “ImWithHer,” perhaps the most popular Clinton campaign–related hashtag during the week preceding the election. Initial analysis of this data set revealed that the node with the fifth-highest degree was Rihanna’s endorsement of Senator Clinton. However, Rihanna’s original tweet fell outside the time window of the study. This is a powerful example of information entropy: even seemingly friendly noise can overwhelm the signal.

Tweets were recorded using the twitteR package for R and analyzed using a combination of packages for R and Gephi.19 Users who are not major public figures will remain nameless. Celebrities tweeting endorsements of Senator Clinton would surely not presume that they would remain unknown. Of interest in this analysis are both the network structures among the tweeters and the semantic content of their tweets. The graphical representation of clusters and structures can be helpful for understanding the dynamics of the moment. Graphics presented here are drawn in Gephi, using a force-directed algorithm.20 This is not an aesthetically pleasing strategy—but one that reveals structure of the @ network related to a particular hashtag. This was chosen, as it would minimize the number of assumptions in the analysis. The following are the analyses of the WikiLeaks and Satanism data sets.
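A minimal sketch of the collection step follows, using the twitteR and stringr packages named in the notes. Authentication is assumed to be configured with setup_twitter_oauth(), and the edge extraction rule and file name are illustrative assumptions, not the study’s exact scripts.

```r
library(twitteR)
library(dplyr)
library(stringr)

# Pull a night's worth of tweets containing the text string.
tweets <- twListToDF(searchTwitter("ImWithHer", n = 10000))

# Build a directed edge list (tweeter -> first mentioned account),
# the Source/Target CSV form that Gephi reads directly.
edges <- tweets %>%
  mutate(Target = str_extract(text, "(?<=@)\\w+")) %>%
  filter(!is.na(Target)) %>%
  transmute(Source = screenName, Target)

write.csv(edges, "imwithher_edges.csv", row.names = FALSE)
```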

WikiTweets

Initially, the network appears sparse, with minimal original content. Of the 1,210,031 tweets, 1,027,860 are retweets. The highest-prestige node is WikiLeaks.21 WikiLeaks tweets are almost entirely identical: of the 115,544 tweets about WikiLeaks in this data set, 113,346 are retweets. First, we should look at the WikiLeaks network for structure. The measure of kurtosis for the kcores, calculated first for the full data set and then for the restricted WikiLeaks data set, is revealing: in the context of the entire network, the kurtosis of the distribution is 34.47 with a skewness of 4.19, while in the WikiLeaks distribution it is 189.65 with a skewness of 7.50.22 What does this mean?23 The far higher kurtosis and skew of the WikiLeaks subset indicate that activity is concentrated in a few extreme cores rather than spread across the network. When the data set is filtered for the network around a single story/flare, the relief of the kcores sharpens. The cone and magnet of the amplifier become visible. At the same time, the core detection method reveals not that a single community drove the WikiLeaks story, but that a few communities were far more involved in the story than others.
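The comparison can be sketched with the packages named in the notes (sna for the kcores, moments for kurtosis and skewness). The edge lists here, full_edges and wiki_edges, are assumed inputs: two-column tables of account names like the one built above.

```r
library(igraph)
library(sna)
library(moments)

# Kurtosis and skewness of the kcore distribution for a network.
core_stats <- function(edges) {
  g   <- graph_from_data_frame(edges, directed = TRUE)
  adj <- as.matrix(as_adjacency_matrix(g))  # dense adjacency for sna
  cores <- kcores(adj)                      # kcore number for each node
  c(kurtosis = kurtosis(cores), skewness = skewness(cores))
}

core_stats(full_edges)   # full #ImWithHer network
core_stats(wiki_edges)   # restricted to tweets mentioning WikiLeaks
```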

The pendanting in this data set is clear: masses of users interact exactly once. When users with a single interaction are removed, the actual scaffolding of conversation is revealed. It is also important to see the core node of this plot, WikiLeaks itself. Consider the three graphics in figure 13.1, a ForceAtlas-directed graph of tweets using the word WikiLeaks. The first panel is a full map of all interactions in the data set; the second requires five interactions for a node to be visible; the third requires fifteen. It also becomes clear that a few nodes were central to the entire network. If you cleave out either the center or the margins, the structure of users involved in ongoing conversation is scant.

Figure 13.1

Wiki network.
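The progressive filtering in figure 13.1 was done in Gephi; an equivalent filter can be sketched in R with igraph (an assumption on my part, as the study used Gephi’s filter panel). wiki_edges is the assumed edge list from above.

```r
library(igraph)

g <- graph_from_data_frame(wiki_edges, directed = TRUE)

# Keep only nodes with at least k interactions (total degree).
filter_by_degree <- function(graph, k) {
  induced_subgraph(graph, V(graph)[degree(graph) >= k])
}

g_full <- g                         # left panel: all interactions
g_five <- filter_by_degree(g, 5)    # middle panel: five interactions
g_15   <- filter_by_degree(g, 15)   # right panel: fifteen interactions
```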

Homogeneity is also a key feature of the texts distributed through this network. Of the entries, 89 percent include the stem “RELEASE,” a key indicator of a retweet storm about the Podesta e-mails, and 98 percent are retweets. The absent center reveals the gravity well structuring the public sphere: the attempt to articulate Clinton to WikiLeaks. This is a relatively clear way to detect cascades directed at a particular node, with the potential for jamming these interactions. Detecting and labeling these networks could allow for efficient advertising targeting or the deployment of countermeasures.
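Both homogeneity measures reduce to one-line checks with stringr; wiki_tweets$text is an assumed character vector of the tweet texts in the WikiLeaks subset.

```r
library(stringr)

# Share of entries containing the stem "RELEASE".
mean(str_detect(wiki_tweets$text, fixed("RELEASE")))

# Share of entries that are retweets.
mean(str_detect(wiki_tweets$text, "^RT @"))
```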

As we can see in figure 13.2, there are clearly two very large cores present; both are organized around the Twitter handles of existing celebrities. On the left, we see the full network related to the Satanism accusation; the nodes of the network are then progressively filtered for degrees of two, three, and, finally, five. It should be clear that there is very little underlying organic conversation here. At the same time, we have established that fake news networks create thin relational networks. These thin networks allow core nodes in an ecosystem, such as those spreading the satanic tweet, to appear important and thus to appear more frequently than others. Even more than the WikiLeaks example, it is apparent here that there is not a sustained conversation about Satanism, but rather an organized effort to make it appear that Satanism is an issue, even as satirical use of the story triggers a low-entropy messenger strategy: the story connects Clinton to WikiLeaks via Podesta.

Figure 13.2

They smell of sulfur.

The texts shared by the networks identified with WikiLeaks or Satanism are remarkably similar. Taking direction from an old but important source, Shannon’s mathematical theory of communication, we can use the common usage tables from the data set to produce a model of the relationship between the possible messages provided in a corpus and the possible effects of those messages.24 Intermedia agenda setting thus allows the deployment of a dissimilar low-entropy message about the same issue to have the effect of making the issue real. Bots are not the real source of impact here: people choosing not to recognize the work of bots give the story life.

If we think of this list in categories, it would not be beyond the scope of possibility to construct an algorithm that would produce messages seemingly at random utilizing the contents of the frequency tables. The vast majority of the WikiLeaks posts follow a template similar to this: RT @wikileaks: RELEASE: The Podesta Emails Part 24 #PodestaEmails #PodestaEmails24 #HillaryClinton #ImWithHer. Typically, a shortened link follows. The message contains three ideas: WikiLeaks, Podesta, and Clinton. There is no other semantic content to get in the way. The satanic example makes this clear as well, as the base tweet of the largest core presented in the graph relates to a single joke: “#ImWithHer—Satan.”25

The Fingerprint

In the context of this study, we can see constellations of nodes and edges that form the prints of different networks. These networks may not be an easily read signature, but they are an important indicator. What has come across in this research is that there are swarms of users that interact with a key story or actor in a network exactly once, and these swarms tend to have very low levels of linguistic innovation.

The fingerprint of a fake news network is the deployment of semantically similar content along a shallowly linked network. The goal of a fake news operation in a zero-sum game is to disrupt the signal of democratic deliberation (the low goal) and to replace the signal of the other side (the high goal). Even if the legacy media did not fall for the effort, agenda setting on the question of Satanism was marginally effective in articulating WikiLeaks to Clinton, and the effort to inspire the image of corruption was successful. When considered through Warner’s counter-publicity, the fleeting referent is a part of the circulation of publicity and counter-publicity: tiny messages in fleeting moments have a profound impact. Echoes may resonate as a new voice. Communication research should not allow a preference for high-entropy strategies by sophisticated interest groups to replace the study of real publicity in actually existing games. Practitioners should take care that they address publics that exist and that they pay attention to the level of entropy in real debates.

Returning to information theory, we can begin to restage Shannon’s conjecture: that a machine employing a regular usage table for human writing could produce text reliably. In our context today, the elements of the prototypical tweet could become a tweet generator. The generator would assign direct addresses; deploy a short message, which could easily be created with frequency tables from an existing data set; append three hashtags; and add a link to some other site. As a formula: sock puppets (2) + text words (8) + hashtags (3) + link (1) = a low-entropy masterwork. It is not difficult to imagine dozens or more different political tweets produced by this algorithm. In the context of our fingerprints, and the agenda setting they drove, it would be enough to simply roll through a list of sock puppet accounts to produce the appearance of a network.
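The formula is simple enough to sketch directly. Every account name, word list, and hashtag below is a hypothetical placeholder, and the link is deliberately left as a dummy.

```r
set.seed(42)

puppets  <- c("@puppet_account_1", "@puppet_account_2",
              "@puppet_account_3")                        # hypothetical
words    <- c("release", "emails", "corrupt", "truth", "media",
              "hiding", "read", "share", "wow", "proof")  # assumed table
hashtags <- c("#PodestaEmails", "#HillaryClinton", "#ImWithHer", "#MAGA")

# Sock puppets (2) + text (8) + hashtags (3) + link (1).
low_entropy_tweet <- function() {
  paste(paste(sample(puppets, 2), collapse = " "),
        paste(sample(words, 8, replace = TRUE), collapse = " "),
        paste(sample(hashtags, 3), collapse = " "),
        "https://t.co/xxxxxxx",   # placeholder shortened link
        sep = " ")
}

replicate(3, low_entropy_tweet())
```

Rolling the same function across a list of sock puppet accounts is all it would take to produce the appearance of a network.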

To build a machine that produces text would only require a list of Twitter account names, which would drive enough entropy and address structure to make a nearly identical message look at least somewhat authentic. When the referenced account names or hashtags are included, the noise sounds like a voice. Unlike a pure retweet, the bot-produced directed cascade has a more realistically networked appearance.

There are three distinct practical implications. First, in a zero-sum game, a noising operation could introduce enough entropy into a channel to overwhelm it; feeds have inherent scarcity. Second, the noise introduced would not be random but would be a strategic form designed to work at the lowest possible level of entropy. The counterargument introduced into a hashtag channel would need only a few words. Low-entropy messages, such as those seen in the Satan data set, would also inspire intermedia agenda setting, driven by their extremely low entropy and high affective potential. Beyond simply overwhelming a channel, a counterargument can be delivered as the noise. Third, swarms can be organized around realistic-looking webs of users to form a pseudo-event; when real users encounter these events, they may over-converge their messaging, enhancing the effect of the noise.

Prevention is critical. Individual users are not equipped to manage swarms of bots. Distinguishing between a swarm and a public is tricky, but necessary. The election itself may be a zero-sum game; social network firms should reimagine the feed to challenge this assumption. At the same time, as Warner’s counterpublics suggest, individual actors in the public sphere could react to the swarm by retweeting their own information and building their own counterpublics. This is more likely with smaller swarms. Counter-publicity alone is not enough: affective resources are not evenly distributed, and easily translated, highly affective meanings are likely to be reactionary. The messages we see circulating in this data set are simple: Hillary is Satan, WikiLeaks releases Clinton e-mails, and other three- and four-word wonders. This is not the stuff of technical public policy. Simple, direct, black-and-white alignments like “bad people should be punished” and “good people should be praised” will beat shades of gray. Symmetrical or reciprocal plays are not available. Campaigns need swarms of candidate-specific, positive, low-entropy messaging flowing continuously through the feed. This strategy would work best after a significant change in platform-level policies, such as cascade suppression and the removal of bot accounts.

Without platform-level change, it is not difficult to imagine a future campaign in which multiple fake news networks deploy increasingly low-entropy, highly converged messages across the platforms. These new low-entropy messages could become something of a new poetics. The question becomes: what do we do with mass-mediated politics when the height of strategy is a two-word growl?

Notes

1. Susan Faludi, “How Hillary Clinton Met Satan,” New York Times, October 29, 2016, https://www.nytimes.com/2016/10/30/opinion/sunday/how-hillary-clinton-met-satan.html.

2. The specific accusation in the indictment is that “Hillary is a Satan,” which is interesting given the context. United States of America v. Internet Research Agency LLC et al. (District Court for the District of Columbia, February 16, 2018), https://www.justice.gov/file/1035477/download.

3. These are particularly well-known satire sites. ClickHole, https://resistancehole.clickhole.com/; The Borowitz Report, New Yorker, https://www.newyorker.com/humor/borowitz-report.

4. Gregor Aisch, Jon Huang, and Cecilia Kang, “Dissecting the #PizzaGate Conspiracy Theories,” Business Day, New York Times, December 10, 2016, https://www.nytimes.com/interactive/2016/12/10/business/media/pizzagate.html.

5. This timing did not work, as election law in France is designed to suppress news related to last-second updates or October surprises. “WikiLeaks Publishes Searchable Archive of Macron Campaign Emails,” Reuters, July 31, 2017, https://www.reuters.com/article/us-france-politics-wikileaks/wikileaks-publishes-searchable-archive-of-macron-campaign-emails-idUSKBN1AG1TZ.

6. Edson C. Tandoc Jr., Zheng Wei Lim, and Richard Ling, “Defining ‘Fake News’: A Typology of Scholarly Definitions,” Digital Journalism 6, no. 2 (2017): 137–153.

7. Hunt Allcott and Matthew Gentzkow, “Social Media and Fake News in the 2016 Election,” Journal of Economic Perspectives 31, no. 2 (2017): 211–236.

8. Soroush Vosoughi, Deb Roy, and Sinan Aral, “The Spread of True and False News Online,” Science 359, no. 6380 (2018): 1146–1151, http://science.sciencemag.org/content/359/6380/1146.

9. The Concord Catering Company was one of many names used by the Internet Research Agency in the United States. United States of America v. Internet Research Agency LLC et al.

10. This central theoretical construction for game theory has up to this point been woefully underemployed in communication research. In this context, I am referencing Ken Binmore’s description of the construct, as it is adequately robust and provides the relevant context for the use of these ideas. Ken Binmore, Game Theory: A Very Short Introduction (New York: Oxford University Press, 2007).

11. GOTV is effective; at the same time, it is important to note that these methods may increase inequality. The research in this area emphasizes the difficulty of producing meaningful mobilization. Ryan D. Enos, Anthony Fowler, and Lynn Vavreck, “Increasing Inequality: The Effect of GOTV Mobilization on the Composition of the Electorate,” Journal of Politics 76, no. 1 (2014): 273–288, https://doi.org/10.1017/S0022381613001308.

12. Michael Warner, “Publics and Counterpublics,” Quarterly Journal of Speech 88, no. 4 (2002): 413–425.

13. Yu-Ru Lin, Brian Keegan, Drew Margolin, and David Lazer, “Rising Tides or Rising Stars? Dynamics of Shared Attention on Twitter during Media Events,” PLoS ONE 9, no. 5 (2014): e94093, https://doi.org/10.1371/journal.pone.0094093.

14. Yini Zhang, Chris Wells, Song Wang, and Karl Rohe, “Attention and Amplification in the Hybrid Media System: The Composition and Activity of Donald Trump’s Twitter Following during the 2016 Presidential Election,” New Media and Society 20, no. 9 (2018): 3161–3182.

15. Claude Shannon, “A Mathematical Theory of Communication,” Bell System Technical Journal 27 (1948): 379–423, 623–656.

16. Matthew Brashears and Eric Gladstone, “Error Correction Mechanisms in Social Networks Can Reduce Accuracy and Encourage Innovation,” Social Networks 44, no. 1 (2016): 22–35.

17. Kathryn E. Anthony, Timothy L. Sellnow, and Alyssa G. Millner, “Message Convergence as a Message-Centered Approach to Analyzing and Improving Risk Communication,” Journal of Applied Communication Research 41, no. 4 (2013): 346–364.

18. Cormac Herley, Why Do Nigerian Scammers Say They Are from Nigeria? (Redmond, WA: Microsoft Research, 2012), http://research.microsoft.com/pubs/167719/WhyFromNigeria.pdf.

19. The critical packages for this project include twitteR, for data collection, and dplyr and stringr, deployed through RStudio. Hadley Wickham, Stringr, version 1.3.1 (R, 2018); Hadley Wickham et al., Dplyr, version 0.7.5 (R, 2018).

20. Mathieu Jacomy et al., ForceAtlas2 (Gephi, 2015).

21. Calculated via the eigenvector mode, this is adequate but not ideal. There are other methods for calculating centrality that call for additional exploration in this context in particular.

22. The implementation of kcores used here is the base in the sna package. This was done because it is a well-regarded package for network analysis, other studies have used this function, and the function could be implemented with this data set with the resources available to the researcher. Carter Butts, SNA, version 2.4 (R, 2016). Kurtosis and skewness were calculated using the moments package. Lukasz Komsta and Frederick Novomestky, Moments, version 0.14 (R, 2015).

23. There are important issues at play here. First, networks like these are obviously non-normal, meaning that any test that relies on a normal distribution would be inappropriate. There are other statistical methods that could be applied to determine whether the change is significant. A qual-quant mix could also be a profitable strategy. It is important to note that both the Wiki and Satan sets are relatively small subsets and that a few cores have dramatically larger representation in the data set. Further, these cores could be segmented repeatedly to reveal the structure within the structure. This is likely an important future method for increasing the resolution of a network scan for bots.

24. Shannon, “A Mathematical Theory of Communication.”

25. This is a common joke that could be attributed to the data set as a whole.