1
THE PROBLEM
A JOURNALISTIC MODEL IN TRANSITION
While not long ago the scenario that follows might have seemed like a fantasy, one could easily imagine it happening in a few years, thanks to advancements in AI.
It’s the near future. When the Newsmaker wakes up, so do all the smart devices in her home. One device starts playing the Newsmaker’s daily news digest, timed precisely for the moment she has left the bed and taken a few steps. It knows her brain isn’t ready for full-length news stories right when she wakes up.
Meanwhile, her smart house begins preparing for the day. An algorithmic coffee maker brews a cup personalized to the Newsmaker’s tastes. Her smartphone prepares a preview of the day’s calendar, and before she has left the house she has already been updated on the relevant news affecting her beat.
Outside, the Newsmaker hops into her self-driving car. On the ride to work, a podcast playlist has been algorithmically customized based on her consumption patterns.
A few miles from work, the car’s sensors detect a 10 percent reduction in air quality. This intrigues the Newsmaker. As she nears the newsroom, an algorithm that tracks social media notifies her that there’s increased chatter online about air pollution and children suffering from asthma attacks in the area.
Once she’s inside the office, she turns to her AI-powered software to access targeted data: a network of drones measures air quality and takes aerial photos. She gets the following automated assessment:
There has been a decrease in visibility—a potential indicator of high air pollution—within 5 miles of the textiles factories in the past 10 days.
Meanwhile, the Newsmaker commands an AI program to assess historical datasets, looking for correlations, some of which the Newsmaker hasn’t even asked for. An analysis of the National Institute of Environmental Health Sciences database highlights that pollution rates in the region are abnormally high when compared to historical trends.
At the same time, another AI is detecting clusters of conversations on social media from parents concerned about the health of their children. The system quickly summarizes trends in the discussion.
The Newsmaker decides to explore these findings using qualitative interviews with local sources. Some of her sources are people she has spoken to before, while others the AI suggested and located, including the manager of a local textile factory.
Later, an AI automatically transcribes the audio for all her interviews, saving the Newsmaker hours of manual work. She also wants to be sure she understood her sources correctly, so she deploys her AI to evaluate the consistency of all statements. Everything matches up.
After gathering qualitative and quantitative data, the Newsmaker is almost ready to publish a story. She directs another AI to generate a first draft that aggregates the data and summarizes previous news reports on air pollution in the area. The program teases out the story for the Newsmaker; she reviews it and makes a few edits. After the article is reviewed by an automation editor who understands the nature of how this story has been created, the piece is published across various platforms. Headline: Fairview Parents Concerned about Health Damage from Air Pollution.
This initial story gets substantial traction and many of her readers demand more information about the situation. Analyzing the article’s comments section, the relevant AI identifies a number of concerned citizens discussing whether there was negligence from the local factory owners. That’s a good angle for a story, she thinks.
The new model of journalism is most effective when it integrates real-time feedback from audiences. But this means reporters must adapt their workflow to a more engaged and dynamic form of storytelling. In this scenario, AI was used to find relevant insights and most importantly to test whether the topic is of interest to readers before investing too many resources into a full journalistic investigation. This is iterative journalism.
Machines will not replace the most important roles of the journalist. Instead, they will fuel reporters, providing more opportunities to go deeper into stories and actively connect with readers. While the transition won’t be easy, the personal contributions of the journalist will remain central to the process.
Welcome to the new world of the Newsmaker.1 This is a world of human-machine collaboration, where data is leveraged as raw material, as distinct from content. Where it can be collected from sensors, mined from news archives, and analyzed by algorithms to extract insights.
It’s a world that open-sources information gathering and treats the news consumer’s feedback as an integral part of the process. A world in which newsmakers have the freedom to be more creative, by adapting to new kinds of information dissemination. Whether they will take advantage of their newfound freedom is another question.
1.1. THE OLD JOURNALISM MODEL
Throughout its history, journalism has operated linearly: reporters conceived a story idea through laborious research and data analysis, cultivated and interviewed sources, packaged all of this information into a draft, then worked with editors to finalize their one-off articles (with limited distribution across platforms). Eventually, much later, the story was published. Only after publication did readers come into contact with that piece of journalism.
FIGURE 1.1: The traditional journalism model is linear, rigid, and structured.
For the Newsmaker, the procedural inefficiency of this process was only the beginning of the problem.
Historically, the Newsmaker was unable to get audience feedback for stories that were still being written and refined. She was only scratching the surface of the data she had available; looking for complex correlations within it wasn’t feasible for a person with tight deadlines.
In the context of modern media consumption, the limitations of this traditional model have become painfully obvious. Its rigidity prevents journalists from identifying relevant new perspectives in a story because there’s little opportunity to consider ideas before the piece is published.
We can segment the Newsmaker’s core activities in this traditional model into three processes: gathering information, packaging that information into a story, and, finally, distributing the story through her newspaper, in print or online. Each of these processes could benefit from AI.
NEWSGATHERING WAS SLOW AND LIMITED BY MANUAL PROCESSES
In the traditional model of researching and writing a story, the Newsmaker used her own archives, publicly available data, and reliable sources she had called on over the years: the city clerk, key community leaders, the president of the county’s industrial association, and so on.
The Newsmaker had a list of sources and stuck to them. Actually, she was stuck with them. They were her network, and although it was extensive, in many cases, it never felt big or diverse enough.
She conducted interviews in person, by phone, or by email, took notes on a laptop, manually transcribed audio recordings, and drafted stories using a word processor. For one story previously described, she sifted through thousands of data points and reports about factory pollution from the state’s environmental agency. This alone took her hours. Sometimes she found herself thinking that she didn’t graduate with a journalism degree just to transcribe phone calls.
The entire process of news collection was slow, manual, and heavily reliant on institutional knowledge developed by the editorial staff. Institutional knowledge is not a bad thing, as long as it doesn’t become institutionalized past the point of no return. In this case, technology, external data, and audience feedback that could have made reporting more efficient were ignored simply because the Newsmaker was pressed for time, due to the structural inefficiencies of her newsroom.
Before integrating AI and other new technologies into her own work, the Newsmaker assesses the risks of relying strictly on traditional methods for data collection. She knows that if she had more data, her stories would have included additional context, but she also knows that her time is limited. Reading user comments on the website (some of which are . . . not so nice), she realizes that her reporting could have explored other angles that would be relevant to her audiences.
How distinct would her stories have been if real-time audience feedback and sophisticated AI-powered software were available just a few years ago?
A story she wrote on college campus safety—which took two months of research and writing—would have benefited from real-time updates on the increased number of complaints from universities around the state. If she had developed that story using emerging tools that scrape publicly accessible databases and documents, she could have incorporated constant input on how these safety incidents continued to affect the schools.
This data might have come from social listening tools, algorithms that track conversation shifts on Twitter, Facebook, and online forums, or public reports sent out by universities themselves.
At the same time, this sensitive data would not have gone directly to publication before the Newsmaker reviewed it. This is why it’s not AI alone that enables rich reporting, but the combination of journalistic intuition and machine intelligence.
FIGURE 1.2: Human-machine collaboration: combining people's intuition with computers' intelligence.
The future of newsrooms depends on investments in both human and technological capital. A survey conducted in 2018 by the Reuters Institute for the Study of Journalism showed that 78 percent of respondents believed that investment in AI was needed, while 85 percent believed that human journalists would help newsrooms navigate future challenges.2
In the new model of human-machine collaboration, storytelling becomes dynamic. In the Newsmaker’s campus safety story, as the number of reported incidents rose, the content could have been automatically refreshed to convey the more dire situation. With AI, the story would have been written once but contextualized with new data on an ongoing basis—under the supervision of an editor.
Under the traditional model, the Newsmaker and her colleagues all tended to approach a story in a similar way. They scoured the data to find anomalies and then cross-referenced those anomalies with interviews. What they were missing was that anomalies in data or conflicting source perspectives are not always meaningful. Maybe, more importantly, the reverse is also true: just because a data point is not an anomaly doesn’t make it insignificant. This is where machines are now helping the Newsmaker.
In a 2017 investigation into spy planes, BuzzFeed News trained a machine-learning model to look for aircraft with flight patterns similar to those operated by the Department of Homeland Security and the FBI.3 The system, which was trained on data from twenty thousand planes, looked for attributes such as flight speed, altitude, and duration. Although the algorithm allowed the reporter to accurately identify new surveillance airplanes operated by law enforcement agencies, it also produced some inaccurate results. In some instances, the AI thought skydiving operations were spy planes, because of similarities in their flight patterns: navigating in a small area for a given period of time. The machine made a mistake that was caught by thorough human oversight.
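To make the skydiving mistake concrete, here is a minimal sketch of how a classifier that compares flight attributes can mislabel a look-alike. It is not the model BuzzFeed News built: it swaps in a much simpler nearest-centroid rule, and every number, label, and function name is invented for illustration, with speed, altitude, and duration assumed to be rescaled to a range of 0 to 1.

```python
import math

# Made-up training examples: (speed, altitude, duration), each scaled to 0..1.
# These values are illustrative assumptions, not real flight data.
TRAINING = {
    "surveillance": [(0.30, 0.40, 0.90), (0.35, 0.45, 0.80)],
    "other": [(0.90, 0.90, 0.30), (0.80, 0.85, 0.40)],
}


def centroid(rows):
    """Average each feature across the example rows."""
    count = len(rows)
    return tuple(sum(row[i] for row in rows) / count for i in range(len(rows[0])))


def classify(features, training=TRAINING):
    """Return the label whose centroid lies closest to the given features."""
    centroids = {label: centroid(rows) for label, rows in training.items()}
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


# A hypothetical skydiving plane: slow, fairly low, circling one small area.
skydiving_plane = (0.30, 0.50, 0.70)
print(classify(skydiving_plane))
```

Because the hypothetical skydiving plane sits close to the surveillance examples on all three attributes, the rule labels it as surveillance. The features alone cannot tell the two apart, which is why a reporter still has to check each flagged aircraft.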
As described above, leveraging new methods of data analysis, story sourcing, and machine-driven audience analytics can lead news organizations to new coverage topics, add a greater level of context to their reporting, and open a channel of transparency and conversation with their audience.
For example, the Financial Times used AI to develop “She Said He Said,” a bot that automatically tracks whether a source quoted in a written story is male or female.4 The system works by using text analysis algorithms that track pronouns and first names to determine the gender of people mentioned in any given article. As reporters write their piece, the bot will alert them to any imbalance in gender ratios. A previous project, JanetBot, leveraged computer vision to identify men and women in the images on the Financial Times home page.5 According to the FT, these efforts are based on “research showing a positive correlation between stories including quotes of women and higher rates of engagement with female readers.”
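The pronoun-and-first-name idea can be sketched in a few lines. This is only an illustration of the general technique, not the Financial Times's implementation: the tiny name lists, the function names, and the 70 percent threshold are all assumptions made for the example, and a real tool would need far larger name lists and a way to handle ambiguous names.

```python
import re
from collections import Counter

# Hypothetical lookup tables, kept tiny for illustration.
FEMALE_NAMES = {"maria", "janet", "sarah"}
MALE_NAMES = {"john", "david", "peter"}
FEMALE_PRONOUNS = {"she", "her", "hers"}
MALE_PRONOUNS = {"he", "him", "his"}


def count_gender_signals(text):
    """Count pronouns and known first names that signal a woman or a man."""
    counts = Counter(female=0, male=0)
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in FEMALE_PRONOUNS or word in FEMALE_NAMES:
            counts["female"] += 1
        elif word in MALE_PRONOUNS or word in MALE_NAMES:
            counts["male"] += 1
    return counts


def imbalance_alert(text, threshold=0.7):
    """Return a warning when one gender's share of signals reaches the threshold."""
    counts = count_gender_signals(text)
    total = counts["female"] + counts["male"]
    if total == 0:
        return None
    male_share = counts["male"] / total
    if male_share >= threshold:
        return f"Men account for {male_share:.0%} of gender signals in this draft."
    if male_share <= 1 - threshold:
        return f"Women account for {1 - male_share:.0%} of gender signals in this draft."
    return None


draft = "John said he was pleased. David said his team agreed. Sarah said she disagreed."
print(count_gender_signals(draft))
print(imbalance_alert(draft, threshold=0.6))
```

Counting signals rather than distinct people is a deliberate simplification. It keeps the sketch short, but it means one source mentioned many times can skew the ratio, so the alert is a prompt for the reporter to look again rather than a measurement.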
At the same time, choosing not to leverage external data or audience feedback is detrimental to news organizations, as it runs the risk of creating an echo chamber that precludes opportunities to uncover new story subjects and angles or diversify the audience.
THE TRADITIONAL PROCESS IS A ONE-SIZE-FITS-ALL SOLUTION
For the story on college campus safety, the Newsmaker only had time to develop one angle, for one audience: the same ambiguously defined audience she had always written for. As many journalists have done for years, the Newsmaker was trained to write for an imagined audience, one based on limited knowledge of who her readers actually were.
An article entitled “Fairview University On-Campus Safety Incidents Force Chancellor to Resign,” for example, would appear in the same form to a twenty-year-old woman who attends Fairview University as it does to the middle-aged father of a high school senior in the neighboring town of Franklin whose child is applying to college. The content was one-size-fits-all. And why wouldn’t it be? The Newsmaker wrote stories in the hopes of reaching as many people as possible. She couldn’t write five different stories, tailored to five different readers or platforms—she didn’t have the time or the resources.
FIGURE 1.3: The traditional model relies on a “one-size-fits-all” approach to journalism where different readers are exposed to the same piece of content.
In this traditional model, one-size-fits-all makes sense. A good portion of what journalists produce is a result of time-consuming processes, from writing to video to photography, all generated without the help of smart technology. In this model, human input is the dominant driver of the process.
But this approach is misaligned with the current media landscape. Oversupplied with content and undersupplied with technology, publishers must learn to respect the value of differentiation. The old model of media, which was once adequate, is becoming less so because people’s expectations and options for information sources have widened. Human journalists alone cannot support the growing demand for personalization.
After being reviewed by an editor, the Newsmaker’s campus safety story was published. In the old model, circulation was limited to distribution channels where the newsroom had full oversight: the weekly print paper, the website, a mobile app, and, in some cases, links on social media in hopes of driving traffic back to the home page.
This scenario illustrates just how linear the traditional model of journalism is—gathering information, creating a story for a specific use, and distributing it as a final product. But as new technologies enter the bloodstream of society, they can accelerate how newsrooms operate, setting the stage for a new model of journalism.
1.2. THE NEW JOURNALISM MODEL
Back in the present (and the future), the Newsmaker finally has the tools she needs to reimagine what storytelling can look like.
FIGURE 1.4: The modern journalism workflow is dynamic, with each step of the process being augmented by AI.
With AI-powered tools readily available, the new paradigm of journalism breaks away from the linear succession of “gather, package, and distribute” for each story. It atomizes each step and augments it with new technology.
NEWSGATHERING TAKES SHAPE OUTSIDE THE NEWSROOM WALLS
At a recent industry conference focused on journalism innovation, the Newsmaker learned about a number of new approaches that allow journalists to source and verify content created by anyone, regardless of topic, format, or location.
She became particularly fascinated with how user-generated content, drone footage, and data from sensors can provide intelligence from the field, oftentimes from areas reporters can’t access. Even without any background in artificial intelligence, the Newsmaker was growing more confident that she could integrate these new content sources into her own work. To her, the sources were complementary, not in competition.
Taking the long view, the Newsmaker recalled how the telephone had become an important newsgathering tool by the mid-1930s.6 It allowed reporters to contact sources more quickly and increased their reach far beyond the previous range. The same argument could be made for artificial intelligence today: it’s just another tool in the editorial Swiss Army knife.
For example, using the AI-powered platform News Tracer, Reuters has been able to sift through emerging topics on social media to determine if they are newsworthy and truthful, which helps reporters monitor events and find relevant stories more quickly. The tool has been particularly relevant for breaking news situations. In 2015, it revealed social media activity documenting a shooting in San Bernardino, California, before any other news organization. The next year, News Tracer warned its journalists of an earthquake in Ecuador eighteen minutes before other publishers broke the story.7
EYEWITNESSES ARE EVERYWHERE
The rise of social media platforms has given birth to an ecosystem where text posts, photos, and video provide an eyewitness perspective on major breaking news events big and small—from terrorist attacks to parades and the local high school football team victory. This happens at an unprecedented scale and with unprecedented speed.
By mining feeds from social media and other public sources, AI also enables the Newsmaker to uncover new perspectives for existing stories. It allows journalists to use the entire digital sphere as one giant source of structured information.
For example, Spanish newspaper El País leveraged data mining tool Graphext to map relationships between politicians and the media, by analyzing hundreds of social media accounts.8
Like Reuters, public broadcaster Radio France collaborated with a technology company, Dataminr, to leverage an AI-powered tool to detect outliers in social media conversations. This gave the French news organization a head start covering the 2016 bombing of the Brussels airport and the terrorist attack in Nice, because social media functioned like an early warning system for uncovering emerging narrative threads.9 This type of early detection gave its journalists more time to plan and respond to breaking news stories.
The Newsmaker can now use updates published by citizens on social media to measure public discourse, including people’s sentiments about the problems at the Fairview factories described above. Reviewing social media, public records, news archives and forums, and other sources faster than the Newsmaker can blink, AI brings a new perspective to journalism.
A recent alert sent to the Newsmaker from one of these systems read:
Mothers in East Fairview are referencing opioid death with abnormal frequency—up 250% since last week.
This type of AI tool works by clustering together people who share similar demographic or psychographic profiles and then running semantic analysis (analysis of the meaning behind language) on the updates they post online. Through this process, machines can group similar content together by analyzing patterns in its words.
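A rough sense of how a machine groups similar posts by the patterns in their words can be had from the sketch below. It uses plain word counts and cosine similarity, which capture shared vocabulary rather than true meaning, so it is a stand-in for the semantic analysis described above and not a description of any particular product. The function names, the sample posts, and the 0.5 similarity threshold are assumptions.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Turn a post into a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Score two bags of words from 0 (no shared words) to 1 (same proportions)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_posts(posts, threshold=0.5):
    """Greedy grouping: a post joins the first group whose first post is similar enough."""
    groups = []
    for post in posts:
        vector = word_counts(post)
        for group in groups:
            if cosine_similarity(vector, word_counts(group[0])) >= threshold:
                group.append(post)
                break
        else:
            groups.append([post])
    return groups


posts = [
    "air quality is terrible near the factory",
    "the air quality near the factory is terrible today",
    "school board meeting moved to tuesday",
]
for group in group_posts(posts):
    print(group)
```

The two posts about air quality share most of their words and end up together, while the school board post starts a group of its own. A production system would go further, for instance by treating “smog” and “pollution” as related, which simple word matching cannot do.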
This new approach can provide publishers with news emerging directly from citizens’ concerns via social media feeds and help journalists respond to the issues and events that are affecting their audiences’ lives at any given moment.
However, as citizens’ concerns over data privacy deepen we will experience the growth of more private, closed social networks like WhatsApp, Signal, and WeChat, which will impact this type of monitoring of social media by journalists. In fact, The Digital News Report 2018 found that private messaging was growing as a news source, with the portion of people using WhatsApp for news doubling in the past four years, to 16 percent.10 A Pew Research Center study found that 44 percent of young people aged eighteen to twenty-nine have deleted the Facebook app from their phone and 64 percent have adjusted their privacy settings in the last year.11 These trends suggest that crowdsourced reporting will have to continually adapt, in particular to a landscape where discussions of the news take place through private mediums.
One new AI-powered tool used by the Newsmaker can monitor all media outlets (whether via social media, RSS feeds, web scraping, or some other source) to determine who’s covering what topics, what’s being investigated, when it happened, and where it takes place:
UK prime minister mentions by conservative news outlets are up 5% today, compared to its weekly average. The news articles discuss concepts related to budget and healthcare reform.
To put it simply, this feed provides the Newsmaker and her colleagues with news about the news. Inside the AI, various computer programs track major media outlets while other programs analyze their content. The AI, which has been trained on many news articles, is able to read and identify topics, sentiment, and other useful features. All this, which would take the Newsmaker hours to read through, the AI can sort and surface in an efficient manner.
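The arithmetic behind an alert of this kind is simple once the mentions have been counted. The sketch below assumes the hard part (finding and counting mentions across outlets) has already been done, and only shows the comparison against a baseline. The function names, the sample counts, and the 5 percent minimum change are illustrative assumptions.

```python
from statistics import mean


def percent_change_vs_average(today_count, previous_counts):
    """Compare today's count with the average of earlier daily counts."""
    baseline = mean(previous_counts)
    if baseline == 0:
        raise ValueError("baseline average is zero; percent change is undefined")
    return (today_count - baseline) / baseline * 100


def mention_alert(topic, today_count, previous_counts, min_change=5.0):
    """Return an alert sentence when the change is at least min_change percent."""
    change = percent_change_vs_average(today_count, previous_counts)
    if abs(change) < min_change:
        return None
    direction = "up" if change > 0 else "down"
    return (
        f"{topic} mentions are {direction} {abs(change):.0f}% today, "
        "compared to the weekly average."
    )


last_seven_days = [40, 40, 40, 40, 40, 40, 40]  # made-up daily mention counts
print(mention_alert("UK prime minister", 50, last_seven_days))
```

The same calculation would produce a figure like the 250 percent rise in the earlier alert: a count that goes from 20 to 70 is up by 50, and 50 divided by 20 is 2.5, or 250 percent. Small baselines exaggerate percentage swings, so the raw counts are worth showing alongside the percentage.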
With it, the Newsmaker and her colleagues know who is covering what and, most importantly, how, which in turn allows them to better differentiate their own reporting to reach the widest possible audience with the strongest, most distinctive angle.
The technologies described above are not science fiction. The Wall Street Journal mines news archives from Factiva, a database with over thirty-three thousand sources, to monitor the ever-changing landscape of regulation and financial crime. A tool called Media Cloud, developed by researchers at MIT’s Center for Civic Media and used by emerging news organizations like Vocativ and FiveThirtyEight, found that during the 2013–2016 Ebola outbreak, there was more public engagement with stories about infections in the United States than with those about infections in West Africa, where the disease was most deadly. The finding highlighted the challenges facing global health organizations attempting to direct the narrative about the Ebola outbreak.12
There have never been enough reporters in the newsroom, not to mention enough time in the day, to research every possible news story or possible angle. In the traditional model, the Newsmaker had a tendency to get stuck in the same news cycle (bad weather, political scandal, company going out of business, local hero, and so on), but machine learning tools can help create a more diversified news agenda and sourcing practices by holding a mirror up to the publication’s own coverage. Any data point (from social media to public records and official documents, from press releases to news archives) can be used to further the journalistic mission.
In collaboration with the Laboratory for Social Machines at the MIT Media Lab, Vice News developed a story demonstrating the political divide on Twitter leading up to the 2016 U.S. presidential election.13 The MIT researchers built a series of classifiers (smart filters) to categorize Twitter users by their political ideology and location, using a type of artificial intelligence called supervised learning. Through this approach, it was possible to understand the structure and dynamics of users’ relationships across the political divide. Vice News was able to surface that Donald Trump supporters formed “a particular insular group when talking about politics,” while Hillary Clinton proponents were far less cohesive as a group. While the data did not predict the outcome of the election, it provided journalists an unprecedented analytical insight into how information bubbles might have contributed to the polarization of public discourse during a crucial period in our recent history.
The complexity of these data results shows that no matter how sophisticated such systems become, humans will still have a central role in defining best practices for the machines and interpreting results. It’s the Newsmaker’s responsibility to understand how the algorithms make causal links between data sets, to recognize when a source may prove valuable, and to know when to push forward with or step back from a story.
EMERGING FORMS OF NEWSGATHERING
The Newsmaker doesn’t only draw on existing human sources. She is also testing how to collect additional information through new techniques. Smart devices, like the sensors in her car or data beacons used to track movement, can also be used to provide more context to a story.
QUESTIONS TO ASK WHEN APPLYING AI IN THE NEWSROOM
•    CHALLENGE: What challenges are you trying to solve?
•    PROCESS: How can you translate this challenge into smaller steps?
•    DATA: Do you have the right data to solve the challenge?
•    RESEARCH: Where is the data coming from and how is it going to be vetted?
•    PITFALLS: What errors can be expected from the algorithm and how can you add editorial oversight?
Smart sensors can offer her data on traffic, weather, population density, or power consumption. With other, similar smart devices, the Newsmaker can monitor vibration and noise from entertainment and political events to identify the most popular songs at a concert, the biggest plays of a game, or the quotes that resonate most with people attending campaign rallies. Or she could monitor vibrations of construction sites to measure the impact on nearby residents and businesses, or track foot traffic at new public transportation stops, to gauge their usage.
The South Florida Sun Sentinel collected data through GPS sensors to investigate speeding police officers, leading to a series awarded a Pulitzer Prize for Public Service in 2013.14 The public radio program Radiolab leveraged temperature sensors to predict the arrival of cicadas as well as to evaluate the impact of heat stress in the neighborhood of Harlem in New York City.15
Some news organizations are even experimenting with AI-powered sensors. In partnership with NYU’s Studio 20 journalism program, researcher Stephanie Ho developed a prototype with the Associated Press of a sensor-powered camera for reporters and photographers working at large-scale public events.16 The sensor would monitor the space for triggers, like noise, and when the triggers reached a certain threshold the sensor would take a photograph and email it to the reporter.
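The trigger logic described here, firing when a reading reaches a threshold, can be sketched as follows. This is not the prototype's code: the decibel values, the 85 dB threshold, and the `on_trigger` callback (which stands in for taking a photograph and emailing it) are assumptions for illustration.

```python
def run_trigger(noise_levels_db, threshold_db, on_trigger):
    """Call on_trigger each time the noise level rises to or above the threshold.

    Firing only on the upward crossing avoids sending a burst of photos
    while the crowd stays loud.
    """
    fired_at = []
    was_above = False
    for index, level in enumerate(noise_levels_db):
        is_above = level >= threshold_db
        if is_above and not was_above:
            on_trigger(index, level)
            fired_at.append(index)
        was_above = is_above
    return fired_at


def fake_camera(index, level):
    """Placeholder for the real action (photograph, then email to the reporter)."""
    print(f"Reading {index}: {level} dB, photo requested")


readings = [60, 70, 90, 95, 80, 88]  # made-up noise readings in decibels
run_trigger(readings, threshold_db=85.0, on_trigger=fake_camera)
```

With these readings the camera is requested twice: once when the noise first jumps to 90 dB and again when it climbs back to 88 dB after a lull. Where to set the threshold is an editorial choice as much as a technical one, since a low value floods the reporter with images and a high one misses moments.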
These developments are exciting to journalists like the Newsmaker. But many newsrooms find them threatening, seeing in them the demise of the profession. A more nuanced perspective is that this technological evolution does not replace the traditional approach of researching stories. It actually increases newsrooms’ access to data and insights.
Meanwhile, with the help of universities or through collaborations with technology companies, experimenting with AI is becoming more accessible. These partnerships can be established by hosting university research fellows in the newsroom and by establishing capstone courses with journalism schools. Another effective approach for underfinanced news organizations looking to innovate in a cost-effective manner is to seek grants from foundations. For instance, the Seattle Times received financing from the Knight Foundation’s AI and the News: Open Challenge to develop a reporting project evaluating the implications of machine learning on work and labor.17
Partnering with a university researcher, the Newsmaker employs an AI system to investigate whether a $40 million investment in the train station in Fairview was a good use of public funds. The city’s transportation commissioner recently deemed the project a resounding “success,” claiming that the station’s average daily use rate is 3,000 people. By installing sophisticated AI-powered sensors that monitor the number of people who enter and leave the station, and employing an AI-enabled computer to detect and analyze images for certain objects, such as people, the Newsmaker is able to determine that the real usage rate is, in fact, closer to 1,500—half the figure the public official announced.
That data point nudges the Newsmaker to investigate further, by requesting public records on the volume of ticket sales and interviewing workers at the station. The resulting journalistic work is published with the headline, “Transportation Commissioner Inflates Passenger Estimate.”
In this instance, using AI-powered technology helped the Newsmaker keep a public official accountable.
The New York Times also deployed this kind of technology, to illustrate the power and perils of image recognition.18 Using public video footage of Bryant Park in New York City and analyzing it through Amazon’s facial recognition software (which is commercially available), journalists were able to identify thousands of faces of people who were caught on camera walking by the park. The result is an interactive article discussing the broader implications of this type of technology and its potential uses by governments. Most surprisingly, the story features interviews with some of those individuals who were initially tracked by the New York Times’s AI system.
Similar technologies applied by newsmakers are also helping their newsrooms become more efficient. Instead of spending valuable work hours transcribing interviews and manually inputting datasets, a reporter’s daily duties could be focused on making important calls and pursuing leads derived from AI insights. The reporter should be the reporter, not the assistant and the reporter.
Using AI, the Newsmaker also has the tools and computing power she needs to recognize causal links or correlations within data that she would not have noticed on her own. But even though AI flags these points of connection, it is still up to her to verify and unpack relationships within the data.
SANITY CHECKS JOURNALISTS CAN FOLLOW TO EVALUATE AI-DRIVEN RESULTS
•    CONSISTENCY: Ensure that the output is plausible and aligns with an initial understanding of the data. This means confirming that the level of magnitude of a certain result is appropriate (e.g., thousands of people vs. hundreds of people).
•    REPLICABILITY: Make sure it is possible to reproduce the output if the results are probed by editors. Journalists should keep records of the data used, methodology, and final output.
•    VERIFY: Have a colleague check and cross-reference final calculations. It is important to document the entire process, so that it can easily be explained to other journalists how a certain algorithmic result in a story was obtained.
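These three checks can each be backed by a small amount of code. The sketch below is one possible way to do it, under stated assumptions: the half-a-power-of-ten tolerance, the 1 percent agreement margin, the record fields, and all function names are invented for the example and do not come from any newsroom standard.

```python
import hashlib
import json
import math


def same_order_of_magnitude(result, expected, tolerance=0.5):
    """CONSISTENCY: check that two positive figures are within `tolerance` powers of ten."""
    if result <= 0 or expected <= 0:
        raise ValueError("both values must be positive")
    return abs(math.log10(result) - math.log10(expected)) <= tolerance


def audit_record(data_rows, methodology, output):
    """REPLICABILITY: keep a fingerprint of the data with the method and the result."""
    payload = json.dumps(data_rows, sort_keys=True).encode("utf-8")
    return {
        "data_sha256": hashlib.sha256(payload).hexdigest(),
        "methodology": methodology,
        "output": output,
    }


def independently_verified(first_result, second_result, rel_tol=0.01):
    """VERIFY: check that a colleague's separate calculation agrees within 1 percent."""
    return math.isclose(first_result, second_result, rel_tol=rel_tol)


daily_counts = [1480, 1510, 1495, 1515]  # made-up sensor counts for four days
average = sum(daily_counts) / len(daily_counts)

print(same_order_of_magnitude(average, 3000))   # the official claim
print(same_order_of_magnitude(average, 150))    # a figure that would signal an error
record = audit_record(daily_counts, "mean of daily entry counts", average)
print(record["methodology"], record["output"])
print(independently_verified(average, 1500.0))
```

Passing the magnitude check does not mean the figure is right. In the train station example, 1,500 and 3,000 are the same order of magnitude, yet the gap between them is the story. The check only catches gross errors, such as a count that comes back in the hundreds when thousands were expected.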
Perhaps the AI detected only 1,500 people entering and leaving the train station at Fairview, but its motion sensors and image recognition software did not identify children in strollers. This is where human auditing is not only crucial for AI-driven systems—it’s irreplaceable.
As AI technologies become prevalent, explainability becomes a high-demand feature when the public seeks to understand the technologies that govern their lives. Because algorithms are often written as black boxes, where only inputs and outputs are seen by their users, journalists must fill a role in explaining and examining these technologies. Therefore, even when AI is applied in an investigation, a human reporter is still crucial to the journalistic process.
DISTRIBUTING THE NEWS: NEWSROOMS ARE NOT BOUND BY A SINGLE OUTPUT TO TELL THE STORY
The Newsmaker notices that consumers are coming to the news from a variety of perspectives and platforms, with 68 percent of U.S. adults getting news from social media sites, according to a 2018 Pew Research Center study.19 Her newly AI-powered newsroom can now provide multiple story angles that suit those distinct consumers.
Even more importantly, journalists can work with AI to reimagine news as dynamic, rather than static. Historically, the news relationship has been one-way, built around the terms and timelines of the publisher, and between the news organization and a single perceived audience.
Modern media consumers are looking for immediate value in terms of information and analysis; if they don’t find it in one place, rather than invest in further interaction with the content, they will head elsewhere to learn about their world. Given that virtually all media providers are on the internet, this means that there’s one unified “attention arena,” with everyone competing in the same environment. This wasn’t the case two decades ago, when each medium had a distinct distribution channel: people watched programs on TV, listened to broadcasts on the radio, and read the news in the paper. Traditional news organizations are fighting for audiences and users’ attention with all other sources of information, not just other journalistic organizations. And that means that news publishers need not only to differentiate themselves from one another but also to differentiate themselves as an industry, to make their content competitive with other media on the internet, including games, books, and movies.
The Newsmaker has been exploring new topics to cover and unique programming that can attract new audiences across different segments. She recently used a video automation tool to generate content in niche topics such as climate change, marijuana legalization, space exploration, and other topics her younger readers are particularly interested in.
Looking beyond AI, one strategy for setting publishers apart is to expand into emerging platforms that are not currently primary sources for news. For example, the Washington Post launched its “Playing Games with Politicians” series on Twitch (a popular platform used to stream video games), allowing users to follow along as politicians are interviewed while playing a video game.20 Meanwhile, video game journalism is an emerging field where reporters work with video game designers to make the narratives and experiences the reporters cover into interactive applications. American Public Media’s Budget Hero game, which allows users to attempt to balance the federal budget, is one example of innovative reporting using an emerging media form that combines information with entertainment.21
New formats in digital storytelling, many powered by AI, have grown exponentially as competition for audiences intensifies. The dominance of social media and mobile platforms, alongside algorithms that can create new versions of a story, has prompted a fundamental shift in storytelling structure.22 Emerging and newly popularized entry points for news include imagery, timelines, social media cards, short video, virtual and augmented reality, newsletters, bots, data visualizations, listicles, long-form interactive stories, voice-enabled news, explorable explainers, alerts and notifications, and more. These new approaches do not replace traditional journalistic storytelling. In fact, they enable journalists to provide their audiences with more value and more access points to interact with information.
In 2016, news organizations including the Associated Press and Reuters leveraged AI-powered platform Graphiq to automatically generate data visualizations and insert them directly into articles to provide readers with additional context. This AI works by understanding the nature of the concepts in a story and pairing them with a relevant data visual. In some instances, publishers using this system registered a 40 percent increase in reader time on-site.23 The growth in new forms of storytelling may also require more resources and training to help manage emerging production processes and unlock all the advantage from using these AI tools.
image
FIGURE 1.5: Newsrooms must be able to produce multiple formats and distribute content across different platforms. Artificial intelligence enables journalists to do this at scale.
Given the industry-wide resource problems, how do journalists cultivate dynamic storytelling in light of time and economic constraints? The Newsmaker has discovered that one solution is the automatic creation of different versions of the same story.
By leveraging summarization technology, the Newsmaker can automatically turn a long article into a mobile-friendly post. This process relies on a type of AI called natural language processing, a class of algorithms that helps computers interpret and manipulate human language. In the case of AI-driven summaries, it works by ranking the relevance of phrases and automatically selecting the passages that convey the most critical information from the original news article.
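To make the ranking idea concrete, here is a minimal sketch of one simple form of extractive summarization, written in Python. It is an illustration under stated assumptions rather than the method any particular newsroom tool uses: relevance is approximated by average word frequency, and the stopword list, function name, and sample text are all hypothetical.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, assumed for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are",
             "was", "were", "it", "that", "this", "for", "with", "as", "by", "at"}


def content_words(text: str) -> list:
    """Lowercase the text and keep only words that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text: str, max_sentences: int = 2) -> str:
    """Keep the highest-scoring sentences, returned in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        # Average frequency of the sentence's content words across the whole text.
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


article = ("Air pollution rose sharply near the factories. "
           "Pollution levels near the factories alarmed parents. "
           "The mayor had lunch.")
summary = summarize(article, max_sentences=2)
```

In this sample the sentence about the mayor shares no frequent words with the rest of the text, so it scores lowest and is dropped. Production systems typically rely on far richer signals than word counts.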
The Newsmaker can also effectively use AI to turn sports data into hundreds of text stories at scale, and even from different perspectives, such as those of both the winning and the losing teams. This applies not only to the headline, but also to the story itself. In this case, the algorithms are helping the Newsmaker produce different versions of the same stories, something that, if done manually, would have been incredibly time-consuming.
Barcelona FC Knocked Out Again by Real Madrid
vs.
Real Madrid Continues Winning Streak Against Barcelona FC
The AI-powered news agency Narrativa, for example, is able to create 18,000 distinct soccer news articles for different leagues and teams, every week, in English, Spanish, and Arabic. These stories are then published by news portals such as MSN.com and El Confidencial.
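One simple way to picture perspective-based story generation is with templates filled from structured match data. The Python sketch below is a hypothetical illustration and does not describe how Narrativa or any other vendor builds its system; the class, the function, the headline wording, and the 2-1 score are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """Structured data for one match (field names are illustrative)."""
    winner: str
    loser: str
    winner_goals: int
    loser_goals: int


def headline(result: MatchResult, perspective: str) -> str:
    """Render the same result from the winning or the losing side's point of view."""
    score = f"{result.winner_goals}-{result.loser_goals}"
    if perspective == "winner":
        return f"{result.winner} Beats {result.loser} {score}"
    if perspective == "loser":
        return f"{result.loser} Falls to {result.winner} {score}"
    raise ValueError("perspective must be 'winner' or 'loser'")


# Hypothetical score, used only to exercise the templates.
match = MatchResult(winner="Real Madrid", loser="Barcelona FC", winner_goals=2, loser_goals=1)
for_madrid_fans = headline(match, "winner")
for_barcelona_fans = headline(match, "loser")
```

Because each version is produced from the same data record, the facts stay identical across versions and only the framing changes.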
Exploring this type of capability even leads the Newsmaker to create different story versions based on world region:
In London today, the prime minister announced . . .
In England’s capital, London, the prime . . .
In a briefing in front of Downing Street . . .
(NOTE: We will explain in detail how all of these technologies work in chapter 2.)
image
FIGURE 1.6: Artificial intelligence enables newsrooms to personalize and localize content for individual news consumers.
While localization and personalization drive higher consumption, they can also create distortions of the public sphere if implemented without editorial guidelines and well-defined journalistic standards. For example, which of the following headlines do you perceive as being more critical? Which do you think a supporter of the opposing party is more likely to click?
HEADLINE 1: The Progressive Party Pushes a Bill Demanding Increased Financial Regulation
HEADLINE 2: The Progressive Party Proposes a Bill Urging Further Financial Regulation
Generating paraphrased text is still an emerging field in natural language processing. Basic techniques include using grammatical rules and thesaurus entries to replace words in a sentence. More complex methods use AI models that are trained to translate longer passages to shorter passages by learning patterns in word sequences for full and paraphrased texts. A similar approach can be used to tailor content according to its readers, varying personality, tone, location, time of day, and more.
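The most basic technique mentioned above, replacing words using thesaurus entries, can be sketched in a few lines of Python. This is an illustration only: the three-entry thesaurus is hypothetical and hand-picked so that the first of the two example headlines is transformed into the second.

```python
import re

# Hypothetical mini-thesaurus, assumed for illustration only.
THESAURUS = {"pushes": "proposes", "demanding": "urging", "increased": "further"}


def paraphrase(sentence: str, thesaurus: dict) -> str:
    """Swap each word that has a thesaurus entry, preserving initial capitalization."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = thesaurus.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, sentence)


original = "The Progressive Party Pushes a Bill Demanding Increased Financial Regulation"
softened = paraphrase(original, THESAURUS)
```

Even this toy example shows why editorial guidelines matter: three word swaps are enough to shift the tone of a headline while leaving the underlying facts untouched.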
One pitfall to note is that news consumers tend to seek content that confirms their preexisting beliefs—a phenomenon known as confirmation bias—which can lead them to share only their particular viewpoint on social media and contribute to a more polarized online discourse.24 Newsrooms personalizing content from different perspectives should be cautious not to use these tools to feed more divisive consumption.
As long as journalists heed these precautions, newsrooms can leverage AI to advance their journalistic mission by extracting data from archives, mining it for insights, and even automatically personalizing it before distribution. Algorithms developed with artificial intelligence can convert data into stories and customize them to serve specific audience needs based on real-time feedback. Smart machines can make the process faster and more efficient. We live in a data-driven world. We always have. The only difference now is that we have the tools to measure, interpret, and process the data, and the time to develop deeper perspectives.
Automation allows news organizations to distribute higher volumes of content at lower costs, and also to produce entirely new content that would have otherwise been too expensive to create.
Needless to say, the promise of higher-quality outcomes without cost-cutting tactics is not new, nor is it necessarily unique to AI. During the industrial revolution, business magnates promised that machines would not replace human jobs, only alter them, while increasing the average quality of life exponentially. Machines would be doing the low-level jobs, leaving more complex and interesting work to the humans. But some companies used the efficiency of the new machines to reduce operational costs rather than reinvest in higher-level work. This is also a legitimate concern with regard to AI.
Encouragingly, research suggests that jobs that involve creativity, ideation, and empathy are those least likely to be automated. In a 2013 study, researchers at Oxford University found that the likelihood of computers taking over journalists’ and newspaper editors’ jobs was 8 percent, while for reporters and correspondents it was 11 percent.25 Meanwhile, the likelihood that a bank clerk’s position will be automated is 96.8 percent; for a sports coach, 38.3 percent.
A more proactive and strategic approach to this issue is exemplified by organizations such as the Associated Press, which since 2014 has used the savings it gained from automated financial stories (an estimated 20 percent of journalists’ time saved26) to train reporters in immersive media and digital storytelling. Not only was there no job loss; there were new jobs created, such as that of automation editor.
EXPLORING NEW MODELS AND POINTS OF DISTRIBUTION
Since the advent of the internet, publishers have been trying to leverage distribution channels—such as social media networks—to drive traffic to their websites. Now content can be hosted, distributed, and monetized on these third-party platforms through services including Facebook’s Instant Articles and Google’s subscription tool for publishers. For example, an analysis conducted in 2016 by AI platform Naytev showed that BuzzFeed used forty-five different distribution channels, including messaging apps and image- and video-sharing platforms.27 A staggering 80 percent of the publisher’s reach existed beyond their website.
The Newsmaker has seen the emergence of a new wave of media companies such as NowThis that emphasize syndicating their content through third-party platforms. Is this the right approach? What about maintaining control of content and its standards?
When content is distributed beyond the publisher’s properties there is a real risk that some articles will show up next to content that dramatically impacts the credibility of sound reporting. For example, if a story about an election outcome shows up in a newsfeed next to a fake article with conflicting views, this may confuse the reader as to what source to trust.
This risk emerges at the juncture of technology, editorial standards, and strategy. Also on the line is a dramatic change in revenue generation. When considering content syndication, publishers should safeguard their brand from dilution by distributing content to select partners only. However, diversification is equally important, as reliance on a single third-party platform might hinder long-term growth of both audiences and revenue. Most importantly, it’s crucial to ensure editorial control over the overall news experience, especially when consumption happens outside the publisher’s digital properties.
For most of the twentieth century and at the beginning of the twenty-first, news media companies (whether print, broadcast, or online) generated revenue primarily through subscriptions or other recurring fees and advertising, and the amount of money earned corresponded, at least indirectly, to audience size. With the advent of the internet, earlier this century, publishers increasingly turned to external platforms to build their brands.
Now publishers no longer simply aim to acquire traffic through search and social. They are also syndicating through third-party platforms such as Facebook, Twitter, YouTube, Snapchat, and more. Traffic acquisition strategies have the primary goal of driving readers back to the publisher’s own properties, while approaches to content syndication focus on engaging audiences outside the publisher’s site and monetizing them through revenue share agreements with third-party platforms.
image
FIGURE 1.7: Traffic acquisition vs. content syndication to third-party platforms.
By talking with industry peers, the Newsmaker identified a few ideas to explore when considering syndicating content.
Not only does this new distribution strategy require new business models; it also requires new ways of thinking on the editorial side. Newsrooms are becoming responsible for multiple platforms, and editors are becoming more than simply journalists—they are now “information officers,” who must adapt to different platforms while keeping an eye on the original scope of the story and the framework in which information is gathered.
Media organizations including Reuters, the Chicago Tribune, Hearst, and CBS Interactive deploy AI-powered content distribution platform TrueAnthem to determine what stories should be recirculated and when they should be posted across social media platforms.28 To make these decisions, the system tracks signals that predict performance, including the level of audience engagement, publishing frequency, and time of the day. The platform also automatically generates copy for posts using the tone and voice of the publication by indexing content and extracting descriptive metadata from the articles.
1.3. A NEW MODEL REQUIRES A NEW WAY OF WORKING
Over the years, the Newsmaker’s organization has found its budgets shrinking in the face of declining advertising and subscription revenues. The reality is that news organizations are now competing in an oversupplied news market that demands journalists create more with less.
In the midst of all of this rapid change, an email arrives in the Newsmaker’s inbox from her editor in chief. It’s entitled “The way forward.”
Dear colleagues,
In a period of disruption, the best way to anticipate change is to invest in internal capabilities and promote new thinking.
It’s crucial to bring everyone into the process of experimentation rather than establishing an independent innovation unit. This organic process starts with “agents of change” within the newsroom.
As such, we are seeking five colleagues to go through a training program focused on research and experimentation best practices.
Participants will then be responsible for bringing that knowledge back to their departments and establishing a culture that encourages new ideas as well as problem solving.
- Editor-in-Chief
The Newsmaker’s editor is right. When “innovation agents” are concentrated in a single department, problems will crop up:
•    Little or no open communication with other groups in the editorial department or with the product and technology teams
•    Too much focus on experimentation, with no real direction or alignment with the overall strategy of the newsroom
•    Isolation from important conversations happening elsewhere in the newsroom
This all results in “innovation” projects with limited impact.
LEVERAGING AI IN THE NEWSROOM REQUIRES A NEW PROCESS
Newsmakers throughout the industry are experimenting with and deploying artificial intelligence to alleviate the current straits, but to succeed the change must be organic. Newsroom transformation is not about technology; it’s about cultural change. This starts by fostering an environment where journalists are encouraged to pilot, to fail, to get feedback, to iterate. AI accelerates the process of collecting and contextualizing data, which is integral to the overall journalistic process. Deploying these capabilities requires a new way of working that:
•    Emphasizes experimentation, including making data-driven decisions to develop new content and build new products
•    Fosters collaboration, where the editorial and technology staffs work together to identify new opportunities and address existing challenges
•    Looks beyond the industry to find and implement best practices that help teams better understand audiences, new technologies, and generational shifts
This new process is called iterative journalism, which we will explore in detail in chapter 3 of this book.