In September 2007, Dan Wagner left the office of his Chicago economic consulting firm for the final time and walked up Michigan Avenue to join a presidential campaign. The twenty-four-year-old had been a consultant for two years. A self-described motorhead who had grown up in the Detroit area fixing cars with his father, Wagner had exulted when he learned one of his clients would be Harley-Davidson, only to later despair when the assignment left him isolated for a year crunching numbers in a small room in the bike maker’s Milwaukee headquarters. He thought back to something he once heard from Steven Levitt, who taught in the University of Chicago’s economics department while Wagner was studying there as an undergraduate major. Levitt told his students they would be smart to live below their means, so they could always have the flexibility to afford taking a different job if it was lower-paying. Wagner had never worked on a campaign, but like many of his generation who yearned for greater meaning from public life, he was enthralled by his state’s junior senator. When Barack Obama announced his candidacy for president in February 2007, Wagner looked at the eight thousand dollars he had saved from his consulting paychecks and decided he was ready to deplete it in the service of heeding Levitt’s advice. Obama’s campaign didn’t care that Wagner’s résumé was devoid of political accomplishments; in fact, it was the experience Wagner described making economic-forecasting models that would help Obama get elected president.
Wagner was dispatched to Des Moines to serve as deputy manager of Obama’s Iowa voter file, at a salary of $2,500 per month. He did not know quite what to expect, and some of his first tasks demanded frustratingly little of his expertise. Because the campaign’s state office did not yet have any tech staff, responsibility for fixing colleagues’ sputtering Outlook software fell to the closest thing there was to a computer guy, which was Wagner. One day, Mitch Stewart, Obama’s Iowa caucus director, walked over to his desk and threw a stack of handwritten cards upon it. “Enter these into the VAN!” Stewart said.
Supporter cards, signed pledges to caucus for Obama the following January, were a long-standing feature of political life in Iowa. Perhaps because they would end up literally standing with their preferred candidate at their precinct caucus, Iowans had little expectation of privacy and were generally happy to commit their support in writing. Much of a campaign’s canvassing operation would be devoted to collecting the cards and using them to perform triage on the electorate: putting pledged supporters aside as eventual turnout targets (or cultivating them as volunteers) and identifying the undecided as persuasion targets, all while trying to discern which voters were committed to an opponent, so those could be cut out of future efforts. Every day for a month, Wagner would decipher the name and contact details scribbled on the cards, transcribe them into a computer, and try to match each one to a record in the Iowa voter file, which contained richer information on the signer’s party affiliation and vote history.
The VAN made this easy. The Voter Activation Network was the Web interface that field organizers used to interact with the huge voter databases maintained by the campaign. The software was developed for Iowa Democrats in 2001, shortly before the federal Help America Vote Act had been enacted to reconcile the patchwork of inconsistent local election laws that came to be viewed by many, especially on the left, as a national outrage. The law encouraged states to centralize their electoral data and organize their voter files in standard formats that for the first time made it easy to manipulate records across state lines. By the 2006 elections, Democrats had access to two competing national databases, one controlled by the national party and the other by Catalist, and the VAN emerged from a pack of state-specific interfaces to become the national standard for voter contact across the left.
By the time Wagner familiarized himself with its features, the VAN was nearly a full-fledged digital substitute for the clipboarded voter lists and large wall maps that were the familiar trappings of campaign fieldwork. The software assigned every individual a unique, seven-digit VAN identification code that was supposed to serve in essence as the political equivalent of a Social Security number—a durable marker that would stick with a voter throughout his or her lifetime. In a mobile country, voters would no longer be traceable only as a name at a fixed address but could be followed when they relocated, even across county or state lines, and their political behavior collected throughout. That individual record could be synced instantly across platforms, so that once Wagner entered a supporter’s name and address off a card into the VAN, any canvasser who called up the individual’s name on a Palm Pilot application would see that they had been marked, for instance, as a GOTV target with no need for further persuasion.
What Wagner didn’t know is that those supporter cards were also helping to make similar determinations about voters who had not pledged themselves to Obama and might never communicate with the campaign at all. The data Wagner entered—along with all the records of door knocks and phone calls made by Obama’s growing army of staff and volunteers across Iowa—was being fed into a database that linked up nightly with racks of computers that filled a small room in a converted Capitol Hill apartment building just blocks from the United States Senate. Most days, the only noise that competed with the whirring of the fans tempering those computers’ processors was the sound of Obama himself asking for money, audible through a thin wall from the political office that the freshman senator used to make his fund-raising phone calls. Both would prove to be essential engines of his rise.
Sandwiched between the heroic presidential candidate who positioned himself as uniquely able to loosen a nation’s intellectually sclerotic politics, and the unrivaled hordes of volunteer activists and supporters who believed in him, sat one of the vastest data mining and processing operations that had ever been built in the United States for any purpose. Obama’s computers were collecting a staggering volume of information on 100 million Americans and sifting through it to discern patterns and relationships. Along the way, staffers stumbled onto insights about not only political methods but also marketing and race relations, scrubbing clean a landscape that had been defined by nineteenth-century political borders and twentieth-century media institutions and redrawing it according to twenty-first-century analytics that treated every individual voter as a distinct, and meaningful, unit. “It wasn’t something they were doing off on the side anymore; it was integral to how they did everything,” says Jeff Link, who oversaw Obama’s paid-media spending in southern and western battleground states. “That was the first campaign where you had that level of integration.”
The 2008 Obama campaign would become, in a sense, the perfect political corporation: a well-funded, data-driven, empirically rigorous institution that drew in unconventional talent ready to question some of the political industry’s standard assumptions and practices and emboldened with new tools to challenge them. “It was like the old Bell Labs,” Larry Grisolano, a senior Obama strategist, says of the analytics teams assembled at Chicago headquarters. “They had a lot of ability to create and innovate without being concerned what the outcome was. There was a laboratory attitude with those guys. It was the overwhelming culture of the campaign.”
THE SECOND-FLOOR CAPITOL HILL OFFICE with the computers, in reality little more than a closet-sized apartment, was the official Washington address of Ken Strasma, whose black-box algorithms had become something of a legend in Democratic data circles but were almost entirely unknown outside them. His appearance made the case for anonymity. Tidy and plain-featured, the Wisconsinite looked like he came out of middle management in the middle of the country in the middle of the last century.
Strasma had worked as research director of the National Committee for an Effective Congress, which started mapping and scoring precincts in the 1970s to give Democrats their first resource for systematically targeting voters. In the late 1990s, Strasma had watched as the precinct was displaced by the individual as the essential unit of targeting, a shift manifest in the form of a personal rivalry between his boss, Mark Gersh, and Hal Malchow, played out in sparring memos that did little to mask the two men’s mutual resentment. Strasma may have worked for Gersh, but Malchow’s approach won his heart and mind. While working for state legislative candidates in Minnesota in 1996, Strasma conducted large-scale polls in each of the small districts, and used the results to find personal targets, based on voter-file attributes and Census-tract data, much as Malchow had done earlier that year in Oregon. “We were tiptoeing into the individual level,” Strasma says. “I was doing microtargeting before we had a name for it.”
After 2002, the ability to do that type of individual-level targeting improved significantly. Greater computer speeds made it easier to swiftly churn through millions of records. Perhaps most important, the release of data from the 2000 U.S. Census created a reservoir of free, up-to-date information unavailable elsewhere; tract-level figures that in 1998 were nearly a decade old had been refreshed to account for years of movement and demographic change. “At the time, individual targeting was basically nonexistent, so anything we did would be an improvement,” says Strasma. But it threatened NCEC’s monopoly as a provider of targeting guidance, and Gersh took Strasma’s development of the new specialty as something of a betrayal, like a child abandoning his parents’ faith to join a cult. “It seemed like the same general field, just going into another niche,” says Strasma. “I tried my best to tiptoe very carefully around the politics of it.” In 2003, Strasma quit NCEC to open his own firm, Strategic Telemetry, which rightfully evoked a distant scientific frontier inaccessible to the naked eye and traditional tools. His firm’s products would be what Strasma called “virtual IDs.”
The “hard ID”—a voter who tells a caller or canvasser which candidate he or she supports—remained the truest currency in predicting support, a certain vote as long as the voter could be turned out. But no campaign had ever been able to hard-ID every voter in its universe, or even a majority of them. The costs or volunteer demands were almost always prohibitive, and it was getting much harder: the proliferation of cellphones and caller identification made it simply impossible to get through to a significant share of the population. Except in precincts where they had a strong partisan advantage, campaigns would often be forced to forget about those they couldn’t reach and turn out only those whom they had individually identified as supporters.
Strasma believed it would be possible to simulate IDs for the whole electoral universe, regardless of whether the campaign was ever able to talk to voters directly about their preference. By writing statistical algorithms based on known information about a small set of voters, he could extrapolate to find other voters who looked—and presumably thought and acted—like them. If he could identify enough matchable variables from one set to the next, the campaign could treat these virtual IDs as an effective replacement for hard IDs where it couldn’t get them.
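The extrapolation step can be illustrated with a deliberately crude sketch. The text does not describe Strasma’s actual algorithms, so the method here is a stand-in: group the voters who have been hard-ID’d by a shared attribute profile, measure the support rate in each group, and assign that rate to uncontacted voters with the same profile. The function names, attribute labels, and sample records are all hypothetical.

```python
from collections import defaultdict


def fit_support_rates(hard_ids):
    """Return the observed support rate for each attribute profile.

    hard_ids: iterable of (profile, supports) pairs, where profile is a
    tuple of voter-file attributes and supports is True or False.
    """
    tallies = defaultdict(lambda: [0, 0])  # profile -> [supporters, contacted]
    for profile, supports in hard_ids:
        tallies[profile][0] += int(supports)
        tallies[profile][1] += 1
    return {p: yes / total for p, (yes, total) in tallies.items()}


def virtual_id(profile, rates, fallback):
    """Score an uncontacted voter by the rate among look-alike contacted voters."""
    return rates.get(profile, fallback)


# Made-up hard IDs: (age band, area type) -> did the voter say they support us?
hard_ids = [
    (("under_30", "urban"), True),
    (("under_30", "urban"), True),
    (("under_30", "urban"), False),
    (("over_65", "rural"), False),
]
rates = fit_support_rates(hard_ids)
# Assumption: profiles never seen among hard IDs fall back to the overall rate.
overall = sum(supports for _, supports in hard_ids) / len(hard_ids)

print(virtual_id(("under_30", "urban"), rates, overall))
print(virtual_id(("30_to_64", "suburban"), rates, overall))
```

A real model would presumably weigh many variables at once rather than require an exact profile match, but the principle is the same: known answers from a small set of voters become predictions for everyone who resembles them.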
Strategic Telemetry’s first client was John Kerry, and its initial project was to develop a computer model that could virtual-ID participants in the Iowa caucuses. The differences between caucusing and voting were beyond semantics, and a unique information culture had developed around the distinctions. Campaigns approached the caucuses by developing a two-tier system for counting backers: there was the usual system of hard IDs, in which canvassers ranked voters from 1 to 5, the spectrum from a strong commitment to support Kerry to an equally strong commitment to one of his rivals. On top of that, Iowa caucus campaigns had a robust tradition of asking caucus-goers to sign supporter cards vowing their commitment, which were typically treated by both sides as an inviolable pledge. “If they answer on the phone ‘I’m supporting Kerry,’ that tells us that at the moment they felt they’re supporting Kerry,” says Strasma, who had first worked in Iowa in 1988, entering Dukakis supporter cards onto computers. “If they sign the card, they’ve actually done something for us.” But voting was not that simple: delegates were awarded proportionally from each of the state’s 1,784 precincts, and a candidate had to receive 15 percent at an individual caucus site to emerge from it with any. If a candidate failed to meet that threshold, he was effectively eliminated at that site. His supporters disbanded and were free to walk to another candidate’s corner. This complex system meant that campaigns had to build statistical models specifically for the Iowa caucuses. Strasma conducted a brief ten-thousand-person survey, a far bigger statewide sample than typically used by caucus candidates, asking voters how likely they were to turn out the following January and whom they supported.
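The 15 percent rule can be made concrete with a short sketch. It assumes the threshold is applied to the head count in the room and rounded up to a whole person; the candidate labels and tallies are invented for illustration, and the function names are hypothetical.

```python
VIABILITY_PERCENT = 15  # the threshold described above


def viability_threshold(attendees, percent=VIABILITY_PERCENT):
    """Smallest head count that reaches `percent` of the room (rounded up).

    Integer arithmetic avoids floating-point surprises at the boundary.
    """
    return (attendees * percent + 99) // 100


def split_by_viability(first_round):
    """Split first-round tallies into viable groups and orphaned supporters."""
    needed = viability_threshold(sum(first_round.values()))
    viable = {c: n for c, n in first_round.items() if n >= needed}
    orphaned = {c: n for c, n in first_round.items() if n < needed}
    return viable, orphaned


# Invented first-round count in a 100-person precinct.
viable, orphaned = split_by_viability({"A": 34, "B": 30, "C": 24, "D": 12})
print("viable:", viable)
print("free to realign:", orphaned)
```

In this invented room, candidate D falls short of the 15 people required, so D’s twelve supporters become available to the other campaigns in the next round.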
The voter files that the Iowa Democratic Party sells to candidates are rich with historical information, including allegiances like membership in the party’s rural and gay-lesbian caucuses and past hard IDs that the candidates are required to return to the party after each year’s voting. In addition, Strasma collected some of his own data, such as a list of those who had applied for a tax benefit that Iowa extends only to military veterans. Instead of a two-sided prediction, Strasma had to develop multidimensional scores that would predict an individual’s likelihood of supporting each of the top candidates, including Howard Dean, Dick Gephardt, and John Edwards—to calculate precincts where certain candidates, including Kerry, would fail to meet the 15 percent threshold, or where rivals would do so, allowing Kerry to claim their orphaned supporters before the second round of voting. “The Iowa caucus was the best possible petri dish for this stuff,” says Strasma.
He tuned his system to serve as a fulcrum between the campaign’s data processing and its field organization. Every evening by midnight, Kerry’s field staff was required to input the results of that day’s canvass, including supporter cards, into a computer system. Strasma would wake at 4 a.m. to see what his algorithms had done to the numbers. Based on the new predictions, Strasma would update vote totals for every precinct in Iowa, which had been established to keep Kerry on pace to meet the statewide delegate goals necessary for victory. By 8 a.m., that report would be sent to Kerry’s Iowa caucus director, Jonathan Epstein, so he could adjust resources daily to make sure they were being committed to precincts where Kerry stood to gain, or protect, the most delegates. Epstein would hassle Strasma if he was fifteen minutes late in delivering what the field staff referred to as their “crack sheet.”
Those spreadsheets, based on real and virtual IDs, gave Kerry’s campaign hope through the autumn of 2003, when Dean’s rise in the polls appeared to eclipse Kerry’s standing as front-runner in Iowa and nationally. “At the time, with my friends and family—when I told them I was working for Kerry, they would act like my dog had died or something,” Strasma recalls. But he was always far more confident about Kerry’s standing. Kerry was hitting his targets, while Dean’s support seemed to be slowly buckling in ways that polls, and his own strategists, were unable to pick up. First off, Dean backers seemed to be densely packed into precincts, especially those around college campuses, where they could overwhelm rooms on caucus nights but fail to materialize any extra delegates from it. More surprising to Strasma, Dean canvassers did not appear to be going back to people once they had been ID’d as supporters, banking them as a vote even as the race’s dynamics changed. The Kerry campaign watched this phenomenon among its own supporters: they called it a “flake rate,” which Strasma quantified and monitored closely. He measured the speed at which voters peeled away from Kerry to support another candidate, and again as onetime supporters of other candidates switched over to Kerry. But Dean’s campaign, blinded by encouraging polls and press coverage claiming their candidate’s certain victory, didn’t seem to notice until too late.
The Dean experience hung over Obama’s Chicago high-rise headquarters and every one of the three dozen field offices it eventually opened in Iowa. Shortly after Obama had announced his candidacy, Larry Grisolano and Pete Giangreco had gathered in Chicago to sketch early vote goals for Iowa. Neither man’s portfolio gave him specific responsibility for turning out caucus-goers—Grisolano was Obama’s paid-media and opinion research director, and Giangreco his lead mail consultant—but the campaign was still hiring its field staff and the two men had both worked in the Iowa caucuses since the 1980s and were familiar with its peculiar practices. They knew that Obama, competing with Hillary Clinton and John Edwards, would have a tough time cracking the insular pool of reliable caucus-goers. If turnout was around 125,000, as it had been in 2004, Obama would have no shot of breaking through. Total turnout, Grisolano and Giangreco concluded, would have to reach 180,000, far more Democrats than had ever before participated, a particular challenge in a year when Republicans had their own wide-open primary fight. (Iowans could participate in either party’s caucus, regardless of their registration before caucus day.) Obama would succeed only if he could enlist tens of thousands of new caucus-goers, many of them young people traditionally underrepresented there. This was sensible as a strategy, but it invited a comparison that bedeviled Obama’s advisers. They knew that after the 2004 outcome, the world was ready to dismiss the arrival of another antiwar candidate pledging to deliver a caucus coalition of liberal activists and young, first-time caucus-goers. “In the political community, it was ‘Obama’s got all this buzz but it’s just like Dean.’ We were hearing that and, to a certain extent, it was true. Our challenge wasn’t how to pretend it wasn’t true. It was how to turn that into an advantage,” says Strasma. “We had almost the perfect blueprint. What we needed to do was avoid the pitfalls that had befallen Dean.”
The supporter cards that Wagner was processing in Des Moines were feeding into the computers at Strategic Telemetry’s Capitol Hill office. Those commitments, along with some traditional polling, had already helped to refine Obama’s back-of-the-envelope vote goals in Iowa. But the real power of Strasma’s black box, like all microtargeting models, was extrapolatory: the names of those who had signed supporter cards went in, and out came the names of other Iowans who looked like them. These algorithms were matched to 800 consumer variables and the results of a survey of 10,000 Iowans. Going a step further than the Kerry campaign, Strasma wanted to create a model that would help Obama’s advisers decide which topics they should use when communicating with its targets. Strasma’s polls asked voters for opinions on eight issues, and separately, asked for their top two concerns. Obama’s pollsters had realized that if they called likely Iowa caucus-goers in the summer of 2007 and asked what issue was most important to them, nearly everyone would say Iraq. When they asked for the top two, it opened a Pandora’s box of progressive worry: the environment, health care, civil liberties.
But Strasma was also looking for people who weren’t on the Democratic rolls, or even yet voters. Iowa residents who would turn eighteen by November’s election day were allowed to participate in caucuses, but no campaigns had ever gone after the population of eligible seventeen-year-olds, in part because no one knew who they were—since they weren’t registered and had no political history, they didn’t show up in the state voter file. The Obama campaign, desperate to reach its 180,000 target, created a “BarackStars” program to contact Iowa high school students, and it was Strasma’s job to help field organizers find them. “I had never before been involved in a campaign where that was such a rich vein to mine,” he says. Strasma acquired lists of high school seniors who had taken the ACT college admissions test, names typically marketed to college admissions officers seeking to mail potential applicants. Separately, the campaign had student supporters gather school directories, but—fearing that it would look creepy if it had adult phone banks calling high schoolers—created a system for young backers to call their peers. “A massive program targeting seventeen-year-olds is fraught with peril,” says Strasma. His models treated most seventeen-year-olds in the correct birthday range as strong Obama targets, except for those whose mother and father were both Republicans. “We assumed kids wouldn’t necessarily go against both their parents,” says Giangreco. As the BarackStars initiative progressed, Obama’s Iowa team worked to avoid a mistake Dean had made with college-age supporters. When Strasma’s scores identified Iowa college students as targets, a field staffer would call to convince them that it was more useful for them to caucus at their home address than at their school, to disperse them in precincts across the state and not just pack them into those surrounding campuses. Strasma suggested this would also have a beneficial plan-making effect: talking aloud about the details of where they would vote was likely to increase their chances of following through on it.
Strasma’s craft was in writing algorithms, and his currency was the scores that emerged from them to predict individual behavior. Each score, calculated out of 100, reflected the percentage likelihood that a person would perform a certain act. For every Iowan in Strasma’s database, including Republicans and unregistered seventeen-year-olds, Strasma produced two scores measuring the basic questions every campaign had when it looked out over the electorate. What were the odds someone would vote? And whom was he or she likely to back?
The first of these was known as a turnout score, calculating as a percentage the likelihood someone would participate in the Democratic caucus. The second was called the Obama support score, which indicated the probability that he or she would support Obama, even if it was unlikely he or she would show up in the first place. (For this reason, Strasma saw a lot of Republicans with turnout scores close to zero and Obama support scores near 100: the algorithms determined that they were very unlikely to attend the Democratic caucuses, but if they did so would almost certainly end up in Obama’s corner.) Strasma also generated individual support scores for Obama’s opponents, which allowed field organizers to change their tactics for each precinct. These scores all ended up on the voter file, so field organizers putting together local walk lists or call sheets could just call up names within certain score ranges for persuasion or GOTV contact. Strasma liked to analogize the scores to precinct averages. If 100 people with 75 percent Obama support scores went to the polls, he would get 75 votes. Among a group of 100 people with 40 percent turnout scores, there would be 40 voters.
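Strasma’s analogy amounts to treating each score as a probability and summing across a group. The sketch below works through the two examples from the text and adds a range filter of the kind a field organizer might use to pull a list. The class and function names are hypothetical, and multiplying the two scores together assumes turnout and support are independent for each voter, which the text does not claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredVoter:
    voter_id: str
    turnout_score: int  # 0-100: percent chance of attending the caucus
    support_score: int  # 0-100: percent chance of backing the candidate


def expected_attendees(voters):
    """Expected number of people in the group who show up."""
    return sum(v.turnout_score for v in voters) / 100


def expected_supporters_if_all_attend(voters):
    """Expected votes if every person in the group showed up."""
    return sum(v.support_score for v in voters) / 100


def expected_votes(voters):
    """Expected votes, assuming turnout and support are independent."""
    return sum(v.turnout_score * v.support_score for v in voters) / 10_000


def in_score_range(voters, field, low, high):
    """Pull the voters whose named score falls within [low, high]."""
    return [v for v in voters if low <= getattr(v, field) <= high]


# 100 hypothetical voters with 40 percent turnout and 75 percent support scores.
group = [ScoredVoter(f"hypothetical-{i}", 40, 75) for i in range(100)]
print(expected_attendees(group))
print(expected_supporters_if_all_attend(group))
print(expected_votes(group))
```

The first two figures reproduce the 40 voters and 75 votes in Strasma’s analogy; the third combines them under the independence assumption.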
Because Strasma had generated predictive scores for every voter, and not only those whom the campaign had directly identified or solicited for supporter cards, Obama’s team had remarkably good intelligence on where the opposition’s support was located and could plug it into their turnout projections. With that information, Obama strategists knew where it made most sense to call Hillary Clinton backers—in the hopes that converting a small number of them to Obama’s side could keep Clinton under a threshold that would grant her campaign an additional delegate—or where John Edwards was uncertain to prove viable and his supporters could be persuaded to consider Obama as a second choice. While Obama’s voter file did not initially include support scores for Ohio congressman Dennis Kucinich, a liberal gadfly unlikely to contend for delegates, Strasma added one after seeing in polls that Kucinich’s strongly antiwar supporters would almost unanimously default to Obama when they had to make a second choice. Based on that information, the state director, Paul Tewes, lobbied Kucinich to issue a statement endorsing Obama as a fallback, which Obama’s campaign was then able to get to everyone whom the scores predicted to be a likely Kucinich supporter.
The arrangement with Kucinich, and a similar deal with New Mexico governor Bill Richardson’s campaign to swap second-round backing in cases where one of the two failed to be viable, represented an odd moment for Obama, who often renounced transactional politics as old-guard tactics. His campaign had a conflicted relationship with the things one has to do to win elections. Obama, the former community organizer, believed in a certain purity of grassroots politics, equal at least to the contempt with which he dismissed opponents’ political activity as craven or the media’s interest in the contest as superficial. At the same time, Obama and his spokespeople bragged incessantly about the campaign’s mechanistic accomplishments: how many volunteers they had enlisted, dollars raised, text messages sent. It is little coincidence that these were all numbers. Early on, campaign manager David Plouffe had insisted that the campaign try to measure everything it did as a method of gauging its effectiveness, and that sense of data-intensive empirical rigor quickly moved into all corners of a campaign that would become the largest in history. “We had a lot of money, but we were incredibly efficient with our spending, and that came from Plouffe every single day,” says Link. “He just didn’t like spending money. For a guy who spent a lot of money in the election, it killed him to authorize a check.”
But as money continued to come in, Obama’s campaign was able to create increasingly specialized roles around data and technology. Wagner’s job as the one-man IT team for the Des Moines headquarters disappeared, its component roles spun off; after a month of processing piles of supporter cards, Wagner’s data entry duties fell to someone else, as did the need to fix the office computers. His schedule freed up, Wagner committed himself to building a software program that could guide tactics for Obama representatives at each caucus location. He went about it by rewriting each of the unusual rules and protocols of the Iowa nominating process—the viability thresholds, the multiple rounds of voting and subsequent realignments, the proportional allocation of delegates—as a series of interlocking game-theory problems. When did it make sense to release some of Obama’s supporters to a rival to keep him in play for another round? Or how many of a no-longer-viable candidate’s supporters would Obama need to pick up to qualify for an extra delegate? The program Wagner wrote, the Caucus Math Tool, was loaded onto laptops that campaign representatives could bring to their precincts. Its straightforward interface required only entering each candidate’s tallies after every round of voting and would deliver practical instructions on how to adjust for the next one. (The broad objective of every move was to block Hillary Clinton from accumulating delegates, regardless of who won them instead.)
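One of those questions, how many more supporters are needed for an extra delegate, can be sketched in a few lines. This is not the Caucus Math Tool itself, whose internals the text does not describe. It assumes a precinct’s delegates are apportioned in proportion to each group’s share of the room and rounded to the nearest whole number, with halves rounding up, and it ignores tie-breaking and the adjustments needed when rounded totals do not match the precinct’s allotment. All names and numbers are hypothetical.

```python
def delegates_for(supporters, attendees, precinct_delegates):
    """Proportional share of the precinct's delegates, rounded half up.

    Integer arithmetic: floor(supporters * delegates / attendees + 1/2).
    """
    return (2 * supporters * precinct_delegates + attendees) // (2 * attendees)


def extra_supporters_needed(supporters, attendees, precinct_delegates):
    """Fewest supporters to gain from other groups to earn one more delegate.

    The room size stays fixed because people only move between groups.
    Returns None if no gain is possible.
    """
    current = delegates_for(supporters, attendees, precinct_delegates)
    for gained in range(1, attendees - supporters + 1):
        if delegates_for(supporters + gained, attendees, precinct_delegates) > current:
            return gained
    return None


# Invented precinct: 100 people in the room, 6 delegates at stake, 38 in our corner.
print(delegates_for(38, 100, 6))
print(extra_supporters_needed(38, 100, 6))
```

In this invented precinct, 38 supporters are worth two delegates, and picking up four more from a group that failed to reach viability would be enough for a third.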
At headquarters, campaign officials knew well before results from the first round of votes were typed into Caucus Math how things would go. As voters arrived at precincts across the state for the six-thirty caucus start time, Obama volunteers would call a hotline to report the number who had checked in. The numbers on Wagner’s computer were rising faster than anyone had anticipated. By the time local supporters wrapped up their speeches, it was clear to Obama’s strategists that they had easily met their goal of delivering 180,000 Iowans at the caucuses. In fact, the final number ended up being 239,000. News reports relied on exit polls to describe who they were: half were participating in a caucus for the first time, and the share of voters under the age of twenty-five had tripled from Kerry’s win four years earlier. Very late that night, Wagner went to sleep, barely budged from his bed for two days, and then drove straight east to Chicago.
AS SOON AS he arrived back at Obama’s national headquarters, Wagner was pulled aside by Jon Carson, who had been charged with handling the campaign’s preparations for February 5. Twenty-four states would vote that day across four time zones, a combination of simultaneous primaries and caucuses heretofore unparalleled in its scope and influence on the nominating contest. Obama had begun paying attention to the delegate count well before Clinton, and starting in the summer of 2007 Chicago had become attuned to the varied and byzantine ways that delegates were awarded around the country. While Clinton appeared to be ignoring those states she was unlikely to win outright, especially those with caucus systems, Carson had developed distinct tactics for each state with the goal of maximizing his delegate haul nationwide. Some campaign departments were well prepared for this shift. Obama had begun in early 2007 to develop a robust volunteer and fund-raising network in all fifty states to exploit his popularity among Democratic activists. But Strasma’s microtargeting work during that period was focused almost exclusively on Iowa, with little effort to formalize efforts elsewhere. As a result, the data team’s early delegate projections were based on simple demographics: the numbers of African-Americans and Democratic voters under thirty in each district. Analysts tallied them on a whiteboard as “Group A” and “Group B,” to keep the simple classifications opaque to reporters tramping through headquarters hoping to pick up hints of campaign strategy.
Obama’s data operation was forced to grow quickly to meet the circumstances of a national race. In late 2007, Strasma’s scores showed up for the first time on the state voter files maintained in Obama’s New Hampshire, South Carolina, and Nevada field offices. In Nevada, the VAN was the province of Ethan Roeder, a graduate of the School of the Art Institute of Chicago who had learned how to use databases while preparing donor reports for Lambda Legal, a gay rights advocacy group. When he won a job with the Obama campaign, that experience was enough to earn him an assignment to Las Vegas as the state voter-file manager. Until Strasma placed microtargeting calls into his state, Roeder had assembled target universes without help from modeling scores. He would look at polls, find population segments that were inclined to support Obama—mostly young whites and African-Americans of all ages—and use geography and basic demographic categories on the voter file to compile walk lists and call sheets for the state’s growing crew of field organizers. When Strasma’s support and turnout scores showed up in the VAN, Roeder saw how much more finely the electorate could be broken down. “We were dealing with chunks that big, and he comes to us with these small slices,” says Roeder.
In campaign offices around the country, a generation of political data experts was being born, forced by exigency to learn how to manipulate voter files and invited by a decentralized campaign to improvise with them. Jim Pugh, who worked in Chicago on online analytics, was still finishing his dissertation on robotics at the École Polytechnique Fédérale de Lausanne, in Switzerland. Matt Lackey had worked on nuclear reactor design for Westinghouse before starting his own comic book business; he ended up managing the Indiana voter file. John Bellows was a graduate student in econometrics at Berkeley and yet another campaign neophyte before he walked into a California field office during the primary season. Because Obama was not aggressively contesting California, Strasma had never built a model for the state. Instead, the campaign decided to open up access to the VAN to anyone who registered as a precinct captain. Bellows used the opening to assemble his own statistical models to identify likely Obama supporters, but the campaign’s state director, Buffy Wicks, was at a loss for what to do with them. She put Bellows in touch with Wagner. “I don’t know what you two nerds are talking about,” she said.
Strasma’s national projections were sharpened after Iowa, when he ordered his first calls into the February 5 states. In many of them, self-organized volunteer cells had already been canvassing voters for months. Strasma took their hard IDs and the new results of his large-sample polls and fed them into the algorithms to develop state-specific scores and assign them to voters. Carson built a small February 5 team to work through all the new information coming in from around the country and recruited Wagner to be its targeting analyst. In the month before February 5, Wagner would arrive at headquarters at seven each night for a twelve-hour shift, processing the numbers that came out of Strasma’s computers overnight and assembling them into a daily report that could help Carson move resources among the states. No longer would the campaign rely solely on an outside consultant for an interpretation of the microtargeting models he had developed. “The analyst has to be inside, so that the campaign manager can look to his right and say ‘What’s going on?’ ” says Wagner. “He has to be able to answer two questions: Are we going to win or are we going to lose? And what the hell are we going to do about it?”
The answers to those questions shifted regularly, as Strasma’s support scores recalibrated to account for a changing race. Previous campaigns that had used microtargeting usually ran their numbers a single time. They conducted the polls, gathered individual-level data, performed the analysis, and gave everyone a score or a segment that stuck with their voter-file record for the whole campaign. It was treated as an inflexible personal attribute, like gender. There were perils in doing a onetime microtargeting project, however. When Bush’s team began its large-sample surveys in late 2003, a year before the election, many of the “anger points” questions had been drafted to prepare for an anticipated contest against Dean, who at the time led Kerry considerably in polls. One tested attitudes toward recitation of the Pledge of Allegiance in schools, the subject of a case just accepted by the Supreme Court that Republicans thought could offer a successful wedge issue against Dean.
Primaries presented an even bigger need for fresh microtargeting information. Unlike in a general election, a voter’s partisanship was rarely predictive of how he or she would vote, because most of those who cast ballots in a primary were members of the same party. In those cases where it was predictive—in his presidential campaigns John McCain repeatedly did better with non-Republicans who voted in Republican primaries—it was not always intuitive. Because the major fault line of party didn’t exist, lots of little ones (demographic, ideological, geographic, issue-based) emerged, intersecting in complex geometries that required nimble analysis of many variables. More than ever, the basic poll subsamples that could handle only three or four overlapping voter characteristics at once were insufficient. Microtargeting solved that problem, but a single-shot approach couldn’t keep up with the fluidity of primary-season opinion—voters seemed quicker to change their minds when parties weren’t hemming them in. It was possible to have a very good microtargeting algorithm that gave the wrong answer to the most important question of all, just because the which-types-of-people-support-whom data went stale before it could be used.
Indeed, as they talked with voters around the country, Obama’s field operatives lost faith in the turnout scores that Strasma’s algorithms were sending them. The two major variables that went into them were vote history and the self-reported answer when survey callers asked how likely someone was to vote. Strasma liked to think of the first as measuring if someone was a regular voter, and the second as a gauge of their current enthusiasm. He knew that people always overestimated their chances of voting when asked by a stranger, but now vote history was becoming increasingly problematic for planning turnout. Until Barack Obama won Iowa there had never before been a viable African-American candidate for president, and Strasma’s algorithms struggled to measure this historic change. Predictive models were built on established behaviors, and Obama’s campaign was now operating in a space without precedent.
The support scores, fed with new IDs from paid call centers and volunteer contacts, moved around in a way that accurately reflected the volatility of the race. They showed that the spine of the Iowa microtargeting model remained intact: voters were unlikely to change their views on issues or their election-year priorities, and the attributes that made someone a likely Obama caucus backer traveled well. Highly educated and upscale voters pulled high Obama-support scores around the country, as did African-Americans and the young. But it was evident from any newspaper poll that voters were moving around among candidates. Edwards was fading from contention, allowing Clinton to solidify her position as the candidate of older, rural white voters.
To the outside world, Obama’s campaign had become notable for its digital prowess, which was given much of the credit for his ability to outmaneuver Clinton’s establishment support. But in Chicago, the long march through what turned out to be fifty-six primaries and caucuses exposed major weaknesses in a data infrastructure that Carson likened to a Rube Goldberg apparatus. The campaign was gathering an unprecedented volume of information that individual Americans volunteered about themselves. There were the thirteen million people who would sign up for updates from the campaign website, plus those who gave their phone numbers to receive a text-message alert announcing Obama’s vice presidential selection. There were the three million donors, many of them contributing small amounts but supplying the campaign with personal data required by federal campaign finance law. Then there were the regular streams of those who put their names on a sign-in sheet at a field office or a clipboard outside a campaign rally.
But the campaign was unable to bring these rivers of data together. Chris Hughes, a founding executive of Facebook who became Obama’s director of online organizing, battled often with Blue State Digital, the consulting firm that had grown out of Dean’s campaign and built Obama’s website. The parts of the online operation facing outward performed brilliantly, engaging supporters in volunteer activity and raising money from them. But the Blue State site was incapable of linking to databases used by other parts of the campaign, leaving different types of personal records—financial contributions, online contacts, field IDs—isolated in their own silos. The campaign didn’t know whether a Marjorie Jackson who gave one hundred dollars to attend a concert fund-raiser was the same one who put in two volunteer canvassing shifts the previous weekend—and if she was a married white Republican living in the suburbs, Strasma’s algorithms may have predicted her to be a likely McCain voter. If she had written in an online sign-up form that she cared about the environment or a woman’s right to choose, that information was unlikely to find its way to someone planning a direct-mail piece addressing one of those issues.
As the primaries wound down, Plouffe decided to reorganize the campaign’s data and targeting operations entirely for the general election. He assigned chief of staff Jim Messina to enter surreptitious negotiations with Catalist on a deal to use its voter files rather than rely on those built by the Democratic National Committee, despite the fact that the company was distrusted by many Obama supporters because it had close ties to Hillary Clinton’s world and had supplied her campaign with its data. At the same time, Carson convened a group tasked with designing a new targeting structure and invited Wagner and Roeder to be part of it.
Like many junior staffers pushed into roles as targeting analysts, they had chafed throughout the primaries at their inability to know exactly how Strasma made the projections behind his scores. There were active debates about different statistical techniques for finding relationships between variables, and Strasma’s black box relied on a complex formula that combined several of them. But none of the campaign staffers who spent their days processing data for Strasma’s algorithms was able to see how they actually worked. It was the way business traditionally worked for an outside consultant or vendor like Strasma, who felt he had a unique skill to protect: the campaign ordered a service from him, and he delivered a final product. No one had ever expected a media consultant to publish the f-number to which he set his cameras so that campaign staffers could try to reverse-engineer the distinctive look of his ads. Why should Strasma be expected to explain his algorithms? Yet for the Obama staffers who came from academic backgrounds, where researchers published all their formulae for wider review—or merely tech geeks from a generation with an open-source attitude toward collaborative software development—the opacity of Strasma’s shop was particularly frustrating.
Carson believed that the process Obama’s campaign had used during the primaries, with Strasma’s firm producing models as they became necessary and analysts like Wagner scrambling to interpret them, didn’t make sense for the general election. Instead he recommended to Plouffe that the campaign bring its data and targeting operation in-house. Strasma would maintain a consultant’s role overseeing it all, and his computers on Capitol Hill would continue to churn out the scores. But data would now be treated as a core internal function with day-to-day needs, like communications and research, and not a boutique service that had to be done by a specialist off-premises. “It was the beginning of the move away from the Wizard of Oz model of microtargeting,” says Michael Simon, who had worked with Strasma on Kerry’s campaign and joined Obama’s.
That shift was soon reflected in the campaign’s feng shui. Along an east-facing wall at headquarters, staffers had moved furniture to create a daisy chain of six pods that would be called Team 270. There was one pod per region—the Great Lakes, Midwest, Southwest, West, South, and Northeast—each with seven specialists, covering such basic campaign functions as press and scheduling visits by surrogate campaigners. Regional desks were nothing new in campaigns, but the presence of a specialist each for data and targeting was; in the states, another set of officials would also do nothing else. Each state headquarters would have a data manager, who was charged with managing the voter file and augmenting it with unique sources of information by acquiring lists of veterans collected by local governments or persuading state parties to share their IDs from past campaigns. A separate targeting director would help translate the models for those on the campaign who used the scores for voter contact.
The campaign’s obsession with documenting metrics meant that it was generating far more potentially useful data than any of the pods would have time to sift through. Simon, who became Team 270’s lead targeting analyst, enlisted a group of Democratic consultants who weren’t formally affiliated with the campaign as what he called the targeting desk’s “kitchen cabinet,” a panel able to take on discrete research questions beyond the purview of his department’s daily operations.
Simon introduced the kitchen cabinet to a largely secret stockpile of data known within the Obama campaign as the Matrix. It was a centralized repository that would gather every instance of the campaign “touching” a voter, as field operatives like to put it, including each piece of mail, doorstep visit, and phone call, whether from a volunteer or a paid phone bank. It had its origins in the Iowa Contact Matrix, which compiled each contact that came in through the VAN so field staffers could track their activity. When, two months before the caucuses, Strasma noticed that one woman had been reached 103 times and remained undecided, he considered asking Obama to call her personally.
For the general election, the Matrix was expanded to include non-targeted communication to which an individual was exposed, including broadcast and cable ads and candidate visits to their media market. This created a ready-made data set that could be used to answer questions that campaigns rarely tried to ask, let alone answer, with any methodical rigor. (Obama’s campaign never seriously undertook randomized field experiments for individual voter contact. There had been talk, shortly after Obama won the primaries, about sending rounds of general-election persuasion mail and measuring their effects through polls, much as the AFL had done four years before, but campaign officials felt they were too cramped for time to properly build such a system.)
At one point, Simon invited Aaron Strauss, who worked for pollster Mark Mellman’s firm, to see if he could identify whether Obama’s travel schedule and television buys were moving voters’ support scores. “The effects of any campaign activity are ephemeral,” Strauss says of his findings. “Just because you touched someone in, let’s say, the beginning of October doesn’t mean they’ll be with you two weeks or even one week later.” Strauss’s project never changed the campaign’s tactics for the sake of research—Obama and his ads still went where Plouffe and his strategists thought they would be most electorally effective—but the massive volume of existing data being automatically collected by Chicago’s computers opened up rich new possibilities to measure the campaign’s impact. “The presidential campaign was just nothing like anything else that ever exists,” says Judith Freeman, Obama’s new-media field director. “You could try out things with all the data—it was totally just scale.”
ON THE NIGHT OF JUNE 3, Barack Obama strode to a podium at the Xcel Energy Center in St. Paul, Minnesota, unsubtly chosen as the place where Republicans would formally nominate John McCain exactly three months later. McCain had already been campaigning against Obama for three months, having vanquished the last of his primary opponents in March. Obama had continued to battle Hillary Clinton during that time, straight through the final primaries in Montana and South Dakota, where polls had just closed as Obama prepared to speak in Minnesota. The race had been decided on the accumulation of delegates, a reality to which Clinton’s team had not adjusted until it was too late. “Because of you,” Obama said, “tonight I can stand here and say that I will be the Democratic nominee for the president of the United States of America.”
In Portland, Oregon, sixty Obama staffers and volunteers watched the speech on MSNBC, making their way through cases of Pabst Blue Ribbon. Obama’s campaign had cleared out of its state headquarters in a former Wild Oats natural foods market after his resounding victory in the Oregon primary two weeks earlier, but the workspace still had its computers plugged in, so it became host to a “data camp.” The assembled group represented, by traditional campaign standards, a gang of misfits. The only prerequisite for winning an invitation to Portland—and a subsequent job in the new regime—was being “someone who understood field, and the usefulness of data,” as Simon put it. By that measure, many arrived in Portland well prepared, having bounced around among primaries in as many as a half-dozen states, each time refining their procedures and trying new things. “They had been working on the most sophisticated campaign in history for a year,” says Mike Moffo, the Nevada caucus field director. “That’s real experience even if it’s your first campaign.”
Over the two-week-long data camp, Strasma and Simon led basic training that gave many in the room their only formal introduction to the world of voter files and models. Simon took them through a voter file, projecting a list of individual records onto the wall as though a simple spreadsheet: in the leftmost column, there was the VAN ID, the unique code that a voter unknowingly ported through life. Then columns rained down from left to right, each an individual attribute: an address, gender, age, race, party registration, vote history. Simon would go through a row, using demographics to sketch a person in profile: the young Hispanic woman or the middle-age white man with a Rural Free Delivery address.
“What is this voter?” Simon asked. “Obama or McCain?”
Most of the time it was comically obvious, something anyone with a basic understanding of American politics would intuit. “The human eye is pretty good at figuring it out,” says Simon.
Simon would let another column fill in, the result of hard IDs gleaned by volunteers that showed voters’ stated preference between the two candidates. But for the 100 voters in the spreadsheet, the campaign had hard IDs for only 10. The task of the modeler, Simon explained, was to direct computers to solve for X, assigning a candidate preference into each of the 90 empty cells based on the 10 that were full. The algorithms that would make those calculations could simultaneously pull in thousands of variables to test for the weight each of them should have in the equation. Strasma had built the core infrastructure with thousands of individual- and precinct-level variables already attached permanently to each record on the voter file. New information that would come in from paid phone IDs could help hone the predictions to reflect enthusiasm levels and preferences specific to the 2008 race.
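The fill-in-the-blanks exercise Simon described can be sketched in a few lines of code. What follows is only an illustration, assuming a plain logistic regression fit by gradient descent on two invented attributes (a scaled age and a party-registration flag). Strasma's real formula was never shown to staffers and drew on thousands of variables, so every name and number below is hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, steps=2000, lr=0.1):
    """Fit weights (bias first) by batch gradient descent on log loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y
            grads[0] += err
            for j, v in enumerate(x):
                grads[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grads)]
    return weights

def support_score(weights, x):
    """Modeled probability (0 to 1) that a voter backs Obama."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

# Hypothetical hard IDs: (age scaled to 0-1, registered-Democrat flag),
# labeled 1 for a stated Obama preference and 0 for McCain.
labeled_rows = [(0.2, 1), (0.3, 1), (0.25, 0), (0.4, 1), (0.35, 1),
                (0.8, 0), (0.7, 0), (0.9, 0), (0.75, 1), (0.85, 0)]
labeled_prefs = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

weights = fit_logistic(labeled_rows, labeled_prefs)

# Voters with empty preference cells get a score instead of a hard ID.
unlabeled = {"young_dem": (0.22, 1), "older_non_dem": (0.88, 0)}
scores = {name: support_score(weights, x) for name, x in unlabeled.items()}
```

The ten labeled rows stand in for the hard IDs, and the scores assigned to the unlabeled voters play the role of the empty cells being filled in as probabilities rather than certainties.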
Carson knew this would put a lot of pressure on the phone vendors who won Obama’s lucrative contracts to identify voters, and he worried about their quality. That corner of the political industry had earned a particular reputation for unscrupulous practices, relying on untrained and unenthusiastic call center personnel who were sloppy about how they recorded voters’ answers. Campaign staff had long suspected that operators sometimes just made up responses, especially when a voter hung up midway through a script and faking a few answers to the final questions meant not having to throw out the rest of a completed survey. Carson conceived a scheme that within Obama’s headquarters was likened to the reality show competition Survivor, and those outside it quickly understood why.
In the early summer, the campaign talked to ten of the party’s top phone vendors on a conference call and told them they were each getting 10 percent of the nominee’s business. Each firm had a different pricing scheme, but the vendors were not going to be judged on price or reputation, as was typically the case. Instead, they were told, they would be tested against one another in real time. Every Sunday night, the Obama campaign would provide each of them a list of VAN identification codes and attached phone numbers, along with a script for callers to use. By midday on Friday, the phone vendors would return a spreadsheet of VAN IDs with the voters’ answers to the script questions. Each week, the campaign would send out a report to all the vendors that showed how they and their competitors had fared under a “cost-per-opinion” metric that calculated what Obama was actually paying for each call successfully completed. Any firm that underperformed would see its share split among competitors.
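The weekly scorecard reduces to a simple ratio. Here is a minimal sketch, assuming cost-per-opinion means dollars billed divided by calls successfully completed. The vendor names and figures are invented, and the campaign's actual formula may have included other terms.

```python
def cost_per_opinion(billed_dollars, completed_calls):
    """Dollars paid per successfully completed call."""
    if completed_calls == 0:
        return float("inf")  # nothing usable came back for the money
    return billed_dollars / completed_calls

# Hypothetical weekly returns: vendor -> (dollars billed, completed calls)
weekly_returns = {
    "vendor_a": (4000.0, 2000),
    "vendor_b": (3000.0, 1000),
    "vendor_c": (4500.0, 3000),
}

# Cheapest cost-per-opinion first, as a report to all vendors might rank them.
report = sorted(
    ((name, cost_per_opinion(billed, done))
     for name, (billed, done) in weekly_returns.items()),
    key=lambda item: item[1],
)
```

Ranking on this figure rather than on the quoted price per dial rewards the firm that actually finishes scripts, which is the behavior the competition was meant to encourage.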
After a month, half of the phone vendors were gone. The decision had not been made entirely on relative costs-per-opinion, as they had been told it would be. Obama’s team had included a question at the end of each script, asking voters for their age, and often found that what came back from the vendors for that category didn’t match up with the date of birth on the voter file. None of the campaign officials who had privately accused phone vendors of fabricating data had ever thought to audit them in the midst of the campaign. There was no way, short of calling voters back and confirming their answers, to see if a call center had accurately tallied their candidate preference, but date of birth was an independently verifiable fact (and, since it came from government registration records, highly reliable). “It was something they couldn’t make up,” says Simon.
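The age check lends itself to a short illustration. The sketch below assumes each vendor returns pairs of VAN ID and reported age, and that the voter file stores a date of birth. The one-year tolerance, the IDs, and the dates are all hypothetical choices made for the example.

```python
from datetime import date

def age_on(birth_date, as_of):
    """Age in whole years on the date of the call."""
    years = as_of.year - birth_date.year
    if (as_of.month, as_of.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not yet come around this year
    return years

def mismatch_rate(returns, voter_file, as_of, tolerance=1):
    """Share of checkable calls whose reported age is off by more than `tolerance` years."""
    checked = 0
    mismatched = 0
    for van_id, reported_age in returns:
        birth_date = voter_file.get(van_id)
        if birth_date is None or reported_age is None:
            continue  # nothing to compare against
        checked += 1
        if abs(age_on(birth_date, as_of) - reported_age) > tolerance:
            mismatched += 1
    return mismatched / checked if checked else 0.0

# Hypothetical voter-file dates of birth keyed by VAN ID
voter_file = {101: date(1950, 3, 14), 102: date(1985, 11, 2), 103: date(1972, 7, 30)}
call_date = date(2008, 7, 1)

careful_vendor = [(101, 58), (102, 22), (103, 35)]
sloppy_vendor = [(101, 41), (102, 22), (103, 60)]

careful_rate = mismatch_rate(careful_vendor, voter_file, call_date)
sloppy_rate = mismatch_rate(sloppy_vendor, voter_file, call_date)
```

A vendor whose rate runs well above its competitors' is either recording answers carelessly or inventing them, and in both cases its candidate-preference data deserves the same suspicion.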
At the same time, two hundred headquarters staffers were instructed to devise a fictive alter ego that would be given a VAN record and placed, with the employee’s actual mobile number, on the lists given each week to the phone vendors. Luke Peterson, the database manager during the primaries, became “Joseph Ratzinger,” a pro-life hard-liner and political independent who was entirely undecided about whether to support Obama or McCain. Three or four times a day, Peterson’s phone would ring and he would answer as Ratzinger; his answers on abortion and his candidate preference would stay stable, but beyond that he was invited to improvise his responses and have fun with accents. Afterward, Peterson would go online and complete a short Web form rating the caller’s performance. Did he or she read the script accurately? Was the caller clear and polite?
Five vendors ended up winning a share of Obama’s business, and they already appreciated how much more it would demand of them than the typical campaign account. During the primaries, Strasma had treated these paid ID calls as roughly equal in value to the so-called field IDs that volunteers entered into the VAN; they all entered the algorithm interchangeably as indicators of voter support or enthusiasm. For the general election, the number of field IDs available for their calculations would only grow, as Obama’s new-media team was working to make it even easier for those willing to place phone calls to do it from their own homes. The new-media team built a calling tool into a prominent feature of the MyBarackObama website, which would automatically assign a volunteer to voters in the nearest battleground state and produce an appropriate script for them to read. They constantly tweaked the design to reduce the number of clicks necessary to actually dial a call, and enlisted a ten-thousand-member National Call Team of committed volunteers who eventually made three million calls through the interface. But over the course of the summer Obama’s analysts realized that their candidate was drawing more support among these contacts than ones reached by paid phone banks; people seemed wary of insulting a volunteer canvasser by announcing they supported another candidate, and often lied to say they were undecided instead. The targeting desk decided that the algorithms would have to weight the paid calls far more heavily than the field IDs.
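The effect of that decision can be shown with a toy calculation. This is a sketch under stated assumptions: the account does not say what weights the targeting desk used, so the three-to-one ratio below is invented, as are the contacts themselves.

```python
def weighted_support(ids, weights):
    """Weighted share of contacts reporting Obama support, by ID source."""
    total = 0.0
    obama = 0.0
    for source, preference in ids:
        w = weights[source]
        total += w
        if preference == "obama":
            obama += w
    return obama / total if total else None

# Hypothetical weights: a paid call counts three times as much as a field ID.
SOURCE_WEIGHTS = {"paid": 3.0, "field": 1.0}
EQUAL_WEIGHTS = {"paid": 1.0, "field": 1.0}

ids = [
    ("paid", "mccain"), ("paid", "obama"),
    ("field", "obama"), ("field", "obama"),
    ("field", "obama"), ("field", "mccain"),
]

unweighted = weighted_support(ids, EQUAL_WEIGHTS)   # field IDs dominate
reweighted = weighted_support(ids, SOURCE_WEIGHTS)  # paid calls pull the estimate down
```

In this made-up batch the volunteer contacts skew toward Obama, so leaning on the paid calls lowers the estimated support, which is the direction of correction the analysts were after.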
As Strasma explained in Portland, he had designed a system to turn virtual IDs into a continuous process, where individual probabilities moved in a way that accurately reflected that a person’s propensity for picking a certain candidate, or voting at all, was subject to near-constant flux. He imagined doing for microtargeting what tracking polls had for the once-static study of public opinion. Pioneered in the 1970s by Bob Teeter and Fred Steeper, those continuous small-sample polls relied on several hundred calls every single night, with each batch of new opinions rolling over one another like lapping waves, so that the older ones bubbled away as they were replaced. They lacked nuance—the calls focused primarily on candidate support—but they captured movement at a price that major campaigns and media organizations could afford.
Already in the primary season, Strasma had seen a hint of how the aggregation of individual microtargeting scores could offer a substitute for polls as a way of tracking opinion shifts. Instead of merely relying on a small sample of voters to say what they felt now, Strasma could use the algorithm to extrapolate how every voter on the file might be moving, then look for patterns in their movement. In early 2008, in states like Iowa that made it easy for non-Democrats to vote in the Democratic primaries, Republican support scores for Obama were always higher than those of his opponents. But in the run-up to the Ohio primary, those scores quickly flipped, and Clinton started pulling higher support scores among Republicans. It wasn’t tough to figure out why. On his radio show, Rush Limbaugh was promoting a plan he called “Operation Chaos,” to encourage Republicans to cast votes for Clinton as a way of fomenting further conflict within the opposition. Many Democrats were skeptical the stunt would have much impact, but Strasma’s changing scores confirmed that voters actually seemed to be following Limbaugh’s orders. After Ohio, Obama’s campaign all but abandoned its outreach to Republicans.
Strasma believed his algorithm could help Obama make similar strategic decisions in the general election, as well. Typically a campaign would plan to collect a massive batch of paid IDs in the summer so that they could be used to separate persuadable voters from get-out-the-vote targets with enough time to run an aggressive program making the case to the former. But Strasma pushed Plouffe to take the budget for those IDs and spread them out over the entirety of the campaign; he knew Obama’s field team would lose some of the precision that comes from having hard IDs on voters but they could make it up with more refined predictive models. Strasma proposed a two-tiered system of IDs that echoed the way the Census monitored population changes. Every week, the Obama campaign would hire call centers to do between 1,000 and 2,000 of what Strasma called long-form IDs per battleground state, which would be closer to a traditional poll, with questions about issues and campaign dynamics. At the same time, the campaign would be doing between 5,000 and 10,000 short-form IDs in each state, quick calls that through as few as two questions did little more than gauge a voter’s candidate preference and likelihood of voting. One-quarter of those would always be re-IDs, voters who had been previously contacted and were called again.
After the algorithms worked through the new round of weekly IDs, they would drop a new set of support and turnout scores on every voter’s record in the VAN, each of them represented as a percentage probability. After four weeks, Strasma was able to see which voters were moving between candidates. Eventually they had a large enough sample of those who changed from McCain to Obama, and vice versa, that the campaign was able to create a model of these voters they called “shifters.” It allowed the campaign to refine its category of “undecided,” a catch-all description that long frustrated political scientists and psychologists because it was applied equally to voters who hadn’t made up their minds, weren’t paying attention, were trying to weigh competing values, or were simply unwilling to share with a stranger what many considered a private matter. Someone who was undecided in June was probably a very different type of voter than one who was undecided in October. Using algorithms to find other undecided voters who looked like shifters (and determine which direction they were likely to go) would help the Obama campaign know which ones were worth targeting, and when to do so.
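The first step in building such a model is simply finding the shifters among voters who were called more than once. A minimal sketch, assuming each voter's call history is a list of week number and stated preference; the data layout and IDs are hypothetical, and the modeling that the campaign then ran on these voters is not shown.

```python
def find_shifters(id_history):
    """Map VAN ID -> (first, last) stated candidate for voters who switched sides."""
    shifters = {}
    for van_id, calls in id_history.items():
        # Order calls by week and keep only answers naming a candidate.
        decided = [pref for _, pref in sorted(calls) if pref in ("obama", "mccain")]
        if len(decided) >= 2 and decided[0] != decided[-1]:
            shifters[van_id] = (decided[0], decided[-1])
    return shifters

# Hypothetical re-ID history: VAN ID -> [(week, stated preference), ...]
id_history = {
    201: [(1, "mccain"), (5, "obama")],
    202: [(2, "obama"), (6, "obama")],
    203: [(3, "obama"), (4, "undecided"), (7, "mccain")],
    204: [(1, "undecided")],
}

shifters = find_shifters(id_history)
```

Once labeled this way, the shifters become a training set of their own: their attributes can be compared against the rest of the file to find undecided voters who resemble them.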
By the time of the Republican convention in early September, the Obama campaign was placing well over one hundred thousand paid ID calls a week nationwide, with all the data feeding into Strasma’s computers. When McCain picked Alaska governor Sarah Palin as his running mate, Obama’s strategists were befuddled: they thought the Republican had been gaining traction by highlighting Obama’s thin résumé, and he now seemed to be sacrificing that argument by putting forward his own neophyte. But one week after the Republican convention, Strasma saw the first sign that McCain’s move might be paying off when the first round of post-Palin IDs came back from phone banks. People were identifying themselves as pro-McCain at a higher rate than their scores suggested they should have been. Strasma bore down into the numbers and saw that the phenomenon was particularly strong among women. Campaign strategists worried that McCain and Palin, running as “two mavericks,” may have been proving themselves successful at seizing Obama’s themes of change and reform.
When the next round of IDs came in, two weeks after the Palin nomination, the IDs told a different story. The models had begun to integrate the increased levels of support for McCain’s ticket, but now the IDs were heading in the other direction, underperforming the scores, especially among Republican women. The modeling scores hadn’t caught up to what voters thought of Palin. The disconnect between the two suggested that Palin’s selection had offered little more than a temporary bump, as opposed to the permanent boost that McCain’s advisers had anticipated. “She ended up being a sugar high for them,” says Giangreco, “and she went away as quickly as she came.”
That eventually became conventional wisdom among media and the campaigns themselves, but Strasma saw it well ahead of the curve: his perfectly efficient loop of IDs cycling through the algorithm had proven itself a useful tool in the arsenal of measuring public opinion at a high velocity. “You would see things faster than the polling would come back,” says Freeman. Once the campaign had developed its modeling score for the action of shifting, it became possible to predict not only what views a voter had but also that voter’s susceptibility to changing them at a given point in the election year. Strasma believed that this predictive modeling gave Obama’s staff the tools of the fortune-teller. “We determined that, down the line, they were going to break for us,” he says. “We knew who these people were going to vote for before they decided.” Now the campaign had to make sure it knew how to reach them.