Experimenting in Public
“Y’all liars,” one mayor said from the back of the classroom, not lightly, as I could tell from the look on his face. And not under his breath, as I could tell from the reactions of the thirty-nine other mayors in the room.1
To be fair, I had possibly set the mayors up. The first time I asked the question (with a different audience), I had no idea how people would respond to it. Not students. Not citizens. Not elected officials. But this was now the ninth or tenth time, and I had an inkling how these mayors might answer.
I’d asked them to put themselves in another leader’s shoes: those of Mayor Miguel Mancera, in Mexico City in 2016. His government had tried to crowdsource a map of Mexico City’s tangled bus system from several thousand volunteer riders. They had mapped what was thought to be 43 percent of the city’s bus routes. Mancera was going to speak to the public about this initiative. What should he say about it? I gave the mayors three options.
Mancera should declare Mapatón a success:
A. Yes, because it was.
B. No, because it wasn’t.
C. Yes, even though it wasn’t.
There are people who answer A because half of a bus map in a place the size of Mexico City gathered over a few weeks with only a small amount of money is surely a success. There are those who answer B because, after all, what does one do with half of a bus map? And then there are the “liars.” Between the last two options on the list, the choice is either admitting failure or “lying,” and people I have polled overwhelmingly picked C. Half the room had done so before the mayor called them out.
I never imagined that would be the case when I first posed the question. I’d also never thought of those who picked “Yes, even though” as liars myself. At worst, they were probably guilty of trying too hard to find a silver lining. But if they were willing to work so hard to avoid acknowledging failure, imagine the lengths to which they might go to avoid being put in a position of risk in the first place.
This is what was on my mind when the mayor called out his colleagues from the back of the room. Not what they were saying or not saying, but what they were willing to try back in their cities—or perhaps not try, for fear of failing at it. They had laid out ambitious agendas to be sure, on raising income levels or slowing climate change, on enhancing schools and making streets safe. But what if what they were doing wasn’t bold enough? What if their best-intentioned actions didn’t stand a chance against the magnitude of the problems, because the really creative, bold, stand-a-chance solutions were too risky?
And what if that’s the bigger lie? Not the one about bus maps and crowds working or not working, but about what it’s going to take to solve society’s greatest challenges. What if we are all liars, and the lie we are telling ourselves is that our governments can get to where we want them to without experimenting more? Is there anything we can do about that?
It turns out, in the techniques of modern startup companies, there might be. There might be ways to take on riskier efforts, in public, with some tools made popular by entrepreneurs like Eric Ries and others. I didn’t know of Ries’s lean-startup methods when I was in government or when I first heard of Gabriella Gómez-Mont and her lab inside Mexico City’s government. But by the time I was done with my trip to see her there, and certainly when I was in that classroom two years later to talk with mayors about it, I had a better handle on what experimenting in public was, what made it hard, and what might make it easier.
Lab for the City
In Mexico, it started with a question I already knew the nonanswer to. A colleague and I had stopped by the concierge in the hotel where we were staying in Polanco, an upscale portion of Mexico City. The neighborhood sits in the northwestern corner of the giant city known as CDMX. It was also about ten kilometers from the offices of Laboratorio para la Ciudad, the experimental arm of Mexico City’s municipal government at the time.
Laboratorio para la Ciudad (Laboratory for the City, in English) was founded by Gómez-Mont in 2013. She and her team of fewer than a dozen represented the innovation vanguard in a government of three hundred thousand workers. “The hope of Mexico today is reflected in the entrepreneurs,” Mancera had said at the time on his campaign website, and he meant inside government, too. He had tapped Gómez-Mont to create the new city department from scratch.
Gómez-Mont and Mancera were not entirely unique in this ambition. Bloomberg Philanthropies and Nesta, a United Kingdom–based innovation foundation, had described twenty initiatives like this in 2014, from Denmark’s Mindlab to Singapore’s PS21.2 I had cofounded one of them in Boston in 2010, which we called the Mayor’s Office of New Urban Mechanics. The idea, broadly, was to create spaces in government—somehow outside the normal confines of the bureaucracy—in which to try new things. In that respect, these offices weren’t that different from corporate innovation arms like Xerox’s PARC, founded in 1970, or more modern versions like Amazon Lab126.3 Of note is that for all the successful products that emerged from these innovation hubs—Ethernet at Xerox or the Kindle at Amazon—the overall record on these things was decidedly mixed.4 As it turns out, business, not just government, struggles to invent new products and services, too. Nevertheless, by 2018, more government “labs” had come to prominence, including San Francisco’s Office of Civic Innovation and New York City’s NYCx. And along with government digital services, which we will come to in chapter 6, the proliferation of these government innovation arms marked a key organizational shift in the move toward a more inventive public sector. The offices had different setups and different acronyms, but many were drawing to some degree on the startup techniques of private companies. I visited Gómez-Mont in Mexico to see these tools up close and try to gauge whether I thought they were fit for public stuff.
“How do we get there by bus?” we asked the concierge. After some back-and-forth about whether we really wanted to take the bus, and wouldn’t we rather go by car, he confessed that there really was no comprehensive bus map to consult.
I already knew that.
Gómez-Mont’s lab had run many experiments by the time I was bothering the hotel’s staff for an answer I knew they didn’t have. The experiment to create a bus map for Mexico City was the lab’s fiftieth. Thirty thousand public buses, minibuses, and vans made up the Mexico City bus system, and there was no comprehensive schedule.
Gómez-Mont and her team had set out to change that. In the spring of 2015, they’d decided to try to create a bus map by inviting citizens to voluntarily ride Mexico City’s bus routes and record them on their smartphones. It was an approach that had been tried in some developing countries but never in a developed city the size of Mexico’s capital.
There were more conventional alternatives. The city could have purchased GPS technology for the buses, though that would have cost an estimated 60 million pesos ($3 million). It could have passed a law requiring the operators to record the information of the routes they drove, though that raised the prospect of a strike by the operator unions. City agencies could have hired people to ride routes and write down route information, but the LabCDMX team, as Gómez-Mont’s outfit was called, estimated that that would take up to two years to complete.
The wild idea to instead crowdsource a bus map for Mexico City picks up on some of the techniques and attitudes in chapter 2 on ideas. Gómez-Mont had formed a team that included artists and designers, that welcomed in outsiders, and that held mini-hackathons. In their approach, you could sense art mixed in with the science of it all. The lab called its initial forays “provocations.” Art and social progress had been intertwined for generations in Mexico City. Gómez-Mont and her team were coming full circle to some extent.
But as we walked away from the concierge—and took a car to see Gómez-Mont—I was most interested not in how the idea had come to be but rather how the project had unfolded. “A thousand things could have gone wrong . . .” Gómez-Mont told me, as she started to recount.
Probability versus Possibility
If Probability Government and Possibility Government were twin siblings, raised on a set of common ideas (that outcomes supersede politics, that data matters), this is where they decide to go their separate ways.
Probability Government preaches the gospel of prudence. Of what’s worked before. Of best practices. Probability Government hates risk. Probability Government really hates that “a thousand things could go wrong.” I’ve asked government groups how their organizations normally proceed in the face of the kinds of uncertainties Gómez-Mont was anticipating, and someone always says, “They don’t.”
Probability Government picks sure bets. Or at least surer bets. It picks the approaches that will probably work. If we don’t stop to think about it very long, or perhaps even if we do, this seems to be how government should be and what government should do. Government is often the backstop in citizens’ lives, a thing they need to be able to rely on, for food or for safety. Government is paying with taxpayer money, not funds we expect to be gambled away. We expect government to spend time and money on things that will work.
This is the vision of government that people have in mind when they label government risk averse. We throw those words around a lot, but what risk aversion actually means could be the key to the problem with Probability Government, to the possibility of Possibility Government, and to solving big problems.
Individuals who are risk averse prefer more-certain outcomes over uncertain outcomes, even when the uncertain outcomes have higher expected returns. A risk-neutral person will be indifferent if you offer her $50 or if you offer her a 50 percent chance at winning either $100 or $0. The expected value of both is identical. A risk-averse person will prefer the $50 guaranteed. And depending on how risk averse a person is, she might prefer that $50 guaranteed even if you raise her potential prize: say, a 50 percent chance at $120, which has an expected value of $60. If she’s someone who strongly prefers certainty to risk, she would gladly give up the extra $10 in expected value for the surety of the $50. When people say government is risk averse, this is presumably the kind of thinking they have in mind.
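The arithmetic behind that comparison can be sketched in a few lines of Python. The expected values come straight from the numbers above. The square-root utility function is an assumption added purely for illustration: it is one common textbook way to model a risk-averse person, not something drawn from the studies discussed in this chapter.

```python
import math

def expected_value(gamble):
    """Probability-weighted average payoff of a gamble given as (probability, payoff) pairs."""
    return sum(p * payoff for p, payoff in gamble)

def certainty_equivalent(gamble, utility=math.sqrt, inverse=lambda u: u ** 2):
    """Guaranteed amount that feels as good as the gamble.

    The default square-root utility is an illustrative assumption:
    each extra dollar is worth a little less than the one before it.
    """
    expected_utility = sum(p * utility(payoff) for p, payoff in gamble)
    return inverse(expected_utility)

sure_thing = 50
flip_for_100 = [(0.5, 100), (0.5, 0)]  # 50 percent chance at $100, otherwise $0
flip_for_120 = [(0.5, 120), (0.5, 0)]  # the sweetened offer from the text

for label, gamble in [("$100 flip", flip_for_100), ("$120 flip", flip_for_120)]:
    ev = expected_value(gamble)
    ce = certainty_equivalent(gamble)
    choice = "takes the sure $50" if ce < sure_thing else "takes the gamble"
    print(f"{label}: expected value ${ev:.0f}, worth about ${ce:.0f} to her, so she {choice}")
```

Under this assumed utility, the person values even the $120 coin flip at well under $50, which is why she keeps the guaranteed money despite the higher expected value. A risk-neutral person would simply compare the expected values and take the $60 bet.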
Are they correct? I went looking for evidence to the contrary. I had witnessed episodes that gave rise to the worst versions of this stereotype: public workers who are just showing up for the paycheck or government officials who are just biding time until they can collect their pensions. The last, absolute last thing they would ever do is take a risk on anything that would put their jobs or their retirement in jeopardy. I remember the day a city worker stepped into the elevator on the first floor of Boston’s city hall and explained to a friend, who had inquired about how his day was going, that he planned to read the paper and count the days until his retirement. But my own experience gave me hope that this was the exception and not the rule. I had mostly seen brave and bold public servants, who genuinely wanted to help people, who prioritized citizen needs above their own risk-mitigation strategies.
The data backs up the stereotype. Not the worst version of it, but the version where public workers are less likely to take chances. Two economists who looked at the question of risk aversion in the public sector in the 1980s found that people perceived, accurately, that government jobs were more stable than private-sector ones.5 They found, further, that individuals who placed a heavy emphasis on job stability were more likely to seek employment in the public sector. A more recent study lent support to this notion, finding that people holding public-sector jobs are less likely than people in private-sector roles to buy lottery tickets rather than gift certificates.6
A thousand things could have gone wrong with Gómez-Mont’s mapping experiment. So why did she proceed anyway? Either she had a different attitude toward risk than typical government workers do. Or she had a plan for reducing some of that risk.
A Mapping Marathon
Gómez-Mont started our conversation by naming some of the thousand things: “The politics might not work. Maybe we won’t get interest from our colleagues within government. We might not have the internal capacity—the know-how—that is needed to get this off the ground. Our algorithm might not work. The gamification might not work. We might not be able to make the data actionable . . . Our backend could have failed. The dashboard could have gone horribly wrong. If no people jumped on board, we would have had nothing. If they did jump on board all at the same time, they could have crashed it.”
She told me how they proceeded anyway. They started over a four-day period in May 2015 using a route-tracking app Gómez-Mont and her team knew probably wouldn’t be the solution for Mexico City. If she was worried about a thousand things going wrong, picking an off-the-shelf app she was sure wasn’t going to carry the load seemed like an odd choice. Twenty-eight riders participated and mapped 18 routes, but the app drained the phones’ batteries. LabCDMX decided to build its own.
The team made a second go at it in August that year. This time they recruited high school students as mappers. Thirty of them mapped 56 routes. The app LabCDMX built with collaborators functioned imperfectly, and the collected data was full of errors.
A third attempt was made in October, this time with university students. Nearly two hundred of them mapped 248 routes. There were still some errors in the data collection, using an enhanced app, but the team collected some useful feedback about which rewards for participating (cash? prizes?) were most desirable. A fourth version, in November, involved the participation of bus drivers.
On January 29, 2016, Gómez-Mont and her team launched Mapatón’s finale. More than 3,600 riders took part. In all, the series of events resulted in 648 route maps. One participant even spent more than nine hours riding a bus to the outskirts of Mexico City and back. Stretched end to end, the mapped routes would reach all the way around the globe. They covered 43 percent of what the team thought were 1,500 bus routes across the city.
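The 43 percent figure depends on a denominator that was itself an estimate. As a rough sketch, the few lines below recompute the coverage from the numbers reported above and then show how the headline would move if the true route count were different. The alternative totals of 1,200 and 2,000 are hypothetical values chosen only for illustration. They are not estimates from LabCDMX.

```python
def coverage(routes_mapped, total_routes):
    """Share of the bus network that has been mapped."""
    return routes_mapped / total_routes

ROUTES_MAPPED = 648        # route maps produced across the Mapatón events
ESTIMATED_ROUTES = 1500    # the team's working estimate of the network's size

print(f"Reported coverage: {coverage(ROUTES_MAPPED, ESTIMATED_ROUTES):.1%}")

# Hypothetical alternative network sizes, for illustration only.
for assumed_total in (1200, 1500, 2000):
    share = coverage(ROUTES_MAPPED, assumed_total)
    print(f"If the city had {assumed_total} routes, coverage would be {share:.1%}")
```

Whether the result reads as "about half" or "about a third" depends on a number nobody knew for certain, which is part of why the success question is harder than it first looks.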
We know by now what the hundreds of people I’ve polled thought of this outcome. Approximately half thought it was a success. The other half didn’t, but didn’t want to say so. What did Gómez-Mont take away from it all? She told me, “We need to be thinking much more about lean methodology.”
Build-Measure-Learn
“Lean” came of age in operations and manufacturing. I find it helpful to mention this at the outset, because it warns us away from thinking of the concept as a techie thing for techie people, in case we don’t view ourselves that way. In the operations and manufacturing context, lean meant minimizing waste and reducing errors on the assembly line in order to achieve manufacturing excellence and higher productivity.
In the context Gómez-Mont mentioned, lean means something very similar. No doubt she picked it up from Eric Ries or one of his acolytes. Ries had been applying it in a more tech-oriented context, often for startups.7 The idea was still about minimizing waste. But now it was about minimizing waste as companies fought their way to find products for markets and markets for their new products. The idea was to learn as much as possible while spending the least amount of time, energy, and treasure.
Ries wrote about the concept in his bestselling book The Lean Startup, alerting a generation of would-be entrepreneurs to the idea of “minimally viable products” and “pivots.” He also wrote about it with Tom Eisenmann, a colleague of mine. The first time I really dug into lean-startup techniques, it was for reasons as far removed from government as possible. I learned about lean when it came to renting ball gowns.
Eisenmann had written a Harvard Business School case on two graduates of the school who had set out to create a dress-rental company.8 Jenny Fleiss and Jenn Hyman figured there was a better alternative for women than buying an expensive gown they would wear on only one or two occasions. The two created Rent the Runway to rent dresses to women instead. First, the founders held a trunk show event at Harvard College, using dresses they had bought off the shelf. Later, they ran another event at Yale, this time with only dress swatches. Then came the PDF brochure they circulated, asking people to call a number to rent dresses. Then, and only then, did they really build out a dress-rental platform and the company to go with it.
And in there, in the story of trunk shows and dress styles, is the essence of the lean startup, of what Gómez-Mont was trying to do in Mexico City, and of what might be the key to Possibility Government. To allow government to try things that probably won’t work.
When Hyman and Fleiss started out, building a dress-rental company was probably going to fail. Not just because back then the idea of renting dresses evoked ill-fitting tuxedos and soiled celebratory wear more than it did the newfound interest in sharing assets. Rent the Runway was probably going to fail because most startups fail. Even most startups that receive venture money probably won’t succeed. First-time founders who raise venture capital fail more than 80 percent of the time. Those who buck the odds and succeed still fail on their sophomore effort 70 percent of the time.9
Rent the Runway’s cofounders didn’t try to plan or study their way out of this probability. Instead, they tested their way out. The Harvard trunk show, limited in numbers though it was, proved to them that women would rent dresses. It also showed them that women wanted different styles. The Yale event showed them that women would rent dresses they couldn’t try on. The PDF demonstrated that women would rent dresses they couldn’t try on, over an electronic platform. Each test resolved a key uncertainty facing the business. And each test did so relatively quickly and with relatively low investment.
Ries made the name for these kinds of testable prototypes—minimally viable products (MVPs)—globally famous. And he made the changes that naturally followed, pivots, common startup parlance. He, Eisenmann, and Sarah Dillard wrote up a handful of other examples that make the concepts clear.10 When Drew Houston started Dropbox, he didn’t start by building a complicated offline-backup-sync-and-share software but rather by creating a video about it and inviting people to sign up for it when it was developed. He answered the question about whether people wanted yet another storage solution (it was perhaps the eightieth on the market at the time) without first having to build one. When the team behind Aardvark set out to build an app that let you text in a question to be selectively broadcast to your social network for answers, they didn’t build the complicated algorithm to do that. They simply hired humans to mine the network in place of technology, for the time being. They tried to answer the question of whether people would use a social-network-question-answering app before building the whole kit and caboodle.
Once I started to understand these approaches, I started to see them in other places. Including government. When President Barack Obama’s administration launched Data.gov, the US open-data platform, in 2009, one of the first of its kind in the world, the platform had only forty-seven data sets.11 To put that in perspective, there are fifteen cabinet-level agencies in the US government, and forty-seven barely allowed three data sets for each of them. Imagine how difficult it must have been for the team at the time to turn data sets (and the officials championing them) away. Don’t you want to launch with a relatively complete set of US data? With something robust? I marvel at the fact that it must have been even more difficult to send out the president to announce this, when a guy named Robert from New Jersey created his own version of the site with more data than the government’s official one.12 Why do that? Why start with something incomplete, with something less than Robert can put together in his basement? For the same reason Hyman and Fleiss started with just forty students and off-the-shelf dresses. The notion of the lean startup was to learn the most—about what customers wanted, how they would use products, how you would deliver products to them—while spending the least. To try, to test, and to learn. Would citizens seek out the data? Would they be able to access it technically? Would the data they did access be kept up to date by agencies? Would the data be used to learn anything fruitful about government or to create new private businesses? To answer the early questions that the Data.gov team had didn’t at first require building out the entire website with 260,000 data sets (as it has now) but rather building out just enough to put in citizens’ hands to get some valid feedback.
Reflected back through the lens of Ries, of Eisenmann, of Rent the Runway, and of Data.gov, Gómez-Mont’s Mapatón made a lot more sense to me. Could the LabCDMX team accurately collect information on formal and informal stops? That was the question answered by the May experiment. Could outsiders successfully contribute? August answered that. Could a game motivate engagement? Would cash prizes work? October. Could the collected data be leveraged and made useful? January’s experiment and time would tell. Run all the tests at once, and if the project failed, the team would never specifically know why. But run them separately, and quickly, and the team learned something each time, and had a chance to iterate and to improve.
This is the process Ries calls lean startup. It’s what he, Eisenmann, and Dillard called “hypothesis-driven entrepreneurship” in the paper that introduced me to the process. Perhaps its simplest formulation is build, measure, learn. Laid out like that, for government, the simplicity of the formulation can hide just how fundamentally it upends the way most of us do what we do (see figure 3-1).
In government, build usually comes last. And if you listen to the groups I have polled about how we normally manage risks like those Gómez-Mont faced, build usually comes after the consultants, the commissions, the conference rooms, the requests for proposals, and so on. Build comes after all the apparatus of Probability Government. But in the startup model, it comes first.
A PayPal Account and a Post Office Box
One other thing looked different to me in light of build, measure, learn, and it was what came after the bombs had blown up at the marathon’s finish line. “You can’t start something new,” the foundation head had told me. He was probably right that among many things, people would feel skittish about donating to a brand-new fund with no infrastructure and no history. We proceeded anyway, and Mayor Menino insisted we have the fund up and running by Tuesday night, little more than twenty-four hours after the attacks. We made his deadline, barely, and with only a barebones website. It had one sentence and a PayPal link. That’s it. But that night, the funds started flowing in. People asked us for the mailing address, which we had neglected to procure or include, so the next morning I opened up a post office box, and we added the address to the website. Four days later, after a manhunt that froze a city and left one terrorist dead and another captured, I rode with the mayor back to where he was staying in Beacon Hill. We watched the president address the nation. I wandered through the Boston Common, where college students had come to sing “God Bless America,” on my way to the post office on Boylston Street, which I had chosen for a bit of solemn poetry. I emptied the box’s overflowing contents into a black bag that I had brought with me.
Would people give to a new fund? With nothing more than a PayPal account and a post office box, we had seemed to answer at least that question. Later, we built out the site, with how to file a claim, how the monies would be distributed, how to share with friends, who had given. Had we built everything first, we would have gotten things wrong. We also would have gotten out of the gate much more slowly and perhaps not before the world’s attention had moved on to another tragedy. I wasn’t thinking build, measure, learn, not in any formal sense, at the time. I don’t know how clearly we were thinking at all. But One Fund Boston serves as a potentially powerful lesson in the virtues of MVPs.
Honesty, Possibly
It’s a myth that most entrepreneurs like risk, and it would be a mistake to think the lesson of Possibility Government is to go out and seek it. The point is, rather, that risk is inherent in doing bold, new things. What build, measure, learn gives us is a way of resolving some of those risks without spending too much public treasure. We start with a set of assumptions about what needs to go right if some new service is to be successful. And we test those assumptions, often one by one. If we can’t validate them, then we stop. And we go spend our time and the taxpayers’ money on something else. In startup lingo, we “perish” the project. If the data comes back and suggests some changes to our assumptions, we pivot. Maybe it’s the same service but for a different constituency. Maybe it’s a different service but for the same constituency. We change some aspects of what we are doing and test again. And if the signs come back all positive, then we persevere.
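The decision rule in that paragraph is simple enough to write down. What follows is a minimal sketch, not a formal method from Ries or anyone else. The names `Result`, `Decision`, and `decide` are made up for this illustration, and the sample test outcomes are hypothetical.

```python
from enum import Enum

class Result(Enum):
    VALIDATED = "the assumption held up"
    NEEDS_CHANGE = "the data suggests changing the assumption"
    INVALIDATED = "the assumption could not be validated"

class Decision(Enum):
    PERSEVERE = "persevere"
    PIVOT = "pivot"
    PERISH = "perish"

def decide(results):
    """Turn the outcomes of a round of assumption tests into a next step."""
    results = list(results)
    if not results:
        raise ValueError("no tests were run, so there is nothing to decide on")
    if any(r is Result.INVALIDATED for r in results):
        return Decision.PERISH      # stop, and spend the money on something else
    if any(r is Result.NEEDS_CHANGE for r in results):
        return Decision.PIVOT       # change some aspect of the service and test again
    return Decision.PERSEVERE       # every sign came back positive

# Hypothetical outcomes for three assumptions behind a new service.
round_one = [Result.VALIDATED, Result.NEEDS_CHANGE, Result.VALIDATED]
print(decide(round_one).value)
```

The sketch treats a single invalidated assumption as enough to stop. That is a deliberately strict reading of the paragraph above. A real team might weigh some assumptions more heavily than others before walking away.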
It would be wrong, on most counts, to think of entrepreneurs as risk seekers. It’s wrong, I think, to think of Gómez-Mont that way. She likely had more of an appetite for risk than that city worker in the elevator riding toward retirement, but that doesn’t mean she was actively seeking it. It’s not like “a thousand things could have gone wrong” was a magnet for her. By that logic, I suppose “two thousand things could have gone wrong” would have been even better. What allowed her, then, to go after possibility instead of probability must have been some strategy she had for reducing her risks—to get from “a thousand” potential failure points to something less than that.
Build, measure, learn also gives us, if we will take it, a way to stop lying. When things don’t work out, public officials should be able to say, “We ran a test. We ran it without spending too much time and money. The test proved that either our idea or our execution was wrong. We’ll learn from the failure and move on.”
One public leader pushed back on me when I proposed this route. “It’s not your name in the newspaper every day,” she said, politely stopping short of calling me naïve and too removed from the action to know better. She also might have been right to proceed more cautiously. I know the internal narrative that gets us to where she got: If I make a mistake and admit it, the press will pillory me. The public will doubt me. My opponents will challenge me. My allies will desert me.
I think the opposite may be true, though. I believe that if we pointed out our missteps, it would be better than having the press uncover them. With trust in public officials where it is today, it is quite possible that actual honesty could increase public faith. One study illustrated that operational transparency by government increases public trust in government and engagement with it. The paper was called “Surfacing the Submerged State.”13 We should surface the experimental state, too. We’d start not by saying we want to take on more risk. We don’t. What we would say is that the way we mostly build now, planning for years on end and then delivering programs or services that fall flat with the public, is high risk, too. And that there is a better way instead. That we can buy down the risk of trying new things, if we build, measure, and learn.