Chapter 5

Open Data

Thomas Jefferson was something of a weather nut. Well prior to 1776, he made careful observations and kept meticulous records of meteorological events at assorted Virginia locations, especially at his Monticello residence, where his routine included two weather records per day—once when he woke at dawn, often the coldest time of the day, the other between 3 and 4 p.m. when he had found it to be the hottest.1 In July of that year, he was in Pennsylvania, for a rather momentous occasion in American history: the signing of a document he helped craft, the Declaration of Independence. Yet he still found time to squeeze in a visit to a nearby merchant to purchase a thermometer and then document Philadelphia’s morning, afternoon, and evening temperatures in his diary. For the record, at 1 p.m. on July 4, Philadelphia’s conditions were clear and mild, with a high of 76 degrees.2

Jefferson later took his weather passion overseas. While Minister to France from 1785 to 1789, he sought the most accurate instruments available, to compare readings with those in his home country, and validate the viability of America’s agrarian ambitions. According to the Thomas Jefferson Encyclopedia, “His initially patriotic motive was to topple one of the two ‘pillars’ of the theory of degeneracy of animal life in America advanced by the Comte de Buffon and other European scholars—America’s alleged excessive humidity. Also in his role as champion of the North American continent, Jefferson began in Paris to compile a record of the ratio of cloudy to sunny skies. After a five-year residence in France, he had proved to himself that America completely eclipsed Europe in the sunshine contest and he appreciated more than ever the ‘cheerful,’ sunny climate of his native country.”

After winning the first of his two Presidential terms in 1801, Jefferson incorporated weather observation into official government duties. He did so most famously with the Lewis and Clark Expedition in 1804, instructing the western-bound explorers to observe “climate as characterized by the thermometer, by the proportion of rainy, cloudy & clear days, by lightening, hail, snow, ice, by the access & recess of frost, by the winds prevailing at different seasons . . .”

Over the next few decades, other leaders began to promote greater weather data collection. During the War of 1812, James Tilton, the Surgeon General of the Army, directed his hospital surgeons to do the same. During the 1830s and 1840s, James Pollard Espy, a meteorologist known as the Storm King, secured state (Pennsylvania) and federal funding to expand and equip a network of weather volunteers. In 1849, Joseph Henry, the first Secretary of the Smithsonian Institution, found a way to capitalize on exciting new technology: the commercial telegraph, which had been introduced four years earlier.3 Henry established a novel public-private partnership, convincing many telegraph companies to waive transmission fees on all weather recordings sent to the Smithsonian’s forecasters for analysis. Within a decade, after receiving the cooperation of telegraph stations from New York to New Orleans, Henry was able to produce a daily weather map for nearly 20 cities, and share it for publication in the Washington Evening Star.

Still, the federal government remained small, and there was no formal weather agency at the start of the Civil War. Nor was there one by its end, even as the government was growing in size—in 1862, the Department of Agriculture was founded, and from 1861 through the end of the decade, the number of federal workers nearly tripled.4 Then, on February 9, 1870, several months before the founding of a more heralded agency—the Department of Justice was formed to handle the postwar surge in litigation—President Ulysses S. Grant signed Congress’ joint resolution to create the Division of Telegrams and Reports for the Benefit of Commerce.5 That agency, operating under the U.S. Army Signal Corps, was charged with making meteorological observations at military stations “and other points in the States and Territories,” while using magnetic telegraph and marine signals to give notice “of the approach and force of storms.” Two decades later, President Benjamin Harrison initiated the transfer of weather-related duties to the Department of Agriculture on the civilian side of government.6 Through the passage of the National Weather Service Organic Act of 1890, the new U.S. Weather Bureau was directed not only to collect data but to “publicly distribute weather information useful to sectors of the nation’s agriculture, communications, commerce, and navigation interests.”7

That mandate and mission of openness remained consistent throughout the next several decades, even after the U.S. Weather Bureau’s move to the Commerce Department in 1940, and even as advancements in technology transformed the means of weather collection. The evolution in data-gathering instruments can be traced from kites (late nineteenth century) to upper air balloons (early twentieth century) to reconnaissance airplanes (1920s), all the way to radar and satellites—remote sensing tools initially developed to serve the military and then adopted by weather data collectors in the late 1950s and early 1960s, respectively.

In 1970, thanks to President Richard Nixon’s order, the U.S. Weather Bureau was renamed the National Weather Service (NWS) and moved to another new home—the Commerce Department’s National Oceanic and Atmospheric Administration (NOAA).8 NOAA would eventually become a behemoth, one now with an annual budget in excess of $5 billion, much of that utilized by the NWS to deploy weather-monitoring satellites, operate its national and regional centers, and staff its local forecast offices.9 As that has occurred, a robust private weather industry has emerged alongside, using the government data but providing greater customization and analysis in forecasting—for the likes of airlines, utilities, and agriculture firms—than the NWS, with its limited staffing, can always offer. For instance, since 1976, New York–based CompuWeather, with its staff of certified meteorologists, has provided forensic work documenting past weather events so that insurance companies and attorneys can validate claims, while also providing forecasts for film production companies so they know where and when to shoot.10

These two growing enterprises have benefited from refinements in policy designed to reduce the natural tension between them. In 1978, NOAA adopted a policy that it would not “provide specialized services for business or industry when the services are currently offered or can be offered by a commercial enterprise.”11 Over the next three decades, the agency would further clarify the parameters of its partnership with the private sector, while endeavoring to make forecasting more relevant and precise. In 2004, it shifted its data format from a proprietary one to XML, an open Internet standard, lowering barriers to entry and thus expanding the base of potential participants.12 Now, more than 50 percent of the members of the American Meteorological Society are employed in the private sector.13

Through it all, NOAA’s mission has remained the same: “making data easy and affordable . . . creating a more informed public, providing unbiased information . . . and giving the commercial weather industry an opportunity to flourish.”

No entity has flourished quite like The Weather Channel. Incorporated in 1980 and launched in 1982 as a cable network, The Weather Channel set out to disrupt the more traditional model of accessing weather data that had prevailed since the 1950s—your local television station’s news broadcast—and its strategy was built on a simple understanding. As current CEO David Kenny put it, “Weather is personal, weather is very local.” Most people wanted to know what the weather was and would be where they lived, not necessarily anywhere else. They also wanted to know it consistently and frequently, without waiting for the local news at six or 11. That required taking the available government data, sorting it by ZIP code, and delivering it via the company’s innovative Satellite Transmission and Receiving System (STARS) to homes with cable systems throughout America. It also meant doing it on a steady clock. “There wouldn’t be a Weather Channel without localization every 10 minutes,” Kenny said. While there have been many STARS upgrades and generations since, local updates have remained a staple, known since 1996 as Local on the 8s.
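To make that localization strategy concrete, here is a minimal sketch, in Python, of the kind of step it implies: grouping government weather observations by ZIP code so that each locality can be served its own recurring update. The field names and sample records are hypothetical, not The Weather Channel’s actual feed or the STARS format.

```python
from collections import defaultdict

# Hypothetical observations in the spirit of a national weather feed.
observations = [
    {"zip": "30301", "temp_f": 76, "conditions": "clear"},
    {"zip": "30301", "temp_f": 78, "conditions": "partly cloudy"},
    {"zip": "10001", "temp_f": 68, "conditions": "rain"},
]

# Group the national feed by ZIP code so each locality gets its own slice.
by_zip = defaultdict(list)
for obs in observations:
    by_zip[obs["zip"]].append(obs)

# Build one localized summary per ZIP code, ready to be slotted into a
# recurring local segment (e.g., an update every ten minutes).
for zip_code, readings in by_zip.items():
    latest = readings[-1]
    print(f"{zip_code}: {latest['temp_f']}F, {latest['conditions']}")
```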

“We tell the story better, we make it more useful, we make it more relevant, and we add value to the science,” Kenny noted.

That’s added value to the company. According to a Magid study that was quoted in AdWeek, The Weather Channel had “150 million unduplicated consumers across TV, online and mobile” as of the fourth quarter of 2011, and it is still the undisputed leader in a $1.5 billion industry. Kenny acknowledges that this success never could have occurred without the government’s open data policy. If the company had been responsible for all of the infrastructure costs, notably the necessary recording equipment, rather than merely leveraging the available government data, the channel’s launch would have been long delayed, and most certainly cost-prohibitive. It might have never gotten off the ground. But there’s more to it: once The Weather Channel took off on cable, the continued access to that data, at a nominal cost, left the company with sufficient resources for additional innovation in the years to come. One innovation was Weather.com, which grants users reports and forecasts, in real time, for some 100,000 locations. Then, as newer technologies provided opportunities, the company was able to create a mobile application, which, in 2013, Apple ranked as the second-most downloaded application on its iPad and the seventh-most on the iPhone.14 Now Weather.com even produces weather sections for local broadcasters’ mobile apps. And, in the winter of 2013, The Weather Channel took Local on the 8s to the next level, launching a continuous, real-time scroll of local conditions, making even the television set appear more like a mobile app.

“All of this comes from open data,” Kenny said.

The Weather Channel doesn’t just take. It gives back. Every year it voluntarily communicates more than 100,000 National Weather Service alerts to its audiences, to assist the agency in its public safety mission. Its team of scientists releases many of its models and findings as open source for meteorologists at the NWS (and throughout NOAA) to study, apply, and send back for further conversation, adaptation, and application.

“There’s a whole meteorological community,” Kenny said. “It’s always been this sort of mutual mission of science that’s kept us going. So I think this is a living thing. It’s not just that they post the data, and we repurpose it. We think it’s important that there be continuing collaboration.”

Both sides—public and private—share knowledge and opinions in times of major weather disturbances, or even in the aftermath, in order to identify areas for improvement. Such reviews occurred after the Eyjafjallajökull volcano in Iceland paralyzed European air traffic in 2010. A Weather Channel subsidiary met with the Met Office in the United Kingdom as well as the London Volcanic Ash Advisory Center to exchange, blend, and align forecast techniques and practices. The result was that, when another Icelandic volcano erupted a year later, the impact on air traffic was considerably less.

As technology advances, more knowledge can be gained—even about which way the wind will blow. The Weather Channel’s latest initiatives speak to what’s possible at the intersection of technology, open energy, and weather data. It has long been known that the clean energy sector, wind farms more specifically, could dramatically improve productivity and profitability by optimizing how turbine blades are angled based on real-time changes in weather. Historically, however, this has not been possible, because the changes in wind patterns simply occur too quickly. But now, by using new technologies such as cloud computing, The Weather Channel is able to fully exploit the data, creating algorithms that better predict those patterns. Late in 2012, Kenny said The Weather Channel was already selling that product inside and outside the United States as a pilot and was planning to release version 2.0 in 2014, in the hopes of aligning data, decisions, and energy markets so they could move as fast as the wind.

“We’re crossing new thresholds in terms of data and the ability to manage ‘Big Data,’” Kenny said. “It may not have been useful to release it in the past, but it is incredibly useful today.”

More than useful. Necessary. That’s why other governments and businesses worldwide call upon The Weather Channel to share its data assimilation and computing models; the general mission is the same for all of those entities: real-time data, wherever and whenever, even if the immediate plans for that data are different.

“What’s clear to me is the nations that figure out how to use their data and share their data and put it in the grid give their businesses and citizens a leg up versus nations that don’t,” Kenny said. “If I look at the disparity in weather information that’s given to African farmers and farmers in Kansas, it’s huge. It makes a difference. But that gap will change and close in time as data becomes available. And you compete on the basis of data and information as a nation.”

Kenny deems it much too early to declare a winner on wind, even as some countries—such as those in Scandinavia—have integrated it heavily into their policy. He is certain, however, that data will be critical to making any such policy work.

“People pay us a price for our interpretation of free data, because we interpret it in a better way,” Kenny said. “But at the core of it, data collection and data availability and speed in the use of modern technology will increasingly create competitive nations. And nations that don’t necessarily have natural resources to compete upon can change their competitiveness by the way they use information and provide it.”

When we tell the open data story, we’re not just talking about the weather.

Even prior to our nation’s founding, governments were collecting data on the population here through surveys. The British government did so to count the number of people in the colonies in the early seventeenth century. And, under the direction of then-Secretary of State Thomas Jefferson, the U.S. federal government took its first official census in 1790, with another occurring every 10 years since.

This has become an extensive and expensive enterprise: the 23rd census, conducted in 2010, came in under budget and still cost roughly $13.1 billion.15 Through all of those decades, government agencies have collected additional data for a host of purposes—holding regulated entities accountable, conducting research on key social and economic trends, processing individual benefits, and so forth. The methods of data analysis have evolved over that time as well. In 1886, an employee of the U.S. Census Office named Herman Hollerith invented an electrical punch-card reader that could be used to process information; a decade later, Hollerith formed the Tabulating Machine Company, which in 1924 became International Business Machines (IBM). One of his colleagues, James Powers, developed his own card-punching technology and founded his own company, the Powers Tabulating Machine Company, which merged with Remington Rand in 1927. For decades IBM and Remington Rand, tracing their ancestry in part to innovative government employees, dominated the developing computer industry.16

As the government collected all of this data, the public developed a greater desire to access it. The enactment of freedom of information laws has allowed the public to make specific requests through a federal agency, with those requests subject to a number of exclusions and exemptions. We might not have these laws at all if not for the long-standing advocacy of the newspaper industry, as well as the yeoman efforts of John Moss.17 The California Congressman championed transparency measures in the 1950s and 1960s in response to a series of secrecy proposals during the Cold War. Moss encountered sustained, stubborn resistance from both parties, but eventually persuaded enough members, including Republican Congressman (and future Defense Secretary) Donald Rumsfeld, to become allies. In 1966, they got a bill to the desk of a long-time opponent, President Lyndon B. Johnson. Johnson did sign it, along with a statement that he had “a deep sense of pride that the United States is an open society,” even as the statement also focused on all of the exemptions for national security. Over the next two decades, and in response to events such as the Watergate scandal, Congress would amend and strengthen the act, and it remains the law of the land.

Even so, the Freedom of Information Act (FOIA) has had its limitations.18 It has frequently resulted in needless delay and work, the latter on account of the information’s release in inaccessible formats. The agencies responsible for collecting data have conceived their systems with their own needs in mind, so that they could use that data for assorted, internal government functions. They haven’t given as much thought to making that data easy to get back out, in ways that would allow the public to best reuse it. That wasn’t an ill-intentioned attempt at secrecy; it simply wasn’t seen as a requirement or priority of government.

Open innovators see data quite differently. They see it as something that should be available not by request but by default, in computer-friendly, easily understandable form. They see it as the igniter of a twenty-first-century economy that can expand industries and better lives.

They see it the way Todd Park does.

I became aware of Park’s unique perspective and ability while serving on the Obama transition team in 2008. Park, the cofounder of athenahealth—a managed web-based service to help doctors collect more of their billings—served as an invaluable informal adviser for what would later become the HITECH Act, an element of the Recovery Act that offered doctors and hospitals more than $26 billion in incentive payments for the adoption and “meaningful use” of health IT. He had no designs on joining the government when he took an interview with Bill Corr, the Deputy Secretary of Health and Human Services (HHS), for the agency’s new CTO position. He intended to steer Corr toward more appropriate candidates, those with plentiful—heck, some—public sector experience. Corr told Park, however, that he had enough people who knew government well, and that his preference was to “cross-pollinate” their DNA with Park’s, so “the DNA of the entrepreneur embeds itself in HHS through this role.”

Corr referenced the President’s call for a more transparent, participatory, and collaborative government. To demonstrate how it would manifest itself through this position, Corr touted his department’s access to vaults upon vaults of incredibly valuable data that could, in the hands of a more innovative and engaged public, better advance the mission of the agency. Then, with Park’s interest piqued, the brainstorming began. “And the notion of actually working on how to leverage the HHS data for maximum public benefit was the thing that really made the role of tech entrepreneur concrete to me,” Park said. “And that’s what convinced me to talk my poor wife into agreeing to move across the country and jump into workaholic mode again and do this job.”

After his appointment, Park explored his own sprawling agency, one with an $80 billion annual budget and 11 distinct operating divisions, including the National Institutes of Health, the R&D engine of the biotech industry; the Centers for Medicare & Medicaid Services, which provides health insurance for more than 50 million Americans; and the Food and Drug Administration, which protects public safety. And in that research, he uncovered not only the data sets Corr had highlighted but also champions within the civil service, looking for a leader in the cause. This informal activity got a more formal boost with the White House’s delivery of the Open Government Directive. As related in previous chapters, that directive provided explicit instructions and deadlines for culture changes within departments and agencies. Within 45 days, Park published four high-value data sets, one more than required by the White House directive. None had been available online or in a downloadable format.19 One of them, the Medicare Part B National Summary data file (representing payments to doctors) had previously been available only on CD-ROM and for a $100 charge per year of data. Now it was available for free and without intellectual property restraints, and the same was true for the other three sets Park published: the FDA’s animal drug product directory; the compendium of Medicare hearings and appeals; and the list of NIH-funded grants, research, and products. By the 60th day, Park had launched an open government web page, inviting the public to comment on which data sets should be made more accessible and to offer input about each agency’s overall open government plan.

Yet it wasn’t just what Park was doing. It was the way he was thinking, a way that would later lead me, while grading agency performance, to point to his HHS team as a model for other agencies to replicate.20 He understood, better than anyone, that data alone wouldn’t close the gap between the American people and their government. Rather, true change would come from the improved use of that data in the furtherance of a personal goal, such as finding the right doctor; understanding the latest research on a patient’s condition; or learning of the most recent recall of a food or medical product that could jeopardize a loved one’s health.

Initially, Park did what other officials were doing in their own agencies, methodically inventorying and publishing additional data sets. Then he turned his attention to simplifying public access to that data and encouraging its use. After some investigation, he determined that, while there was little harm in the government creating some of those tools, the “real play” came in engaging outside entrepreneurs and innovators. The trick was not in dictating the next step, but in allowing “everyone else in the universe to actually tap into the data to build all kinds of tools and services and applications and features that we couldn’t even dream up ourselves, let alone execute and grow to scale.” He believed the subsequent development of simple, engaging, impactful tools would result in improving the health care delivery system.

To test and prove that thesis, Park partnered with the Institute of Medicine, a wing of the National Academy of Sciences, to host the Health Data Initiative in March 2010. It was a collaboration to spur participation. Together, they convened a contingent of accomplished entrepreneurs and innovators, drawn equally from the worlds of technology and health care, based on a philosophy that Park had drawn from someone we both considered our Obi-Wan Kenobi, the technology thought leader Tim O’Reilly. “If you are going to actually catalyze innovation with data,” Park said, “if you want to build an ecosystem of innovation that leverages the data, you need to engage from the beginning, the people who are actually going to innovate on the data. Ask them: ‘What would be valuable, how should we use the data, how should we improve the data?’ So we brought a group of 45 folks together, and put a pile of data in front of them and said ‘What do you think? What can you use this for?’”

What Park didn’t fully anticipate was that one plus one would equal three. O’Reilly was a legendary figure in technology, as the leading forecaster of the economic boom that would come from social networking, even coining the term Web 2.0. Don Berwick was a legendary figure in health care, thanks to work with the Institute for Healthcare Improvement. They knew nothing about each other, let alone the other’s importance to an entire community. Now, through this Health Data Initiative, they were in the same room, with the same goals. “Because of the fragmentation of society, you don’t necessarily have a lot of broad connectivity of experts,” Park said. “The intersection of O’Reilly and Berwick, and of their followers, was really magical. A tremendous source of energy and productivity in the Health Data Initiative and all these initiatives is bringing together the best innovators in health care and the best innovators in tech to do things together with data that neither side alone could have done.”

The data sets available through HHS and other sources were voluminous and varied, rendering the permutations endless. By the end of the full-day session, the group had conceived roughly 20 categories of applications and services that the data could potentially power. Further, all left with a challenge: If they could make their conception a reality, within 90 days and without any government funding, their creation would be showcased at the first-ever Health Datapalooza, hosted by HHS and the Institute of Medicine. On June 2, 2010, they would exhibit more than 20 new or upgraded applications and services that would help patients find the right hospital or improve their health literacy, help doctors provide better care, and help policymakers make better decisions related to public health.

The ideas came from a range of sources. Some came from upstart firms, including MeYou Health. Its lightweight Community Clash card game, aimed at creating awareness of health factors in a user’s community as compared to others, drew some of the longest lines.

Others originated from established powerhouses such as Google, which spotted value in one set of HHS data, quality measures for every hospital in America, posted since 2005 on a website (hospitalcompare.hhs.gov). Google’s Chief Health Strategist, Dr. Roni Zeiger, saw the opportunity to bring this information to life through a more journalistic, provocative approach, one that would attract more eyeballs and influence more decisions.21 For instance, he asked: Where in New York City should a patient with chest pain seek care? The city has an abundance of world-renowned medical centers, including one that President Bill Clinton had chosen for his heart operation. While most of us rely on anecdotal advice from our doctors, friends, and neighbors when selecting a hospital for life-saving treatment, Dr. Zeiger demonstrated the potential of relying more on empirical evidence. He did it quickly and at little marginal cost, downloading the national HHS file freely available in computer-friendly form and uploading it to a Google cloud-based tool called Fusion Tables, a free service that simplifies a user’s ability to visualize, manipulate, or share data. He then selected roughly half a dozen measures, from clinical statistics such as 30-day heart failure mortality rates to patient-experience results such as whether a patient got a quiet room, zooming in on results in the New York City area. Then he published a screen shot of a map on his blog, with hospitals clearly marked, next to their corresponding “heart-friendly” and “patient-friendly” scores that he had derived from the data.
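A rough sense of that workflow can be conveyed in a few lines of code. The sketch below runs against a hypothetical export of hospital quality measures; the file name, column names, and weighting are assumptions for illustration, not the actual Hospital Compare schema or Dr. Zeiger’s exact method. It filters to New York City and derives simple “heart-friendly” and “patient-friendly” scores in the spirit of his blog post.

```python
import pandas as pd

# Hypothetical export of hospital quality measures (column names assumed).
df = pd.read_csv("hospital_quality_measures.csv")

# Narrow the national file to one city, as Zeiger did for New York.
nyc = df[df["city"].str.upper() == "NEW YORK"].copy()

# Lower 30-day heart failure mortality is better, so invert it; patient
# experience measures (e.g., "room was quiet at night") are higher-is-better.
nyc["heart_friendly"] = 100 - nyc["heart_failure_30_day_mortality_pct"]
nyc["patient_friendly"] = nyc[["quiet_room_pct", "would_recommend_pct"]].mean(axis=1)

print(
    nyc[["hospital_name", "heart_friendly", "patient_friendly"]]
    .sort_values("heart_friendly", ascending=False)
    .head(10)
)
```

In practice he used Fusion Tables rather than code, but the transformation is the same: public measures in, a ranked, local view out.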

“It was just the beginning,” Park said of the 2010 inaugural event.

Over the next couple of years, the Datapalooza earned must-attend status among health care innovators, with the audience increasing more than fivefold and the number of submissions increasing more than tenfold, with 242 companies and nonprofits competing fiercely for 100 showcase spots. The growth was reflected throughout the participant spectrum, with mature and emerging firms, student teams, and even celebrities all presenting prototypes that advanced their own ambitions and the HHS mission alike.

Aetna, a mature firm looking to offer a more personalized experience for customers calling in to its nurse call center, designed an IT cockpit. When a customer called a nurse, a series of applications opened on the nurse’s screen, providing location-specific government data—related to everything from environmental factors to quality measures—in order to guide advice. In this way, a patient discharged from a hospital in Georgia could get tailored assistance from a nurse in Ohio, from the booking of appointments at the best place to seek treatment to the latest evidence from the National Institutes of Health on managing the condition. “This helps the patient use a bunch of public information, but does so through one of the oldest and most effective user interfaces ever designed, which is called, ‘Talking to another human being,’” Park said. “The point is that in the open data revolution, the innovations happening on top of open data are about much, much, much more than the apps. The apps are real, but you also have information in rich human services.”22

The startup Healthagen was intent on reducing unnecessary emergency room visits, which, according to the nonprofit National Quality Forum, waste nearly $40 billion per year. So it added a new dimension to its already-popular iTriage smartphone application. That application, created by two emergency room doctors, had allowed users to input information about their symptoms, read about possible remedies, and learn whether they could seek care for the condition at a lower cost, outside of the hospital. The firm’s new iteration leveraged the smartphone’s Global Positioning System (GPS) capability to offer a local list of health centers subsidized by the government for lower-cost care. It even allowed users to book appointments. Already, iTriage has been downloaded nearly 10 million times in more than 80 countries and, as of this writing, had a 4.5 rating (out of 5) from users on Apple’s app store. “Better yet, there have been testimonials, including ‘this saved my life, because I got help for something I didn’t realize was life threatening,’” Park said. Healthagen grew so fast that Aetna acquired the company in December 2011 and continued investing in product improvements. The application recently integrated a Centers for Disease Control data set to improve the symptom analyzer.
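The location-aware lookup that iTriage added can be illustrated with a small sketch: given a phone’s GPS fix and a list of subsidized health centers, rank the centers by great-circle distance. The facility records and field names here are invented for illustration; the real HHS/HRSA data sets use their own schemas, and Healthagen’s actual implementation is not public.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    # Great-circle distance between two (lat, lon) points, in miles.
    r = 3958.8  # Earth's mean radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Hypothetical records in the style of an open facility data set.
health_centers = [
    {"name": "Eastside Community Clinic", "lat": 39.75, "lon": -104.98},
    {"name": "Downtown Health Center", "lat": 39.74, "lon": -105.00},
    {"name": "Northside Family Clinic", "lat": 39.80, "lon": -104.95},
]

def nearest_centers(user_lat, user_lon, centers, limit=5):
    # Rank subsidized care sites by distance from the phone's GPS fix.
    return sorted(
        centers,
        key=lambda c: haversine_miles(user_lat, user_lon, c["lat"], c["lon"]),
    )[:limit]

for c in nearest_centers(39.7392, -104.9903, health_centers):
    print(c["name"])
```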

Student teams also made contributions, notably a pair of emergency room residents from the Johns Hopkins University. They used data from the Centers for Disease Control’s bio-surveillance program and built Symcat, a more accurate resource for patient self-diagnosis than was available through websites such as WebMD. While WebMD provides high-quality reference medical information, Symcat can—with the assistance of the user providing symptom and family history information—actually estimate the probability of certain conditions. It’s the difference between telling a user what cancer is and telling that user whether there’s a reasonable chance that he or she has it. The application won a $100,000 Robert Wood Johnson prize and catalyzed the formation of a company.
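To give a flavor of what “estimating the probability of certain conditions” involves, here is a toy sketch in the spirit of a naive Bayes calculation: combine each condition’s baseline prevalence with how often the reported symptoms occur in it. The numbers and condition list are invented for illustration; Symcat’s real model, built on CDC survey data, is far richer.

```python
# Invented prevalence and per-condition symptom frequencies (illustrative only).
conditions = {
    "common cold":  {"prevalence": 0.20, "symptoms": {"cough": 0.8, "fever": 0.2}},
    "influenza":    {"prevalence": 0.05, "symptoms": {"cough": 0.9, "fever": 0.9}},
    "strep throat": {"prevalence": 0.02, "symptoms": {"cough": 0.1, "fever": 0.7}},
}

def rank_conditions(reported_symptoms):
    # Score each condition by prevalence times the likelihood of each symptom,
    # with a small floor for symptoms not listed for that condition.
    scores = {}
    for name, info in conditions.items():
        score = info["prevalence"]
        for s in reported_symptoms:
            score *= info["symptoms"].get(s, 0.01)
        scores[name] = score
    total = sum(scores.values())
    # Normalize so the scores read as rough probabilities across the candidates.
    return sorted(
        ((name, score / total) for name, score in scores.items()),
        key=lambda x: x[1],
        reverse=True,
    )

for name, prob in rank_conditions(["cough", "fever"]):
    print(f"{name}: {prob:.0%}")
```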

Somehow, though, it was a larger-than-life celebrity, Jon Bon Jovi, who made this movement seem the most real.23

The activist musician took center stage as one of the first speakers at the 2012 Datapalooza.24 Roughly two minutes into an unusually stiff performance, he tossed aside his script as if it were a busted guitar, preferring to riff from the heart.25

Bon Jovi spoke slowly, softly, and passionately about his JBJ Soul Kitchen restaurant in Red Bank, New Jersey, where diners leave donations of whatever they can afford and, if they can’t afford anything, volunteer their labor instead. He then spoke specifically of one man who worked in the kitchen so late one night that Bon Jovi and his wife suspected, correctly, that the man had no place else to go. After a frustrating attempt to find an available bed online, Bon Jovi came to believe that those with limited resources—and a reliance upon public transit—would have virtually no success hunting down comfortable shelter in suburban New Jersey. Nor would anyone who wanted to use the Internet to help.

Bon Jovi continued his story by recounting the January 2012 event that he had attended as a member of the President’s Council for Community Solutions. I had attended, too, to announce the tech program associated with the Summer Jobs + initiative. At a bathroom break, we—two guys from New Jersey with quite dissimilar backgrounds—met in the hallway. He asked if he could apply open data concepts to the homelessness issue, and that brief discussion, followed by brainstorming, had led to the Project Reach developer challenge. The challenge called upon the public to use open data from the Departments of Veterans Affairs as well as Housing and Urban Development to address veteran homelessness through an OpenTable-style application that provided information about beds, clothing, food, and medical assistance at and around New Jersey shelters. It had produced five formidable finalists, and a punch line to Bon Jovi’s speech.

“The power of ‘we’ allows us the opportunity to truly make a difference,” he told the Datapalooza audience.

Without question, Bon Jovi’s star power had made a difference, too, as was illustrated when he left the dais to a chorus of applause from the standing-room crowd of 1,500. Still, Bon Jovi couldn’t upstage the man who presented more than an hour earlier, wearing glasses and a suit. In an article about the event, a Forbes.com reporter expressed surprise that Todd Park, not the global superstar, received the really raucous response. But it didn’t surprise me. He hadn’t sold millions of records or starred in more than a dozen films. Park had, however, risen through the ranks to succeed me as the nation’s Chief Technology Officer. And he had energized two communities—in the technology and health care spaces—with his infectious energy and irrepressible passion. Rather than wow with wonk speak, he peppered his presentation with colloquialisms like “awesomeness” and closed his address with “Rock On!” In settings such as this, he was invariably the one who left the crowd calling for encores.

“I just felt like I was incredibly lucky to be able to kick things off with this amazing gathering of people,” Park said months later. “There are many evangelists for the movement, and I felt like, more than anything, I was channeling them. That’s what is so exciting about this. It’s a movement with so many leaders, powered by so much innovation across the board, around the country, people who believe the truly great innovation ecosystems are decentralized, self-propelled, and open. There were many, many, many impressive people at Datapalooza, not quite as famous as Bon Jovi, but who were just as enthusiastic. It was fantastic, really, really awesome.”

I mentioned earlier that Healthagen, in its iTriage application, used GPS as a tool to provide open government data (a list of medical providers) in a more manageable, user-friendly format, based on location. But GPS itself was the product of a series of earlier open government initiatives.

The U.S. military had been tinkering with navigation systems as early as the 1940s, with independent aims and moderate success throughout the next few decades.26 In 1973, the Defense Department designated the Air Force to consolidate the various established concepts into a comprehensive system called the Defense Navigation Satellite System (DNSS).27 The first experimental GPS satellite launched in 1978, with more launched by the mid-1980s, and all available only to the military.

Then, in 1983, a Korean commercial airliner, en route from Anchorage, Alaska, to Seoul, mistakenly entered the Soviet Union’s airspace. A Soviet fighter jet shot it down, killing 269 people. To minimize future navigational errors, President Ronald Reagan allowed civilian access to GPS. But that access came with a catch—to protect national security, he imposed a filter that blunted the accuracy, as compared to what was available to the military. President Bill Clinton, an advocate of using GPS for “addressing a broad range of military, civil, commercial, and scientific interests, both national and international” throughout his two terms, took away the restrictions prior to leaving office. On May 1, 2000, he ordered an end to the intentional degrading of GPS accuracy: “The decision to discontinue Selective Availability is the latest measure in an ongoing effort to make GPS more responsive to civil and commercial users worldwide . . . This increase in accuracy will allow new GPS applications to emerge and continue to enhance the lives of people around the world.”

Propelled by the government’s support, more private sector entities began experimenting in this space. Those innovators began offering a variety of commercial applications. Prices for GPS chips fell dramatically, allowing phone carriers to offer navigation as an inexpensive, standard feature in products. And the GPS industry—requiring roughly $1.3 billion a year from the U.S. Treasury for procuring satellites and furthering systems development—has grown into a $65 billion enterprise.28 That includes an array of smartphone apps helping users find anything from an art museum to an aunt’s house.

In the mid-2000s, Dr. David Van Sickle had a more critical cause in mind.29 While working as a respiratory disease detective in the Epidemic Intelligence Service at the Centers for Disease Control and Prevention (CDC) in Atlanta, he didn’t need to dig much to identify a major problem in the health care system. That was as easy as breathing—breathing for him, anyway. “People think about asthma, and think we must have a handle on it in the U.S.,” Van Sickle said. “But the grim reality is that most patients’ asthma in this country is uncontrolled. There’s a higher rate of going to the hospital than there should be. We have been doing the same thing about asthma for years, and we have made basically no dent in hospitalizations. The majority of those people think they are doing fine, so no one treats them with a course correction. And, so, there’s inexcusable morbidity. There’s this really ridiculous gap between what we should be able to do and what we’ve been able to accomplish.”

In his view, this has been largely a product of information gaps on both the public health and clinical sides of the equation. During his time at the CDC, including his work examining asthma outbreaks due to mold exposure in the aftermath of 2005’s Hurricane Katrina, he kept coming across the same obstacles: asthma data that was often years old and long outdated by the time he saw it; data that only accounted for deaths and hospitalizations rather than informative events such as school and work absences. Due to these limitations, research at the public health level was often done by “carpet-bombing a community” rather than targeting specific, smaller areas.

These gaps made it nearly impossible to tackle the issue in any productive, proactive, expedited individual way. “You would never have to ask a credit card company to review data on an annual basis,” he said. “But you have to ask public health or health care to do that? This is vastly behind where other industries are.”

Nor was America an outlier. While at the CDC, Van Sickle read about an acute asthma cluster in Barcelona. “It sent a bunch of people to the hospital and a bunch of people died,” he said. “The investigative team finally asked where people were when they were having symptoms. They mapped that, and finally figured out that the filters hadn’t been installed correctly in the harbor silos, which meant that when people were loading soybeans, it created a potent soybean dust. It was the first time we recognized that as a powerful allergen. But it took them ten years to figure out what was happening.”

America certainly doesn’t have that sort of time for delays in discovery, not with its pressing health care cost crisis: those costs are rising sharply and seemingly without end, with an expectation they will far exceed their already-excessive current chunk of the Gross Domestic Product in the United States. According to the World Bank, the U.S. spent 17.9 percent of its GDP on health care, compared to 11.2 percent for Canada, 9.3 percent for the United Kingdom, and 5.2 percent for China.30 There’s a crying need for innovation aimed at greater efficiency, and a focus on preventative measures that will allow patients to avoid factors that could trigger a condition, and thus further strain the system. There’s a need, above all, to empower doctors and patients.

That’s what Van Sickle set out to do after leaving the CDC, armed with a generous fellowship from the Robert Wood Johnson Foundation to serve as a Health and Society Scholar at the University of Wisconsin-Madison.31 “I had this great mandate to do something, to solve a problem that had always been bugging me,” he said.

And he had this great tool, GPS, to use to improve public health. Early during his time in Wisconsin, Van Sickle decided to attach electronics to an asthma inhaler. The resulting device, called a Spiroscout, created a time and GPS location record of symptoms as the inhaler was used. The onset of those symptoms could be linked to a place—and thus, to the elements of exposure. If the person was using the inhaler more than twice per week, it probably meant an emergency room visit was imminent.
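The screening rule described above is simple enough to sketch directly: count time-stamped inhaler uses per patient per week and flag anyone above two. The event format and sample data below are hypothetical, not Asthmapolis’s actual schema.

```python
from collections import Counter
from datetime import datetime

# Hypothetical GPS- and time-stamped rescue-inhaler events.
events = [
    {"patient": "A", "time": "2012-05-01T08:14", "lat": 43.07, "lon": -89.40},
    {"patient": "A", "time": "2012-05-02T22:03", "lat": 43.07, "lon": -89.38},
    {"patient": "A", "time": "2012-05-04T06:55", "lat": 43.08, "lon": -89.41},
    {"patient": "B", "time": "2012-05-03T12:30", "lat": 43.05, "lon": -89.45},
]

# Count uses per (patient, ISO week).
uses_per_week = Counter()
for e in events:
    week = datetime.fromisoformat(e["time"]).isocalendar()[:2]  # (year, week)
    uses_per_week[(e["patient"], week)] += 1

# Flag anyone exceeding two uses in a week, the warning sign described above.
for (patient, week), count in uses_per_week.items():
    if count > 2:
        print(f"Patient {patient}, week {week}: {count} uses -- follow up recommended")
```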

Van Sickle initially built a small batch of those devices, “just to show I wasn’t completely crazy.” He benefited from participants’ understanding that, by sharing information, they might help others avoid symptoms. Still, he attempted to address privacy concerns. “It was done sensibly and protected,” he insisted.

Over time, the devices became more advanced and smaller, with better battery life.32 He also changed his vantage point, choosing to come at the problem from the private sector—from “industrial size, not professional size, without everything that is in the way on the academic side.” He started a company, Asthmapolis, to improve asthma management and public health surveillance, striving to lower costs associated with asthma while providing a novel data stream for health improvement. By 2013, his device had earned FDA approval, his hypothesis that information could lower asthma attacks had been validated in testing in North Carolina and Kentucky, and his business had attracted $5 million in venture capital to tackle a market of more than 20 million asthma patients in the United States alone.

Patients with uncontrolled asthma spend thousands more per year than those with controlled asthma. As more health systems enter into population health contracts with insurance companies, taking responsibility for improved outcomes, there is an emerging market incentive to adopt a program such as Van Sickle’s and integrate it into a physician’s regular practice.

“The doctor can take the data from a daily list for the patient, make it meaningful, and get it back to the patient,” Van Sickle said. “Such as, ‘You should not be having symptoms every night. Here’s what is going on with you.’ It’s personalized guidance, personalized education, captured from daily life and put to use.”

At the White House, we saw the importance of Van Sickle’s work. So we invited him to a June 10, 2011, event, and honored him as one of our Champions of Change.33 These weekly gatherings were designed to bring attention to innovators, educators, and builders who, in our view, were “Winning the Future Across America,” starting in their communities. Through this initiative, we came across people from a wide range of backgrounds, but who all had one thing in common: they successfully and creatively moved a cause forward, improving their communities and, by extension, the country. On June 10, our list of champions was narrowed to those who did so through the use of open data.

One of those we honored, Bay Area real estate broker Leigh Budlong—whose Zonability app allowed prospective commercial tenants in San Francisco to understand zoning limits in their area—captured the spirit of the day in an online post: “Whenever I hear people are bummed out by government, I try to tell them about this very cool and seemingly quiet movement underway . . . data is awesome and figuring out how to make it useful to a target audience is the reward.”34

Another champion was a part-time chicken farmer named Waldo Jaquith. As a sidelight to his duties as a webmaster at the University of Virginia’s Miller Center, Jaquith had launched Richmond Sunlight, a volunteer-run site that kept close tabs on the activities of the Virginia legislature, including manually uploading hundreds of hours of video of floor speeches, tagging relevant information on bills and committee votes, and inviting the public to comment on legislation. Jaquith had also earned a Knight Foundation fellowship to convert state government codes across the country into online machine-readable formats; shortly after the Champions ceremony, we would hire him to design Ethics.gov.

Then there was the champion trio of Bob Burbach, Dave Augustine, and Andrew Carpenter.35 While working together at a San Francisco education nonprofit called WestEd, they had stumbled upon an Apps for America 2 contest sponsored by the nonprofit organization Sunlight Labs, requiring the use of a data set from Data.gov. “We wanted to show government that cool things can happen when they make data available,” Burbach said.

During the three-week mad dash to finish, they chose the largest possible data set they could find: all of the content that made up the Federal Register. One of the government’s time-honored transparency vehicles, created in 1935 by Franklin D. Roosevelt’s signing of the Federal Register Act, the compendium had been designed “to bring order to the core documents of the Executive Branch and make them broadly available to the American public.”36 First published in March 1936, the “government’s daily newspaper” included any government action that, by law, had to be disclosed to the public—everything from Presidential executive orders and proclamations to agency rules and regulations to meeting announcements. The publication was intended to encourage the public’s participation in actions that had been proposed, and to fully inform it when such actions became final.

In the decades following the Federal Register’s introduction in 1936, the federal government would expand exponentially in terms of the number of agencies, personnel, and programs—Social Security, Medicare, and Medicaid were among the notable new entitlements; and the Presidential Cabinet doubled in size from Roosevelt to Obama, partly due to the Departments of Health and Education. In light of this expansion, transparency became even more critical to the populace, to make sure everything was running as it should. But it became even tougher for people to turn to the Federal Register for that assurance, since the compendium’s size and complexity grew in accordance with that of the government. It became, over the next seven decades, a rich, enormous (more than 81,000 pages in 2010), and nearly indecipherable roundup, written by regulators for consumption only by the lawyers of the regulated.

The corresponding website, initially developed in 1994, turned out to be even less accessible, due to its aesthetics. It was little more than a PDF version of the complex printed edition, and that PDF had serious problems. It was virtually unreadable, in part because agencies had to pay by the page to get content included, so they went to comical lengths to shrink that content, often forgoing paragraph breaks. That site was the only means of access for people who weren’t inclined to pay nearly $1,000 for an annual subscription to the hard copy, or didn’t have the time to travel to a law library and sift through hundreds of pages, simply to find the one thing that impacted them. These challenges in accessing useful information, whether in print or online, contributed to well-heeled lobbyists knowing infinitely more about what was happening in Washington than the general public ever could. Certainly, the lobbyists didn’t mind, since they could command thousands-per-hour fees for the dispersal of that knowledge to billion-dollar industries. So much for helping Regular Joe from Idaho.

When Burbach and his team first encountered what he called “the tons and tons of data” that made up the Federal Register, that data struck them as extremely meaningful: information that people could use not only to know more about how government was running, but also so they could better run their businesses without running roughshod over regulations.

But it was impossible for the technology-savvy trio, let alone the average citizen, to understand much of it. Burbach, Carpenter, and Augustine couldn’t change that complex regulatory content—and, as neophytes to government, they wouldn’t have known how. But that inexperience, in some ways, was actually an advantage. They came to the task without preconceptions of how the data should be presented, but they brought their expertise—drawn from their knowledge of the consumer Internet and their mastery of web development—which allowed them to envision presenting the material in radically different and simpler ways than were previously considered.

Their early prototypes were imperfect, as Burbach acknowledges. But because they weren’t working inside an agency or beholden to immediate presentation to the public, they could continue tinkering, driven only by the desire to give people a way to search for the government actions most relevant to them. At the Sunlight Foundation contest at the Gov 2.0 conference in fall 2009, their web application GovPulse won second place. They kept working on the application in their spare time, and won first place in a related competition sponsored by the world-renowned Consumer Electronics Show in Las Vegas. They returned to San Francisco, but unbeknownst to them, their work was intriguing government officials on the other coast. The Archivist of the United States wanted to incorporate their innovations directly on his domain in time for a seventy-fifth anniversary event. So they formed a company called Critical Juncture, and set out to meet the aggressive 90-day timeline to reconceive and relaunch the FederalRegister.gov site.

In their efforts to democratize access, Burbach, Augustine, and Carpenter presented the material in a format with which the public was extremely familiar: one that resembled a newspaper site, divided into topical headings and sections. It also allowed users to set alerts, so they would be notified when a new government action applied to them or their particular type of small business. On July 15, 2010, the new site launched, and 11 days later, to commemorate the 75th anniversary of the act behind the Federal Register, the trio was honored at the National Archives. Later, the team and its government partners tore down more of the wall between the government and its citizenry. They did this by reinterpreting the existing legal requirement that any regulation that would appear in the Federal Register must be accompanied by a physical copy no less than one day in advance for public inspection. That provision had historically allowed those “in the know” to get a jump on important regulatory matters in advance of the general public, simply by walking into a reading room. But, why not post an electronic copy of the public inspection document for everyone to read? With that in mind, Critical Juncture built an online reading room mirroring the access rights afforded to insiders.

And true to the spirit that moved them to tinker with the documents in the first place, the team set out to publish an application programming interface to enable the next group of innovators to build on top of their work, free of intellectual property constraint or cost. To come full circle, the Sunlight Foundation, home to the initial contest that sparked their involvement, would go on to build a new tool called Scout. This innovation not only improves search capability, it expands it beyond the Federal Register to include any relevant Congressional documents.
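For a sense of what building on that interface looks like, the sketch below queries FederalRegister.gov’s public documents endpoint for recent items matching a search term. The endpoint, parameter names, and response fields reflect the API as publicly documented, but treat them as assumptions to check against the current documentation before relying on them.

```python
import requests

# Query the Federal Register's public API for recent documents on a topic.
resp = requests.get(
    "https://www.federalregister.gov/api/v1/documents.json",
    params={
        "conditions[term]": "small business",  # free-text search term
        "per_page": 5,
        "order": "newest",
    },
    timeout=30,
)
resp.raise_for_status()

# Print the publication date, document type, and title of each match.
for doc in resp.json().get("results", []):
    print(doc.get("publication_date"), doc.get("type"), "-", doc.get("title"))
```

Scout, in turn, layers alerting and broader Congressional search on top of the same kind of open interface.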

“It’s cascaded from an open data side project, meant to be only a couple of weeks, into something that is affecting the regulatory sphere,” Burbach said. “We sort of fell into this, but it’s a good example of how by using open data, you can effect change.”

Open data continues to demonstrate that it can effect change outside American boundaries as well.

In September 2010, President Obama presented global leaders at the United Nations with a challenge: “When we gather back here next year, we should bring specific commitments to promote transparency; to fight corruption; to energize civic engagement; to leverage new technologies so that we strengthen the foundations of freedom in our own countries, while living up to the ideals that can light the world.”37

His first international stop to further this endeavor was far away, but closest to my heart. The U.S. President and India’s Prime Minister, Manmohan Singh, officially announced the launch of the Dialogue on Open Government aimed at delivering tangible benefits in both countries through the joint development of an open data web portal, with a goal to expand its reach to countries around the world. India was an appropriate ally. It had already built a robust technology industry. And, in 2005, it had enacted the Right to Information Act, setting goals related to more accountable and effective government.

On November 7, 2010, at St. Xavier’s College in Mumbai, President Obama toured the first Expo on Democracy and Open Government. The expo featured 10 technology applications utilizing open data to empower India’s citizenry, including an innovative text message service that informed voters, within the final two weeks before an election, whether any candidate in their jurisdiction had a criminal record.

The highlight of the tour, however, was the latest innovation championed by Sam Pitroda, who would serve as my counterpart in the Indian government as Advisor to the Prime Minister on Innovation. Together, we would cochair the U.S.-India Open Government Dialogue. Pitroda, as noted earlier, had repeatedly left behind lucrative opportunities in America to pursue his grander mission of empowering the Indian poor, especially the rural poor. First, he set out to do so through the telephone. Then, as chair of India’s National Knowledge Commission, he set out to empower them through the expansion of fiber broadband, with an emphasis on providing greater access to government data.

Now he had the chance to show President Obama how all of that infrastructure investment could impact India. Through video conferencing, he connected the President from the bustling urban campus in Mumbai to a modest local government building in Kanpura, the first village in the country to be connected through optical fiber to rural broadband service. And not just any broadband service—rather, service at speeds not available in most parts of America. After the residents held festivals and dressed colorfully in anticipation of the event, some, including local politicians and a sufficient number of English speakers, got to participate in the conversation. Such visual communication, from a major city to a remote village in a country as geographically and economically diverse as India, was an impressive feat in information technology.

Even more impressive than what the President saw, however, was what he heard. People enthusiastically shared stories about how the broadband connectivity had relieved some pressures, gaps, and difficulties in their lives. A student spoke of how, in using it for graduate education, he could stay and care for his mother rather than trekking two towns over for classes. A nurse related how the digital access to health information allowed her to target people in need of immunizations.

Then there was the farmer who, in speaking about seeds and tools, provided the true takeaway from this endeavor. His story illustrated how this access to government information—a result of open government principles powered by technological innovations—could fundamentally improve the way a society operates. In order to borrow money for a farm in India, as anywhere, a bank requires proof of land ownership. That requires traveling, by one means or another, to a distant city center and hoping that the government official turns over the necessary information. And for generations, that has often meant returning empty-handed, and then resorting to borrowing from a local lender at egregious interest rates. Yet, under Pitroda’s vision of a country connected by fiber optic broadband, the villager could get what he needed without the middleman. Within weeks of his village’s connection, the farmer was able to access his own data in a safe, secure manner, printing out his proof and executing a bank loan.

“These are the principles and benefits of e-governance,” Pitroda said.

These were principles and benefits that the President clearly comprehended, as was evident in his comments to the villagers: “One of the incredible benefits of the technology we’re seeing right here is that, in many ways, India may be in a position to leapfrog some of the intermediate stages of government service delivery, avoiding some of the 20th century mechanisms for delivering services and going straight to the 21st.”38

Pitroda was ecstatic about the global attention the interaction received and hopeful that the spotlight would inspire the people of India to continue their efforts. More than anything, he was amazed at the President’s grasp of what Indian leaders were trying to accomplish, especially as Pitroda spoke of the grand vision: to connect 250,000 rural village centers to government information in the same manner. “Most political leaders, in a short period like that, will not get it,” Pitroda said. “He immediately got it; he understood that we are trying to democratize information to empower people, and it is going to result in a better democracy. That’s how we are going to be different from China or anybody else.”

In Pitroda’s view, the fourth phase of empowerment—after telecom, knowledge, and broadband connectivity—is innovation. In that spirit, White House CIO Vivek Kundra and I returned to India in March 2011 to formalize a simple but transformative first step, not only toward implementing the ideals of the U.S.-India Open Government partnership but also toward extending the effects of that collaboration beyond our two countries. We would do this by open sourcing the Data.gov platform, making it freely available all over the world, and we would achieve that through a joint software development team of a dozen developers drawn equally from India and the United States. The developers would work in a modern manner, using Skype in lieu of face-to-face meetings and GitHub, the leading online code repository, to coauthor and test the software. That work resulted in a beta release about a year later, a free resource available to every government in the world.

Pitroda’s preference was to introduce the platform where the need was greatest, in an underdeveloped nation in Africa. He had met the President of Rwanda while giving a speech on higher education at the State Department and through continued joint efforts between India and the United States. Work to pilot the service there is under way.

As we continued to work together directly, independent efforts involving other countries were ongoing and expanding, much of that activity a result of American leadership—dating back to the President’s September 2010 challenge on the sidelines of the United Nations. By the one-year mark, nearly 40 countries had made explicit commitments to join the newly announced Open Government Partnership, and seven countries had published their specific action plans, made in consultation with their citizens. As of November 2012, 58 countries had made concrete commitments, developed in consultation with their people and with support from a growing NGO community investing in the expansion of capacity to achieve the bold goals.

There’s clearly been a lot of action in open data, over just the past few years. If the journey was judged by the increase in open data sets publicly available, the graph would be shaped like a hockey stick, with the blade pointing left.

So, how do we score its success?

One could score it by the increasing number of data sets that have been made available. On its first day of operations in May 2009, the Data.gov platform hosted 47 data sets. By November 2012, it would host nearly 400,000. One could score it by usage figures—for instance, in the first 24 hours after HHS posted raw hospital list price data, it was downloaded 100,000 times.39 One could try to count up the number of localities to which the movement has scaled, with the federal effort inspiring numerous state and local versions, in major metropolises such as San Francisco, California, and even tiny towns such as Manor, Texas.40

But, in actuality, the power of the movement is even harder to quantify, because what is most encouraging is the increasing share of America’s brainpower focused on solving our collective problems, with input from people who never intended to work on a government project.

Open innovation allows them to contribute, even if it is merely a means to another personal end. Mike Krieger, a Brazilian immigrant, was working for a startup and considering others when he began tinkering with open government data as a weekend distraction. He used San Francisco’s crime information to create the iPhone app CrimeDesk, steering residents away from the most dangerous places to park, walk, or bike. While the public received that benefit, Krieger was getting the valuable technical experience he sought, and which would come in handy when he reconnected with former Stanford classmate Kevin Systrom on another potential project: “I wasn’t starting from zero. I had already built an app.” Together they would build the photo-sharing behemoth Instagram, which quickly attracted roughly 100 million active monthly users and Facebook’s attention, with the latter acquiring it for what was heralded as $1 billion, though it turned out to be closer to $715 million.41
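
To make the mechanics concrete, here is a minimal sketch, in Python, of the kind of processing an app in the spirit of CrimeDesk might perform on a city’s published crime file: load the open data set, then flag incidents near a user’s location. The file name, column names, and quarter-mile radius are illustrative assumptions, not details of Krieger’s actual application or of San Francisco’s data schema.

```python
import csv
import math

# Hypothetical column names; a real open crime data set published by a city
# would need its own field mapping.
LAT_FIELD, LON_FIELD, CATEGORY_FIELD = "latitude", "longitude", "category"

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 3959 * 2 * math.asin(math.sqrt(a))

def incidents_near(csv_path, user_lat, user_lon, radius_miles=0.25):
    """Return the categories of incidents within radius_miles of the user."""
    nearby = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                lat, lon = float(row[LAT_FIELD]), float(row[LON_FIELD])
            except (KeyError, ValueError):
                continue  # skip rows with missing or malformed coordinates
            if haversine_miles(user_lat, user_lon, lat, lon) <= radius_miles:
                nearby.append(row.get(CATEGORY_FIELD, "unknown"))
    return nearby

# Example: count incident categories within a quarter mile of a parking spot.
if __name__ == "__main__":
    counts = {}
    for category in incidents_near("crime_incidents.csv", 37.7749, -122.4194):
        counts[category] = counts.get(category, 0) + 1
    print(counts)
```

A production app would layer on date filtering, caching, and a map view, but the core value comes from the public data itself.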

As America attempts to get and stay ahead in a variety of industries, it is benefiting from the full force of this data liberation movement in all sorts of expected and unexpected ways. Data, deployed through the latest technology, is one of what Sam Pitroda calls “the new tools of today to solve the problems of tomorrow.” These are multifunctional tools, a Swiss Army knife of sorts, with functionality in an assortment of sectors, scenarios, and situations. It is becoming a virtuous cycle: as hundreds of data sets are made available, more challenges are conceived, more online communities and entrepreneurial companies create tools for consumers, and more Datapaloozas, beyond the original that spotlighted health, sprout up to showcase the innovations and inspire others to innovate.

The Energy Data Initiative and its related challenges have targeted all areas of the energy spectrum, from fuel economy to environmental protection to consumption awareness. The Safety Data Initiative and its related challenges have focused on everything from emergency response to consumer product recalls to worker safety to drunk driving education to the performance of the body armor worn by law enforcement. The Education Data Initiative is aimed at students from preschool through college, and enables developers to empower those students, giving them better access to test scores, class grades, even federal student loan information.

On May 9, 2013, President Obama kicked off a Middle Class Jobs & Opportunity Tour emphasizing the need for middle-class job creation, the need for Americans to develop the skills to fill those jobs, and the need for American employers to provide hard workers with a fair opportunity and a decent wage. To accompany this speech, he signed an executive order for an Open Data Policy called Managing Information as an Asset.42 It required that newly released government data be made freely available in open machine-readable formats while appropriately safeguarding privacy, confidentiality, and security. By tying these elements together—open government and jobs—he made the open data movement about more than the original purpose of transparency. It was, and is, about that transparency. But it is also about economics. It is about unleashing technology entrepreneurs to create products and services that consumers need and use, so the resulting economic activity can foster job creation.
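
As a sketch of what “open machine-readable formats” look like in practice, the snippet below reads an agency data catalog published as JSON and lists the CSV files it points to. The field names (dataset, distribution, downloadURL, mediaType) follow the conventions popularized by the policy’s supporting guidance, but the catalog address here is a placeholder and any given agency’s catalog may differ.

```python
import json
from urllib.request import urlopen

# Illustrative catalog URL; the convention is that an agency exposes a
# machine-readable catalog (commonly /data.json), but this exact address
# and the field names used below are assumptions, not guarantees.
CATALOG_URL = "https://www.example.gov/data.json"

def list_csv_datasets(catalog_url):
    """Yield (title, download_url) pairs for CSV distributions in a catalog."""
    with urlopen(catalog_url) as resp:
        catalog = json.load(resp)
    for dataset in catalog.get("dataset", []):
        for dist in dataset.get("distribution", []):
            if dist.get("mediaType") == "text/csv" and dist.get("downloadURL"):
                yield dataset.get("title", "untitled"), dist["downloadURL"]

if __name__ == "__main__":
    for title, url in list_csv_datasets(CATALOG_URL):
        print(f"{title}: {url}")
```

Because the catalog is structured data rather than a web page, an entrepreneur can discover new government data sets and wire them into a product automatically.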

At this stage, if anything’s holding the movement back, it’s a lack of awareness.

That’s why the President’s continuing emphasis is so significant. So are the efforts throughout the administration to publicize activities in this area for those with expertise in everything from public safety to health care to education to energy to global affairs. As Todd Park noted, “If you are in these spaces and do not know this stuff is available, then it’s like being in the navigation business and not knowing that GPS exists. There are big, game-changing data resources that are being made available through government action that every entrepreneur is going to want to know about, and some already do.”

The mission?

That everybody will. And then, as Todd Park puts it, “entrepreneurs can turn open data into awesomeness.”