Until recently, government data made its way to the Internet primarily through central planning: civil servants gathered the raw data generated by their work, processed and analyzed it to make maps, reports, and other informative products, and offered these to citizens seeking insight into school performance, crime in their neighborhoods, or the status of proposed laws. But a new, more dynamic approach is now emerging—one that enlists private actors as allies in making government information available and useful online.
A portion of this chapter was previously published as “Government Data and the Invisible Hand,” Yale Journal of Law & Technology, Vol. 11, 2009.
When the Web was born, computational and network resources were so expensive that building large-scale websites required substantial institutional investment. These inherent limits made government the only free provider of much online civic information, and kept significant troves of data off the Web entirely, trapped in high-end proprietary information services or dusty file cabinets. Government officials picked out what they thought to be the most critical and useful information, and did their best to present it usefully.
Costs for storage and processing have plummeted, but another shift, less well known, is at least as important: the tools that let people develop new websites are easier to use, and more powerful and flexible, than ever before. Most citizens have never heard of the new high-level computer languages and coding “frameworks” that automate the key technical tasks involved in developing a new website. Most don’t realize that resources such as bandwidth and storage can be bought for pennies at a time, at scales ranging from tiny to massive, with no upfront investment. And most citizens will never need to learn about these things—but we will all, from the most computer-savvy to the least tech-literate, reap the benefits of these developments in the civic sphere. By reducing the amount of knowledge, skill, and time it takes to build a new civic tool, these changes have put institutional-scale online projects within the reach of individual hobbyists—and of any voluntary organization or business that empowers such people within its ranks.
These changes justify a new baseline assumption about the public response to government data: when government puts data online, someone, somewhere, will do something innovative and valuable with it.
Private actors of all different stripes—businesses and nonprofit organizations, activists and scholars, and even individual volunteers—have begun to use new technologies on their own initiative to reinvent civic participation. Joshua Tauberer, a graduate student in linguistics, is an illustrative example. In 2004, he began to offer GovTrack.us, a website that mines the Library of Congress’s (LOC) THOMAS system to offer a more flexible tool for viewing and analyzing information about bills in Congress (see Chapter 18). At that time, THOMAS was a traditional website, so Tauberer had to write code to decipher the THOMAS web pages and extract the information for his database. He not only used this database to power his own site, but also shared it with other developers, who built popular civic sites such as OpenCongress and MAPLight (see Chapter 20), relying on his data. Whenever the appearance or formatting of THOMAS’s pages changed, Tauberer had to rework his code. Like reconstructing a table of figures by measuring the bars on a graph, this work was feasible, but extremely tedious and, ultimately, needless. In recent years, with encouragement from Tauberer and other enthusiasts, THOMAS has begun to offer computer-readable versions of much of its data, and this has made tools such as GovTrack easier to build and maintain than ever before.
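The difference between deciphering web pages and reading computer-readable data can be made concrete with a minimal sketch. Everything in it is hypothetical: the two HTML fragments, the JSON feed, and the function names `scrape_bills` and `load_bills` are invented for illustration and do not reflect the actual markup of THOMAS or the code behind GovTrack. The sketch only shows why a scraper tied to a page's layout stops working when that layout changes, while a reader of structured data does not.

```python
import json
import re

# Hypothetical bill-status row, as a site might render it for human readers.
HTML_BEFORE = '<tr><td class="bill">H.R. 1</td><td class="status">Passed House</td></tr>'

# The same information after a hypothetical redesign of the page.
HTML_AFTER = '<div class="bill-row"><span>H.R. 1</span> <em>Passed House</em></div>'

# The same information published as structured data (the format is an assumption).
JSON_FEED = '[{"bill": "H.R. 1", "status": "Passed House"}]'

# The scraper depends on the exact markup of the original page layout.
ROW_PATTERN = re.compile(
    r'<td class="bill">(.*?)</td><td class="status">(.*?)</td>'
)


def scrape_bills(html):
    """Extract bill records by pattern-matching the page's markup."""
    return [
        {"bill": bill, "status": status}
        for bill, status in ROW_PATTERN.findall(html)
    ]


def load_bills(feed):
    """Read bill records from a structured feed; no layout knowledge needed."""
    return json.loads(feed)


if __name__ == "__main__":
    print(scrape_bills(HTML_BEFORE))  # finds the record in the old layout
    print(scrape_bills(HTML_AFTER))   # finds nothing after the redesign
    print(load_bills(JSON_FEED))      # unaffected by how pages look
```

In this sketch the redesign silently breaks the scraper, which would then have to be rewritten against the new markup, while the structured feed keeps working because it carries the data rather than its presentation.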