The Obama administration’s most radical idea may also be its geekiest: Make nearly every hidden government spreadsheet and buried statistic available online, all in one place. For anyone to see. Are you searching for a Food and Drug Administration report that used to be obtainable only through the Freedom of Information Act? Just a mouse click away. Need National Institutes of Health studies and school testing scores? Click. Census data, nonclassified Defense Department specs, obscure Securities and Exchange Commission files, prison statistics? Click click. Click. Click.
The man in charge is the US government’s first-ever chief information officer, Vivek Kundra. Previously CTO of the District of Columbia, Kundra, 34, knows that the move from airtight opacity to radical transparency won’t be a cakewalk. Until now, the US government’s default position has been: If you can’t keep data secret, at least hide it on one of 24,000 federal Web sites, preferably in an incompatible or obsolete format.
The goal of Kundra’s new Web site, Data.gov, is to create a place where all the information is easy to find, sort, download, and manipulate. He wants to put as much data out there as possible, then sit back and let the private sector come up with great ways to use it. He envisions a future in which well-designed spreadsheets, charts, and graphs are embedded in applications for phones, Facebook, and blogs. In DC, someone combined several of the data sets released by local government–maps, liquor license info, crime statistics–into an app called Stumble Safely, which shows users the safest way to walk home when drunk. Kundra doesn’t know what people will build with all the federal data, but he’s confident it will be cool.
Since Barack Obama took office, Wired has been running its own public wiki, on which scores of people have posted suggestions for how Kundra should proceed–which data sets to open first, what mashups might yield interesting results, and what existing Web sites to use as models. The response suggests a real appetite for what Kundra is proposing, so we paid a visit to the White House just prior to Data.gov’s launch to see how his plans are developing.
Wired: Where do you start?
Vivek Kundra: One, we’re going to look at which feeds are most popular and which the public is demanding. Two, we want to advance the president’s agenda around health care, around energy, around education.
Wired: But won’t people say you’re releasing one feed because it makes Obama look good but not another that includes something embarrassing to the administration?
Kundra: Well, look at health care. As the president said, it’s one of the most urgent problems affecting our economic future. So it makes sense to get the most innovation in that space.
Wired: Give me an example.
Kundra: There’s a lot of data out there–from the National Institutes of Health, the CDC, the FDA–concerning outbreaks and pandemics. And there’s lots of Census Bureau data right now. For the first time, the bureau is going to be noting GPS coordinates for addresses across the country. There are privacy issues, obviously. But if you release that data at a national level, all of a sudden you’ve got a new layer of information that has never existed before. Imagine if you could build an iPhone app that combined the GPS info with addresses and then combined that with data about outbreaks.
Wired: You’d know precisely where outbreaks were occurring? Sort of like Google Flu Trends except better, because instead of search data you’re using real medical data?
Kundra: Exactly. And the government doesn’t even have to create the applications.
Wired: What do you mean? You’ll release the data and just hope people do interesting things with it?
Kundra: Yes. Think about the Department of Defense. When satellite data was made available, you had this explosion in the private GPS market. Now GPS is available on your iPhone, so if you’re lost you can navigate. The car rental industry uses it. Google and Facebook use it to help you get real-time information on where friends are and where the closest restaurant is. The key is recognizing that we don’t have a monopoly on good ideas and that the federal government doesn’t have infinite resources. We’re even thinking about running competitions for people making applications. What Wired was able to do with that Data.gov wiki, frankly, would have cost the government a fortune and taken much longer.
Wired: Given how complicated this effort will be, are there some simple rules you’re going to follow?
Kundra: The core principles are using open standards, presenting raw data, and distributing it in as many formats as possible. Public policy decisions are made using the data anyway, but the raw data is important because if it is massaged too much, you can lose the big issues.
Wired: Sometimes more data confuses rather than clarifies, especially if it’s raw or presented in some clumsy spreadsheet, which is typically how government data has been released in the past, if at all.
Kundra: But we now have the ability to use data in ways we couldn’t before, and to do it in a machine-readable way where we can not only spot trends and make intelligent decisions but make applications that create value and economic opportunities. The perfect example at a local level is in DC, where you can download an application that lets you know–based on where you’re standing–what the closest Metro station is, when the next train is coming, and, if you like Mexican food, where the closest Mexican restaurant is. That’s built on one subset of data feeds, and there are hundreds of others.