
Developers Are Already Submitting Patches to Obama's New Open Data Policy

BY Nick Judd | Thursday, May 9 2013

Photo: Tom Lohdan / Flickr

The White House on Thursday morning released an executive order from President Barack Obama that mandates any data in information systems created by government agencies going forward be available for anyone to access, download, and use.

Administration officials say this will present opportunities for entrepreneurs across the country.

"We sit on a treasure trove of data in government," White House Chief Information Officer Steven VanRoekel said today during a conference call with reporters. "Today most of that data is locked up in paper, proprietary systems and other things. As part of this motion, government agencies are going to, as they create new or modernize their existing systems, by default they will be required to make their data in those systems open and machine readable."

In recent years, VanRoekel said, hundreds of companies have launched with a focus on government data, creating thousands of jobs.

The new directive will also be a boon for transparency, VanRoekel said.

"This, we feel, will create opportunities around transparency and efficiency inside the walls of government as well as fuel economic opportunity on the outside," he told reporters.

The order declares that the "default state" of government information resources published or modernized going forward "shall be open and machine readable," with exceptions for privacy, confidentiality, and national security. Accompanying the order is a memo from the Office of Management and Budget that requires each agency to maintain an "enterprise data inventory," a list of all its databases, and, based on that inventory, to release a list of datasets available to the public.

The policy is posted to GitHub, a hosting service for open-source projects, and people who work on government technology are already proposing changes through GitHub's standard mechanism, pull requests.

Citing a new hospital billing data release, VanRoekel said that these datasets will create new opportunities for companies who can turn data into valuable market intelligence for consumers. Home buyers, for instance, could benefit from listing agencies who augment their existing information with data on neighborhood crime statistics, broadband accessibility, average energy consumption and cost, and other federal data, the White House CIO told reporters.

Federal officials were not immediately available to clarify VanRoekel's statement about the "hundreds of companies" or "thousands of jobs" created through open government data, but promised a response. (I'll update this post when it arrives.) Officials hope government data will prove useful across many industries, and the White House is naming no specific target. The memo specifies that agencies "adopt a presumption in favor of openness to the extent permitted by law and subject to privacy, confidentiality, security, or other valid restrictions."

To assist agencies in meeting the demands of this new initiative, OMB and the Office of Science and Technology Policy have launched "Project Open Data," described in the OMB memo as an "online repository" of information and schema to help agencies get with the program. Already live, it includes several tools, such as software that creates a programming interface for data stored in common databases or in CSV format, a common file type for storing data.
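To make the idea of a programming interface over CSV data concrete, here is a minimal sketch in Python. It is not the Project Open Data software, and it does not follow any schema from the OMB memo; the function names, the JSON response shape, and the sample rows are all hypothetical, invented for illustration. It only shows the general pattern such a tool might follow: parse the file, filter rows by column values, and return JSON.

```python
import csv
import io
import json


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def query_rows(rows, filters=None, limit=None):
    """Return rows whose columns match every key/value pair in filters."""
    filters = filters or {}
    matched = [
        row for row in rows
        if all(row.get(col) == val for col, val in filters.items())
    ]
    return matched if limit is None else matched[:limit]


def to_json_response(rows):
    """Serialize rows the way a simple JSON endpoint might (assumed shape)."""
    return json.dumps({"count": len(rows), "results": rows}, sort_keys=True)


# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = """hospital,state,procedure,average_charge
General Hospital,NY,Hip replacement,52000
County Medical,NY,Hip replacement,38000
Lakeside Clinic,OH,Hip replacement,29000
"""

if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    # Roughly what a request like ?state=NY might return.
    print(to_json_response(query_rows(rows, {"state": "NY"})))
```

A real tool would sit behind a web server and handle types, paging, and large files; this sketch keeps everything in memory and treats every value as a string.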

Agencies have six months to revise their policies and create a public data listing of all available datasets. Requirements and specifications for the collection and storage of new data apply only to datasets collected going forward.

Open government and transparency activists are still discussing the memorandum's utility for their own work.

In a blog post, Josh Tauberer, who built GovTrack, for now the best place to go for updated data on the doings of Congress, says he's concerned that the White House is mandating the use of "open licenses." An open license is different from a "public domain" dedication, he says: a license presumes there are rights to license, and that means it can be used to restrict access or use.

"A public domain dedication differs from an open license in that it disclaims copyright and other protections, whereas, again, an open license implies that such a limitation on use is already present," he writes. "The CC0 statement [a pre-built statement about intellectual property rights composed by Creative Commons] was successfully used by the Council of the District of Columbia to disclaim copyright over data files containing the DC Code."

What's more, he adds, citing New York Times developer Derek Willis, the government obliges agencies to consider the "mosaic effect" of their data: the ability of datasets, when cobbled together, to reveal personally identifiable information or potentially compromise "security." In other words, it's a potentially overbroad exemption that might allow government officials to mistake their own "job security" for "national security," and in so doing block access to valuable insight about what government is doing.

Although Obama made catching up on a backlog of Freedom of Information Act requests a priority in his 2009 Open Government Directive, watchdogs still peg federal responsiveness to FOIA requests at 55 to 60 percent. So it's unclear if this directive really does break any ice on transparency.

Officials were not immediately available to respond to a follow-up request for comment.

The Sunlight Foundation's* John Wonderlich is thrilled that the White House has adopted a "default to open," but notes some cautions:

To be sure, getting agencies to publicly list all their data that can be open will be a significant challenge, even with a high-profile Executive Order. Concerns like cost, privacy, and security will be used to justify non-disclosure (as they often are), and will be used to try to justify keeping even a description of many datasets private. That's a good struggle to have, though, and one we're looking forward to. Without this Executive Order, too many agencies are managing data holdings that they haven't comprehensively reviewed, without public oversight, while advocates, journalists, and policymakers have an unclear view of what agencies know, and what they could be releasing.

"Agency-wide comprehensive audits of datasets," Wonderlich added in a follow-up email, "is a big and aggressive move."

After all, one of the big complaints from transparency activists is that they can't ask for data until they know what they can get. Soon, that is supposed to change.

* TechPresident publisher Andrew Rasiej and editorial director Micah Sifry are senior advisers to the Sunlight Foundation.

This post has been updated.
