Entering a New Era of Open Data in the U.K.?

BY Rebecca Chao | Tuesday, September 17 2013

Not your average data catalog (dfulmer/flickr)

The U.K. government last week began releasing its inventory of hitherto "unpublished" data on its data site, while also allowing users to comment on the quality and content of the data. Is the U.K. onto something new, or is it more of the same old?

Globally, the U.K. is one of a few countries that have begun this indexing process. The U.S. announced it will be releasing a catalog soon and Canada already has one. These catalogs are part of what the Sunlight Foundation has called a "new evolution of open data."

The release of the data catalog began, albeit slowly, on Monday, Sept. 2, at precisely 5:00 p.m. GMT. A blog post written by Matt Llyod on the U.K. data site explained, "we want you to tell us what we should be releasing and why."

The government says the purpose of this exercise is to help gauge what data is important to its citizens and help it identify "core departmental data." There is a search function for the catalogue that allows users to filter by published or unpublished items and filter by publisher. Each unpublished dataset provides information on “the publisher, a description of its content, and notes about its possible release.”

After reviewing user comments, the government will make a determination on what constitutes "core departmental data" and plans to announce its findings at the Open Government Summit in late October.

The Sunlight Foundation has been calling for some time for a comprehensive list of all data held by government agencies. They write in their Open Data Guidelines that these catalogs are important first and foremost because governments themselves need to find out what information they have. An inventory of data “empowers policymakers and administrators to determine whether information is being appropriately managed, and empowers the public oversight of those determinations.”

Laurenellen McCann, the National Policy Manager at Sunlight, wrote in an e-mail, "In spirit, many of the mechanisms the UK government describes for publishing data and collecting public input on data release and priorities seem solid and in line with what other governments do or are exploring doing for their similar initiative, though it's hard to say for certain what the quality of the UK's data catalog is without deeper review."

At a non-national level, the U.K. has already produced such catalogs. For example, McCann notes in a blog post she co-wrote with fellow staff member Júlia Keseru that the Department for Communities and Local Government in the U.K. publishes a list of the data it holds and explains why unpublished information is not made public. McCann and Keseru noted in the post that it is “a best practice that we are eager to see the U.S. emulate.”

The U.S. announced in May that it would also release a catalog of its unpublished data but we might not see any results until November.

Canada also has a catalog, McCann noted, but it may be incomplete. “Canada is one of a few national-level examples we could find, but it’s hard to tell the scope and completeness of its indexing policy and related activity,” she and Keseru wrote. The Canadian Data Inventory Project began around April of 2012 and involves 18 different governmental departments and agencies.

What makes the U.K. cataloguing process unique, however, is its cataloguing of unpublished data. “As far as I know, this is the first time ‘unpublished’ datasets are being cataloged,” wrote civic hacker Joshua Tauberer in an e-mail. “[In the U.S.], the new open data memorandum this summer directed US federal agencies to do better inventorying, so we may see something like this here eventually.” Tauberer created the government transparency site GovTrack and is the president of Civic Impulse.

However, users of the U.K. data site have noted in the comment section that the cataloging of “unpublished” data has not been properly defined by the government.

One user wrote, "Seems this is a new definition of 'unpublished' or at least an extension/change to the acknowledged definition." The user also argued that just because the data is "published" on the U.K. data website does not mean it has actually been released by the data holder, giving rise to issues over licenses -- what data can be reused and what cannot?

McCann explained that this was an important discussion. She told TechPresident, “‘Published’ data should mean more than whether or not data is published on [the U.K. data site], so that when the government refers to increasing the pool of published data it also means increasing the greater pool of public data.”

An example of a disclaimer on some data sets, indicating that while the data is available, it has not yet been published by the government.

The same commenter on the U.K. data site also argued that the comment function was not consistently available on all data sets, which would lead to "skewed" input.

An example of the short survey available on some data sets.

Tauberer said that he was doubtful about the effectiveness of the commenting function on the U.K. data site because it appeared to be "a generic, empty form." Further, he explained that the people interested in open data have already formed their own communities and networks.

"If government agencies want to know what data to release next, just go to those communities and ask," he wrote. "Or just ask the agency's own staff, because chances are some people in those communities are already pestering the agencies for more or better data. In fact, there are folks in the UK who have been asking for data for years who are probably insulted by the title of that blog post: 'Get involved!' It's not the people who need to be more involved in this, it's government staff."

Disclosure: TechPresident's Micah Sifry and Andrew Rasiej are senior advisers to the Sunlight Foundation.

Personal Democracy Media is grateful to the Omidyar Network and the UN Foundation for their generous support of techPresident's WeGov section.