
How Governments Should Release Open Data

BY Jessica McKenzie | Tuesday, August 20 2013

When releasing data, governments should know that format matters almost as much as content. If the data is clean, well organized, complete and machine-readable, even a nonprogrammer can make good use of it. A recent post from Craig Thomler, who blogs about eGovernment and Gov 2.0 in Australia, illustrates this point.

Thomler describes taking a basic data set – the expected polling places for the federal election – and transferring the information onto a map, which is more visually appealing and informative than a list of names and locations. The process sounds relatively simple, something anyone could manage after a bit of Googling:

So I downloaded the CSV file from the AEC website and went to Google Drive, which supports a type of spreadsheet called Fusion Tables which can map geographic data.

Fortunately the AEC was smart enough to include latitude and longitude for each polling location. This can be easily mapped by Fusion Tables. The CSV also contained address, postcode and state information, which I could also have used, less accurately, to map the locations.

I uploaded the CSV into a newly created Fusion Table, which automatically organised [sic] the data into columns and used the Lat/Long coordinates to map the locations - job done!

This is where Thomler hits a snag that could throw someone with less experience: only a third of the polling locations appear on the map. He solves it by the simple expedient of deleting extraneous columns from the data set in Excel, and that works.
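For readers who would rather script that cleanup than repeat it by hand in Excel, here is a minimal sketch using only Python's standard library. The column names and the sample row are invented for illustration; the real AEC file's headers are not given in the post and will likely differ.

```python
import csv
import io

# Hypothetical column names: adjust these to match the real file's header row.
KEEP = ["PollingPlaceName", "Latitude", "Longitude"]


def trim_columns(csv_text, keep=KEEP):
    """Return CSV text containing only the columns listed in `keep`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # extrasaction="ignore" silently drops every column not named in `keep`.
    writer = csv.DictWriter(
        out, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample data, not taken from the AEC file.
sample = (
    "State,PollingPlaceName,Address,Postcode,Latitude,Longitude\n"
    "ACT,Example Hall,1 Example St,2600,-35.28,149.13\n"
)

print(trim_columns(sample))
```

The trimmed text can be saved to a new file and uploaded in place of the original. The full download stays untouched for anyone who needs the other columns.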

This is not something the government could have fixed without releasing less data, which nobody wants, because the columns Thomler deleted could be exactly the data someone else is looking for. The government could, however, have gone one step further. The form in which it released the data was convenient, easy and fast for the government, but it is less useful for the community than it could be.

Thomler observes that his map could be out of date within days. A programmer could write a script to check the website and update the map automatically with new information, but open data shouldn't be just for programmers:

To replicate what the programmer could do in a few lines, any non-programmer, such as me, would have to manually check the page, download the updated CSV (assuming the page provides a clue that it has changed), manually delete all unneeded columns (again) and upload the data into my Fusion Table, simply to keep my map current.

Of course, if the AEC had spent a little more time on their data - releasing it as a datafeed or an API (Application Programming Interface), it would be easy even for non-programmers to reuse the data in a tool like Google Maps for public visualisation - or the AEC could have taken the one additional step necessary to map the information themselves (still providing the raw data), providing a far more useful resource for the community.
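The "few lines" a programmer might write to spot a changed file could look roughly like the sketch below. It assumes the CSV sits at a stable address (`CSV_URL` is a placeholder, not the real AEC link) and compares a hash of each download with the previous one. Re-uploading to the map is left out, since the post does not say how that step would be automated.

```python
import hashlib
import urllib.request


def fetch(url):
    """Download the raw bytes at `url`."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def digest(data):
    """Return a SHA-256 fingerprint of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def check_for_update(data, last_digest):
    """Return (changed, new_digest) for freshly downloaded bytes."""
    new_digest = digest(data)
    return new_digest != last_digest, new_digest


# Usage sketch (CSV_URL is a placeholder supplied by the reader):
#   changed, seen = check_for_update(fetch(CSV_URL), seen)
#   if changed:
#       ...trim the columns and refresh the map...
```

Comparing hashes avoids relying on the page to give "a clue that it has changed", which is the part Thomler notes a non-programmer would have to check by eye.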

In the same post, Thomler describes the process of culling data from Twitter, which is public but not open. It is good advice for any non-programmer interested in doing the same.

Personal Democracy Media is grateful to the Omidyar Network and the UN Foundation for their generous support of techPresident's WeGov section.
