In China, An Open Data Movement is Starting to Take Off
BY Rebecca Chao | Thursday, April 24 2014
About eight months ago, when techPresident first wrote about the state of open data in China, there were only three government open data sites, none of them user-friendly, and a smattering of open data enthusiasts who often had to find their own data sources and even create hardware to generate their own data. They were not a formally connected group but rather individuals who created open data apps out of personal interest. Now, the recently launched Open Data Community is trying to create a multi-disciplinary network of businesses, research institutes, and NGOs interested in open data.
The network announced its launch, and the accompanying website, Open Data China, on Open Data Day, which fell on February 22 this year. Its goal is to foster an interest in open data in China through a number of initiatives: offering open data literacy trainings, encouraging NGOs and research institutes to release their data, and teaching the general public, journalists and NGOs how to analyze data once it is available. "We also want to become a platform to support data innovation to accelerate projects that are not well supported," Feng Gao tells techPresident. Gao is the Shanghai-based open data ambassador for the Open Knowledge (formerly the Open Knowledge Foundation) China branch, one of the community's founding groups. "It will provide a great space for researchers and NGOs to create [projects] around free information," he says.
Another founding group of the Open Data Community is the Urban Data Party, a Chinese organization that encourages city planners to use open data to improve urban planning. Urban Data Party is planning an event in early May that will include debates about open data and provide a look at what open data looks like at the city level. Other founding groups include the Data Scientist Community, which focuses on big data, statistics and data science; and QingYue IT Engineers for Environment, an NGO that focuses on using data to protect the environment.
How Open Data Got Started in China
The Open Data Community is currently working on three projects, one of which is a comprehensive timeline of open data in China where OKFN China has potentially traced the open data "movement" (if you can call it that) to its beginnings. (Note: some of the information from the timeline is sourced from techPresident's "Hunt for Open Data in China.")
According to the timeline, the Chinese government's first open data website was Shanghai's Internal Data Directory, launched sometime around September 2011, though the date is not clear. The government does little to publicize the launch of these sites, says Gao. The current data list includes 425 data sets. The Shanghai government later released some data on the Shanghai data portal, launched in December 2012. Beijing's open data site went online in October 2012 with 4,000 datasets to date, followed by the National Bureau of Statistics site in September 2013.
To get a sense of how fledgling the idea of open data is in China, the first book on open data in the Chinese language, The Big Data Revolution, was published in July 2012 and written by Tu ZiPei, but it is not even about open data in China; it discusses open data in the U.S. to show how much China also needs an open data movement.
Apparently, the book was well received by some government officials like Wang Yang, currently the third-ranked vice premier (there are currently four ranks). Inspired by the book, Wang gave a speech to the Guangdong Finance Department on October 8, 2012, when he was the Guangdong Committee Secretary, mentioning China's own need for open data -- though Wang used the term "public data," writes Gao in his blog post covering the event. In the speech, available via Forbes in Chinese, Wang says that the U.S. has responded to the development of big data while China has lagged and must catch up since big data is a source of innovation, competition and productivity. Notably absent is any mention of "transparency" as another important by-product of open data; Wang seems more focused on data collection, preservation, maintenance, and analysis than on cultivating open data.
Several months later, China held its first-ever hackathon using public data, called Code for Climate Change, which techPresident previously covered. Liu Yan, the creator of a hackerspace called Xindanwei, meaning 'new work unit,' which is a play on the government work units, had told techPresident at the time, "When I heard about this project I was very excited. This is the first time that the government is providing all this data to the start-up and creative community and is working together with them by providing data sets. Also, top researchers from all over China are providing insights and knowledge. I was super excited because for start-ups, this is really a very important and unique opportunity to be connected to these data sets."
In the fall of 2013, Fudan University in Shanghai held an international meeting on e-governance and hosted a few sessions on open data and open government led by a number of government officials and academics. There, a government official, Tang DingChun, the deputy head of Shanghai's Freedom of Information Department, explained the process of creating the Shanghai open data portal. The hurdles Tang pointed out were very much the same bureaucratic hurdles common to most governments trying to release open data: departments that do not want to share data, redundancies, the work of digitizing data and the overall lack of a strong structure for managing and collecting data.
Surprisingly, when the topic of data fees came up, Tang said that, in his personal opinion, the data should be kept free of charge, making it truly open. But his personal views may not be in line with the goals of the Shanghai government, which is debating whether to fund the data collection and release through fees, though the Shanghai data portal currently remains free of charge.
Since then, there have been a number of conferences with discussions of open data, though usually as a sidebar to discussions of big data.
A Strategy for Open Data
Since its launch two months ago, the Open Data Community has only just begun the arduous task of creating an open data culture in China from scratch, but its strategy appears comprehensive: breaking down barriers to data access, generating resources on open data and encouraging data use.
Before you can begin to use data, you have to know what's available. So another ongoing project of the Open Data Community is a graded survey of all available data in major Chinese cities (at this point only Shanghai and Beijing) based on a number of categories such as urban public transportation, air quality, census, health, and economic indicators, among others. As of now, about 50 percent of the datasets in the two cities that have been graded are open, meaning they are machine-readable.
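To make the "machine-readable" measure concrete, here is a minimal sketch of how such a share could be computed from a catalog listing. It is not the Open Data Community's actual grading method: the catalog entries, the field names and the set of formats counted as machine-readable are all invented for illustration.

```python
# Assumption: these formats count as machine-readable. The survey's real
# criteria are not described in the article.
MACHINE_READABLE_FORMATS = {"csv", "json", "xml"}


def share_machine_readable(datasets):
    """Return the fraction of datasets published in a machine-readable format."""
    if not datasets:
        return 0.0
    readable = sum(
        1 for d in datasets if d["format"].lower() in MACHINE_READABLE_FORMATS
    )
    return readable / len(datasets)


# Hypothetical catalog entries, invented for this example.
SAMPLE_CATALOG = [
    {"category": "air quality", "format": "CSV"},
    {"category": "public transportation", "format": "PDF"},
    {"category": "census", "format": "json"},
    {"category": "health", "format": "html"},
]

if __name__ == "__main__":
    share = share_machine_readable(SAMPLE_CATALOG)
    print(f"{share:.0%} of sample datasets are machine-readable")
```

A real survey would also have to decide how to treat formats such as spreadsheets or scanned tables, which this sketch simply counts as not machine-readable.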
Since most of the educational materials on open data are not written in Chinese, the Open Data Community is also working on translating English-language articles (which include some of techPresident's content), books and other resources. One of its long term projects is a Chinese translation of Code for America's book on open data, Beyond Transparency.
Much of the data that is available in China has to be "opened," or turned into a machine-readable format, in order to be usable. Gao explains that he and his team are beginning with opening water data.
"We feel for instance air quality data is accessible but water is not even though it is another big problem in China," says Gao. While the data comes from public sources like the government, which updates each river area in the country every few hours, the data needs to be captured through a script and then manually edited or cleaned up. "Once we open such data, we can map a water source nearby and indicate how good the water source is and whether it is safe [to drink] or not, for example," says Gao.
Gao makes it clear, however, that what public data they currently have is still very meager. "It’s still very hard because of government concerns for national security so they don’t release all of the data. Some data is still classified -- mainly the coordinates of water plants or water sources -- even though water planning maps can be public and we use them to draw conclusions on Google maps and point out possible coordinates."
Still, the Open Data Community hopes that after making a coordinated effort to clean up the data, it will encourage civil society to use it and that the Open Data Community can serve as an incubator for data apps and projects.
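The capture-and-clean step Gao describes for water data might look roughly like the sketch below. It is an illustration only, not the team's actual script: the station names, field names and readings are invented, and the scraping step itself is left out, so the example starts from rows as a script might have captured them.

```python
import csv
import io

# Hypothetical raw rows, standing in for what a capture script might collect.
RAW_ROWS = [
    {"station": " Station A ", "ph": "7.2", "dissolved_oxygen_mg_l": "6.5"},
    {"station": "Station B", "ph": "--", "dissolved_oxygen_mg_l": "5.1"},
    {"station": "", "ph": "6.9", "dissolved_oxygen_mg_l": "7.0"},
]

FIELDS = ["station", "ph", "dissolved_oxygen_mg_l"]


def parse_number(text):
    """Convert a reading to float, or None when the value is not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def clean_rows(rows):
    """Trim station names, drop rows with no station, and parse the readings."""
    cleaned = []
    for row in rows:
        station = row["station"].strip()
        if not station:
            continue  # a reading without a station cannot be mapped
        cleaned.append({
            "station": station,
            "ph": parse_number(row["ph"]),
            "dissolved_oxygen_mg_l": parse_number(row["dissolved_oxygen_mg_l"]),
        })
    return cleaned


def to_csv(rows):
    """Serialize cleaned rows as CSV text; missing readings become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(clean_rows(RAW_ROWS)))
```

Publishing the result as CSV is one way to make the data machine-readable in the sense used above; the manual review Gao mentions would still be needed for errors that a script cannot detect.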
The Government's Role
Gao says there are signs that open data is increasingly on the radar not just of civil society but also of the government. He says, for example, that at the beginning of January, the city of Guangzhou announced that it plans to create a department devoted to big data and open data.
"Since last December we have seen many more people, some well known researchers, talk about open data especially in the context of smart cities. We've also seen the release of data on the government side," says Gao. He says part of the push is from entrepreneurs who want access to data and are also pushing the government for the digitization of services.
Gao says that the Open Data Community has received interest from government officials as well. They were approached by a few government staff working for the IT sectors of the government, such as the IT engineer who worked on the open data portal in Beijing. "He approached us saying he thinks open data is very interesting and wanted to discuss it with us further," says Gao. He also received interest from researchers working for government-funded institutions who want more advice and guidance on how to draft open data policy plans for the government.
Gao also says that the Open Data Community has discussed with Code for America whether their model could potentially be replicated in China at the city level. But for that to come to fruition, the Open Data Community needs more clarity from the government on its open data policies; they are currently waiting for the government to release its national plan on open data, which is slated to be released fairly soon, though there is no set date.
Personal Democracy Media is grateful to the Omidyar Network and the UN Foundation for their generous support of techPresident's WeGov section.