[BackChannel] The Promise of "Small Data"

BY Jeffrey Warren | Wednesday, July 17 2013

techPresident's Backchannel series is an ongoing conversation between practitioners and close observers at the intersection of technology and politics. Jeffrey Warren is co-founder and research director of the Public Laboratory for Open Technology and Science and a fellow in MIT's Center for Civic Media.

Big Data -- the idea that the ability to aggregate and sift through vast amounts of data can yield key insights about our society and provide the basis for better decision-making -- has a fatal flaw. The power and opportunity for abuse that Big Data brings are more evident than ever in the wake of the recent revelations about the NSA's data collection programs. But the fallibility of such vacuum-cleaner data collection systems goes beyond their lack of transparency. They raise serious questions about how our idea of a modern democracy can be adapted to an increasingly data-centric and technocentric society.

In this Wikipedia world of participatory creation, why do we assume that it is only the NSAs, Googles, and Facebooks of this world who can leverage such data-derived power? To me, Big Data is far less compelling than "Small Data" -- the idea of a bottom-up, voluntary, shared model of data aggregation whose participants are not mere data points. Small Data ecosystems -- in contrast to Big Data silos -- will be built on the open exchange of data, by and for the public, towards civic ends.

Small Data may not be as applicable to national security as it is to many other problems our society faces, from environmental threats to challenges to civil liberties, but it holds the promise of an open approach which will support, rather than disrupt, our democracy. Small Data goes beyond crowdsourcing; it is based not merely on public contribution to a common goal, but on the public having a say in how that pooled data is used, and what questions it answers.

The need for such a participatory approach is clear. Data, along with its interpretation and application, increasingly drives decision making in our society. This sometimes results in better decisions, but often results in a system which gives "data shepherds" -- scientists, technologists, or analysts -- exclusive say over reading these "data tea leaves." This is a problem if it displaces the discursive mode of debate which is the foundation of our democracy, because decisions may then be contingent on the biases of technologists, not only in employing data, but in the very framing of the questions that data seeks to answer.

An example: a series of publicly-viewable “crime maps” based on police reports have been promoted by technologists as a way to understand where crime happens -- and where you shouldn’t go walking at night. But couldn’t the maps just as easily have been used to better understand whether there is systematic bias in arrests on the basis of race? Do the maps show white-collar crimes, or convey the magnitude of their effect on the public? The maps may not tell a clear story either way, but many are quick to suggest that the data “speaks for itself” -- which is clearly a matter of perspective. Such systems overlook the opportunity for a democratized Small Data version of data-driven information systems in which the public may critique data and even participate in its collection or production.

Fundamentally, the Big Data movement is problematic because it is premised on an asymmetric -- purely upward -- flow of data towards a central authority whom we must trust to make decisions on our behalf. In this age of participation, of Do-It-Yourself, of civic empowerment, shouldn't the little guy have a seat at the table? We have technologies which enable affordable, accessible data production, collation, management, visualization and analysis, so let's leverage them in the name of Small Data.

Nowhere is this opportunity more relevant than in the environmental space, where the data reflect the threats in our immediate surroundings -- threats that we and our families face every day. At Public Lab, a non-profit I co-founded, we develop just the kind of affordable, accessible, open tools for data collection -- environmental tools, in our case -- which can enable greater meaningful input by everyday people in big data-driven decisions that affect us all. We seek to give those disproportionally affected by the health and safety hazards of pollution a voice in debates about the remediation of urban brownfields, the extent of environmental damage, and the placement and safe operation of industrial facilities. Our tools, from homebrew infrared cameras to smartphone-mounted chemical sensors, may seem obscure, but so did the legendary Apple I at a time when computers were the sole province of government and industry, and the idea of a "personal computer" giving anyone the power of computation seemed a strange and esoteric fantasy.

Imagine a future where data plays an ever larger role in governance and civic life, but where the public's literacy in, and ability to contribute to and leverage, that data gives it a meaningful part in fair and informed decision making. Such a “Small Data” future needs not only the kind of transparency championed by groups like the Sunlight Foundation, but also the participatory approaches advanced by citizen science projects. Projects such as the Air Quality Egg, the Smart Citizen air quality sensor, or Public Lab’s Infragram camera challenge the public’s attitude of "leaving science and technology to the experts." They go beyond simply promoting science and technology education to advocating for the everyday application of such themes by regular people, whose ability to converse in the language of Big Data, and to participate at a high level in data-driven public discourse, will be the foundation of stronger democracy.

TechPresident Editorial Director Micah Sifry serves on the board of Public Lab.