White House Highlights Big Data Partnerships

BY Miranda Neubauer | Wednesday, November 13 2013

Climate data for the Cloud (NASA)

The White House Tuesday highlighted several new and recent partnerships and collaborations focused on data in the areas of urban policy, development, science, health and research to further the goals set by the Big Data Research and Development Initiative in 2012.

At that time, six federal agencies announced over $200 million in commitments to support projects and tools dedicated to using, accessing and analyzing data to address national challenges. In the initiative's second year, the administration has challenged federal agencies, private industry, academia, state and local government, non-profits, and foundations to develop Big Data initiatives "to advance national goals such as economic growth, education, health, and clean energy; use competitions and challenges; and foster regional innovation," Tom Kalil, deputy director for Technology and Innovation in the Office of Science and Technology Policy, wrote in a blog post in April.

Among the announcements Tuesday is the news that Amazon Web Services is partnering with NASA to provide public access to data about Earth through the NASA Earth eXChange. As part of a new NASA Space Act Agreement, AWS will be hosting a significant amount of NASA's earth-observing data as an AWS Public Data set, according to a White House fact sheet. NASA will be offering machine images, workshops, tutorials and other resources to help earth sciences researchers draw on data in the cloud in their daily work, and the development will mean that the calculation of the next National Climate Assessment can take place in the cloud.

In addition, the United States Geological Survey will be providing $9 million for a Big Earth Data initiative in collaboration with the National Oceanic and Atmospheric Administration (NOAA), NASA, and the United States Department of Agriculture.

The White House also highlighted the recent partnership between DataKind, which connects non-profits facing data analysis challenges with pro bono data scientists, and Pivotal for Good, which will be contributing to those "data philanthropy efforts"; Google's and USAID's support of the World Resources Institute's Global Forest Watch 2.0 monitoring tool; and the Kamusi project, which, with support from the National Endowment for the Humanities, is working on establishing a dictionary of every word of every world language.

Another announcement Tuesday was that Splunk4Good, the social responsibility initiative of Splunk, the machine data analysis company, will be analyzing large and complex data to help create a new, public interface for Regulations.gov that will let users search the federal regulatory data through real-time graphs, dashboards and visualizations. The announcement notes that the Sunlight Foundation draws on Regulations.gov for its Docket Wrench platform, which allows users to analyze over million documents and visualize how outside groups influence rulemaking, and that the foundation is working on expanding the platform with natural-language and machine-learning functionalities. The White House also points out that NoticeandComment.com, which publishes public notices, is developing a new publishing platform based on its framework for Regulations.gov as a model for a system aimed at local governments to help with the publication and management of the $17 million in public notices generated by the over 89,000 municipal and special district governments in the U.S. each year.

In the area of science, the White House highlighted a collaboration between the technology companies SGI and Fedcentric to create a system for the U.S. Postal Service that can manage the over 8 billion transactions and 275 terabytes handled every day, with the goal of creating new capabilities to route mail and expedite shipping, delivery and service.

Additionally, the White House pointed out the National Science Foundation's support for data analytics research; the release by UC Berkeley's Algorithms, Machines, and People Laboratory of its Spark big data analytics system under an Apache open source license, with support from the NSF, DARPA and industrial partners; the Big Data Top 100 list, a community-based big data benchmarking initiative; and a $750,000 community-based big data effort by the National Institute of Standards and Technology.

Tuesday's announcements also noted several projects focused on the economy, sustainability and cities. Clean Tech San Diego, a non-profit membership organization, is working with the Predictive Analytics Center of Excellence (PACE), the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD) and the OSIsoft company to establish a data infrastructure that can connect the systems managing waste, buildings, transportation and traffic, allowing the City of San Diego to develop city-scale applications to lower electricity consumption and cost.

In New York City, the White House pointed out how the Mayor's Office of Data Analytics worked with the city's New Business Acceleration Team (NBAT) to analyze the city's performance in helping new restaurants navigate city bureaucracy. Drawing on data from construction permits, restaurant inspections and NBAT counseling notes, the analysis concluded that the team's free services helped restaurants open 45 days earlier.

In Boston, the White House highlighted a new series of competitions from the MIT Computer Science and Artificial Intelligence Laboratory to explore how scientists can use big data to address societal issues, beginning with a data challenge focused on transportation in downtown Boston in partnership with the City of Boston.

Other initiatives include a new American Energy Data Challenge from the Department of Energy and a research project examining the role of data complexity and volume for the stability, openness and fairness of financial markets with a grant from the NSF.

Several of the additional projects announced Tuesday also build on university research and education initiatives. The MIT Big Data Initiative will launch the Big Data Privacy Working Group, inviting stakeholders from academia, industry, governments and nonprofits to examine the implications of big data for privacy.

In early 2014, a new Council for Big Data, Ethics, and Society will launch, in collaboration with the NSF, to provide "critical social and cultural perspectives on big data initiatives" by bringing together researchers from disciplines ranging from anthropology and philosophy to economics and law. The co-directors of the council will be Danah Boyd from Microsoft, NYU and the Berkman Center; Geoffrey Bowker, professor of Informatics at the University of California, Irvine; Kate Crawford from Microsoft Research and the MIT Center for Civic Media; and Helen Nissenbaum, professor of Media, Culture and Communication and Computer Science at NYU.

The White House also highlighted a big data and bioengineering program at the University of Illinois; a five-year research collaboration by NYU, the University of California, Berkeley, and the University of Washington focusing on how to maximize the impact of data science on academic research; the University of Chicago's Eric & Wendy Schmidt Data Science for Social Good Summer Fellowship; a new IBM tool to help university students assess their readiness for public and private sector data and analytics jobs; and recent TechAmerica Foundation roadshows focused on the impact of big data on the health and energy sectors.

Health initiatives were another focus of Tuesday's announcements.

The Office of Science and Technology Policy is launching a Predicting the Next Pandemic initiative that will establish a consortium consisting of government agencies, non-governmental organizations, academic institutions, industry partners and others to explore how big data could help predict pandemics before they occur, according to the White House fact sheet. "Participants will begin with a pilot project that will explore the drivers of biological events and determine how big data can be used to predict pandemic potential of novel infectious agents," the fact sheet notes. "Existing data collection capabilities and requirements will be examined based on historical and recent disease emergence, data fusion, barriers to collaboration, and big data analytics to allow the future development of a predictive capability."

In an additional announcement, Novartis, Eli Lilly and Pfizer will be working together to create a new platform with the aim of improving access to information about clinical trials, complementing the existing clinicaltrials.gov. A machine-readable "target health profile" will make it easier for healthcare software to match individual health profiles to appropriate clinical trials, according to the White House fact sheet. Patients will be able to search for trials using their own personal Blue Button data, the fact sheet further notes, referring to the personal health records system adopted by many federal agencies.

The White House also highlighted a new Virtual Research Data Center for the Centers for Medicare and Medicaid Services; further NIH funding to expand access to biomedical data; an international project from MedRed and BT aimed at improving the dissemination of open health data; CancerLinQ, a new system from the American Society of Clinical Oncology analyzing clinical trial data; an NIH-sponsored research initiative looking at how data analytics could help the early detection of heart disease; an SAP-funded examination of how real-time analytics could help uncover genetic variants contributing to population health and disease; an NSF-funded project resulting in new algorithms to help with the assessment of therapies for the aging population; a new patient-owned co-operative helping patients manage their health data and provide it anonymously for clinical research uses; and a genomic sequencing research initiative based at George Washington University Medical Center.