
Google Calls on Government to Make Itself Available to Search

BY Nancy Scola | Wednesday, June 24 2009

One of the participants in the White House's ongoing Open Government Initiative process is a little company by the name of Google, and it has some ideas to share with the executive branch on how government information can be made more searchable and thus more accessible to the public. In comments submitted by Google Managing Policy Counsel Pablo Chavez (via the fusty, old-fashioned Federal Register channel, rather than the blog/wiki-enabled online OGI process), Google makes the case that consumers and citizens are very often going to use Google as an interface to government information, rather than any one .gov website or data tool. Government data that is hidden from Google and other search engines is effectively hidden from many of the people whom it might benefit and inform. From Google's submitted comments:

If a citizen is looking for specific information on the safety of organic food, it is more likely that a user will type in their query into a search box (e.g. 'organic food safety') than navigate directly to a relevant federal government website like that of the U.S. Department of Agriculture.

Citizens assume that search queries that are returned by search engines are complete. Therefore, it is critical that agency websites be easily indexed and crawlable by search engines. If government websites do not allow search engines to crawl, or certain documents on websites are hidden by robots.txt files or behind databases, the results available to citizens are incomplete and not as helpful as they otherwise could be.

Google has two recommendations in particular on how government can unlock data to search. The first is to adhere to the XML Sitemap protocol. Whereas HTML sitemaps help users navigate sites, XML sitemaps provide computers with metadata that helps them make sense of a site's content and structure. For example, an XML sitemap can help Google's spidering agents understand which documents on a website are worth dedicating more attention to, and when the various sections of a site were last updated. Google notes that several states, such as Virginia, California, Utah, Michigan, and Arizona, have adopted XML-based sitemaps to unlock the impact of their data and other resources. In Google's submitted comments, Chavez reports that the introduction of an XML sitemap to one resource section on the website of the Department of Energy's Office of Scientific and Technical Information increased downloads of full-text documents by some 400%. "We should go back to the basics," writes Chavez, "and make the content that already exists accessible to citizens online."
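To make the sitemap idea concrete, here is a minimal sketch of how an agency might generate one. It uses the element names defined by the public Sitemap protocol (urlset, url, loc, lastmod, priority); the example.gov addresses, dates, and priority values are invented for illustration and are not taken from Google's comments or from any real agency site.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return a sitemap XML string for an iterable of page dicts.

    Each dict needs a 'loc' key; 'lastmod' and 'priority' are optional.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        if "lastmod" in page:
            # When the page last changed, so crawlers know what to revisit.
            ET.SubElement(url, "lastmod").text = page["lastmod"]
        if "priority" in page:
            # Relative importance within this site, from 0.0 to 1.0.
            ET.SubElement(url, "priority").text = page["priority"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, assumed for this example only.
PAGES = [
    {
        "loc": "https://www.example.gov/reports/organic-food-safety.html",
        "lastmod": "2009-06-01",
        "priority": "0.9",
    },
    {"loc": "https://www.example.gov/about.html", "priority": "0.3"},
]

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml)
```

The lastmod and priority fields correspond to the two hints described above: when a section was last updated, and which documents deserve more of a crawler's attention.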

The second recommendation that Google has for government is that agencies re-evaluate the text files residing on their servers that tell Google and other search engines which information they'd like to have included in search results and which they'd rather have ignored. Many agencies, writes Chavez, write their robots.txt files to err on the side of excluding information, when the presumption for government data should favor inclusion and disclosure. The company reports that it has been working with state-level agencies, such as the Florida Department of Education, to evaluate when the "no follow" and "no index" directions in robots.txt files are unnecessary and should be discarded. "This has resulted in tens of thousands of new pages containing deep content that are now discoverable to citizens through search," writes Chavez.
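As a rough illustration of what such a re-evaluation could look like, the sketch below uses Python's standard urllib.robotparser module to list which pages a given robots.txt hides from crawlers. Both robots.txt files and all example.gov addresses are hypothetical, and the sketch covers only Disallow rules, not the "no follow" and "no index" directions mentioned above.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the subset of urls that robots_txt forbids user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(user_agent, u)]


# Hypothetical file that errs on the side of exclusion.
RESTRICTIVE = """\
User-agent: *
Disallow: /reports/
Disallow: /data/
"""

# Hypothetical revision that presumes disclosure and hides only the admin area.
PERMISSIVE = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical pages to check against each file.
URLS = [
    "https://www.example.gov/reports/2009/summary.html",
    "https://www.example.gov/data/enrollment.csv",
    "https://www.example.gov/admin/login",
]

blocked_before = blocked_urls(RESTRICTIVE, URLS)
blocked_after = blocked_urls(PERMISSIVE, URLS)

print("Hidden by the restrictive file:", blocked_before)
print("Hidden by the permissive file:", blocked_after)
```

Running a check like this against a site's real URL list is one way an agency could see which public documents its current rules keep out of search results.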

In addition to the utility and import of opening up a wider range of government data, having a deeper well of government data to draw from may be an advantage to Google as it seeks to stay one step ahead of search engines like Microsoft's Bing and Wolfram Alpha that take a less linear approach to retrieving and delivering information. Ola Rosling, the lead on the Google Public Data project we wrote about in this space on Monday, described in an email the company's interest in working with Data.gov and other first-party sources of government information: "We are open to collaborating with a broad range of public data providers, including the White House, to promote our broader goal to make public data accessible. We are also," he adds, "actively seeking contacts with new organizations that produce public data." To that latter end, Google has begun reaching out to public-data providers to ask them to fill out a detailed survey on what data they're currently publishing or are interested in publishing, on everything from trade figures to agricultural numbers to crime statistics.
