Clash of the Robots (Files): Conflating Code and Closed Government
BY Nancy Scola | Thursday, February 19 2009
You might remember that the short-and-sweet robots.txt file on the new Obama-era WhiteHouse.gov was heralded as the opening bell of a new era of open government. So it's perhaps not surprising that CNET security watchdog Christopher Soghoian is asking whether a restrictive robots.txt file on the new Recovery.gov site is "evidence that the administration's much-publicized commitment to transparency is simply hype?" Those are mighty strong words. Are they fair ones?
To step into the weeds, the robots.txt file on the site, which launched Tuesday, was set to block Google and other search engines from spidering Recovery.gov:
# Deny all search bots, web spiders
User-agent: *
Disallow: /
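For readers who want to see what those three lines mean to a well-behaved crawler, here is a minimal sketch using Python's standard-library robots.txt parser. The helper name `is_allowed` and the bot name "ExampleBot" are made up for illustration, and the rules are fed in as a string rather than fetched from the live site.

```python
from urllib.robotparser import RobotFileParser

# The three lines quoted above, as they appeared on Recovery.gov at launch.
RULES = """\
# Deny all search bots, web spiders
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler honoring `rules` may fetch `url`.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of downloading robots.txt from a server.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is an invented crawler name; the wildcard rule covers it too.
    for agent in ("Googlebot", "ExampleBot"):
        verdict = is_allowed(RULES, agent, "http://www.recovery.gov/")
        print(f"{agent}: allowed={verdict}")
```

The wildcard `User-agent: *` paired with `Disallow: /` asks every compliant crawler to stay away from every path. It is a request, not an access control: ordinary visitors are unaffected, and a crawler that ignores the convention is not stopped by it.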
Wrote Soghoian: "Although the site is advertised as proof of the president's commitment to transparency, its technical design seems to betray that spirit." He has since updated his post with a victorious note, reporting that the robots.txt file was dropped from the site earlier today. Did Soghoian's post light a fire under the responsible parties -- in this case, likely those inside OMB, who are taking the lead on the site -- and send them scrambling to change the robots.txt file? Perhaps.
But a source with inside knowledge of the situation says that the robots.txt file was set to block Google et al. from crawling the site while it got its sea legs. With huge interest expected at launch, the thinking was perhaps that reducing the server load was one way to help make sure the site stayed up and running. It was never meant to be a permanent way of doing business. Smarter technical minds will hopefully let me know if that's a reasonable way to approach a site launch, but it smells okay to me.
Getting Recovery.gov up and running, and weaving together the efforts of OMB, the White House, and other parties with a thumb in the pie, was something of a scramble. It's a bit much to call into question the commitment to transparency of an entire open-government endeavor -- nay, an entire presidential administration -- over three lines of temporary code on a two-day-old website.
("Cup of Robots" image courtesy of striatic.)