
House Publishes U.S. Code in XML

BY Miranda Neubauer | Tuesday, July 30 2013

The House of Representatives is now making the United States Code available for download in XML format, Speaker John Boehner's office announced today.

Transparency advocates like Joshua Tauberer, creator of Govtrack, welcomed the move, but are still waiting on the publication of legislative data in bulk format.

The Speaker's press release notes that the data is compiled, updated and published by the Office of Law Revision Counsel and is available for download as individual titles or in bulk.

The press release points out that the "House created the Legislative Branch Bulk Data Task Force in 2012 to expedite the process of providing bulk access to legislative information and to increase transparency for the American people."

"[The U.S. Code data] is [a] really good example of this kind of project done right," Tauberer said. "The documentation is very comprehensive and detailed and really one of the best examples of documentation for a government XML standard that I've ever seen. The data is structured in a coherent, natural way."

He said that the new format would make it easier as a developer to process the hierarchy of the Code and access specific sections or elements of it in context, compared with what is currently possible through an HTML format.
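To illustrate the kind of processing Tauberer describes, here is a minimal Python sketch that walks a nested XML document and reports each section together with the title and chapter that contain it. The sample document, its element names (`title`, `chapter`, `section`, `num`, `heading`) and the `identifier` attribute are simplified assumptions made for illustration; the schema actually published by the Office of Law Revision Counsel may differ, and `section_paths` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample: element names and identifiers are assumptions,
# not a copy of the published U.S. Code XML.
SAMPLE = """
<title identifier="/us/usc/t1">
  <num>1</num>
  <heading>Example Title</heading>
  <chapter identifier="/us/usc/t1/ch1">
    <num>1</num>
    <heading>Example Chapter</heading>
    <section identifier="/us/usc/t1/s1">
      <num>1</num>
      <heading>First example section</heading>
    </section>
    <section identifier="/us/usc/t1/s2">
      <num>2</num>
      <heading>Second example section</heading>
    </section>
  </chapter>
</title>
"""


def section_paths(xml_text):
    """Return (identifier, trail) pairs, one per <section> element.

    The trail lists the enclosing levels from the top of the hierarchy
    down to the section itself, so each section is seen in context.
    """
    root = ET.fromstring(xml_text.strip())
    results = []

    def walk(node, trail):
        num = node.findtext("num")
        heading = node.findtext("heading")
        if heading is not None:
            trail = trail + [f"{node.tag} {num}: {heading}"]
        if node.tag == "section":
            results.append((node.get("identifier"), trail))
            return
        for child in node:
            # <num> and <heading> describe the current level; skip them.
            if child.tag not in ("num", "heading"):
                walk(child, trail)

    walk(root, [])
    return results


if __name__ == "__main__":
    for identifier, trail in section_paths(SAMPLE):
        print(identifier, " > ".join(trail))
```

With HTML, the same context would typically have to be inferred from presentation markup rather than read directly from the nesting of the elements.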

The new tool will make it possible for Govtrack to offer a service allowing users to track elements of the Code and receive an alert any time a bill mentions a specific section, he said.
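As a rough sketch of how such an alert could work, the Python below scans bill text for citations of the form "26 U.S.C. 501" and reports which bills mention a tracked section. This is not Govtrack's implementation: the citation pattern is deliberately simplified (real bills also cite the Code in forms such as "section 501 of title 26"), and the function names and bill labels are hypothetical.

```python
import re

# Simplified pattern for citations such as "26 U.S.C. 501" or "42 U.S.C. § 1395x".
# Assumption: only this one citation style is handled.
CITATION = re.compile(r"\b(\d+)\s+U\.S\.C\.\s+(?:§+\s*)?(\d+[a-z]?)")


def find_citations(text):
    """Return the set of (title, section) pairs cited in the text."""
    return {(int(title), section) for title, section in CITATION.findall(text)}


def bills_mentioning(bills, title, section):
    """Return the labels of bills whose text cites the given title and section."""
    return sorted(
        label for label, text in bills.items()
        if (title, section) in find_citations(text)
    )


if __name__ == "__main__":
    # Hypothetical bill labels and text, invented for illustration.
    bills = {
        "Bill A": "This Act amends 26 U.S.C. 501(c)(3) to clarify eligibility.",
        "Bill B": "Payments under 42 U.S.C. § 1395x are adjusted annually.",
        "Bill C": "This Act designates a post office.",
    }
    print(bills_mentioning(bills, 26, "501"))
```

A structured copy of the Code would let a service like this confirm that each matched title and section actually exists before sending an alert.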

He noted that the Sunlight Foundation's Scout tool functions in a similar way, but that the new data will make such a tool easier to maintain and allow for comprehensive results. Govtrack had a similar function until 2011, he said, when he discontinued it because it was too hard to keep up to date.

But for Tauberer, "the big elephant in the room" is the unavailability of legislative data. Currently, Govtrack and other transparency groups scrape such data from the Library of Congress' Thomas platform, a process that leads to inaccuracies and is hard to maintain. What is currently available in bulk format is the text of bills, he emphasized, but not their legislative status, meaning it isn't easy to create a spreadsheet of all bills passed or find out how many bills were passed by one chamber and not the other.

Tauberer suggested that the new XML release was the result of a larger internal House modernization project, even though it is being billed as a transparency initiative.

Citing previous advocacy efforts and discussions about what the Bulk Data Task Force was focused on, he said that while the work on XML was positive, "that's not the reason the task force was created," and warned against "losing sight" of the legislative data priority.

In September, on the occasion of the launch of Congress.gov, a Library of Congress spokesperson told techPresident that Congress had "not requested that data be provided in that manner."

In August, a report co-authored by Tauberer, the Sunlight Foundation and others welcomed the House Leadership's commitment to bulk data and outlined a path towards implementation.

FierceGovernmentIT reported on July 22 that a December 31 report by the task force was recently made available as part of the Legislative Branch appropriations bill.

"Consistent with the pledge by House Leaders, the Task Force recommends that it be a priority for Legislative Branch agencies to publish legislative information in XML and provide bulk access to that data; that the XML Working Group develop and maintain standards to ensure compatibility and interoperability of all machine-readable data published by the Legislative Branch, and that the Task Force be extended to the 113th Congress to continue to coordinate, initiate and track transparency-related projects," the report's executive summary reads.
