Every Bill Coming Before the House Should Soon Be Available Online in Machine-Readable Format
BY Miranda Neubauer | Tuesday, January 17 2012
As of late last week, the House of Representatives began publishing some key legislative documents in machine-readable format at http://docs.house.gov, fulfilling a promise that had been announced last year. Going forward, the site will host a machine-readable version of every bill coming before the House, and currently hosts another structured set of data on all the bills coming before the House in a given week.
The availability of upcoming bills directly from the House as a structured data feed means that a developer could point a web application to that feed and grab official information about the goings-on of the House of Representatives that week, making it easier to understand the House and to follow along. It also means that going forward, every bill coming before the House will be made available in machine-readable XML format. Programmers and people who know from such things may now discuss the use of XML versus other formats for packaging structured data. It's unclear what interface the House will offer for developers to reach backwards through time to pluck individual bills from this repository as it grows. But the House leadership is beginning to deliver on its pledge to make it ever easier for developers and technologists to build tools that explain the workings of Congress, and to do so relying on the House as a primary source.
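For readers wondering what "pointing a web application to that feed" might look like in practice, here is a minimal sketch in Python. It is illustrative only: the element and attribute names (`week`, `bill`, `number`, `url`, `title`) and the example.org addresses are assumptions made up for this example, not the actual structure published at docs.house.gov. The sample is parsed from an inline string rather than fetched over the network.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed. The element and attribute names are invented
# for illustration and do not reflect the real docs.house.gov schema.
SAMPLE_FEED = """\
<week start="2012-01-16">
  <bill number="H.R. 0001" url="https://example.org/bills/hr-0001.xml">
    <title>Example Act of 2012</title>
  </bill>
  <bill number="H.R. 0002" url="https://example.org/bills/hr-0002.xml">
    <title>Another Example Act</title>
  </bill>
</week>
"""


def list_bills(xml_text):
    """Return one dict per <bill> element found in the feed text."""
    root = ET.fromstring(xml_text)
    bills = []
    for bill in root.iter("bill"):
        bills.append({
            "number": bill.get("number"),
            # findtext returns the default when no <title> child exists
            "title": bill.findtext("title", default="").strip(),
            "url": bill.get("url"),
        })
    return bills


if __name__ == "__main__":
    for bill in list_bills(SAMPLE_FEED):
        print(f'{bill["number"]}: {bill["title"]} ({bill["url"]})')
```

A real application would download the weekly feed first and would need to follow whatever schema the House actually publishes. The point is only that structured data lets a program pull out bill numbers, titles and links without scraping a PDF.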
The Committee on House Administration unanimously adopted the "Standards for the Electronic Posting of House and Committee Documents & Data" in December, which among other guidelines notes that "committees are encouraged to post documents in XML when possible and should expect XML formats to become mandatory in the future."
The site is hosted by the House Clerk, with data coming from the House Majority Leader and the House Committee on Rules in addition to the House of Representatives.
Matt Lira, digital director in the office of Majority Leader Eric Cantor (R-Va.), called the availability of machine-readable legislation a step as significant as allowing cameras on the House floor. He called it a structural change to the functioning of the House.
"It's not just opening a door, it's installing a doorstop so that the door can never be closed in that way again," he said. With the role of the House Clerk, he said, it's the House as an institution making its information more available, independent of the personalities involved.
One key goal of the site is to make it easier to see what's in legislation before it is voted on, he said. As an example, he noted that in 2009, stimulus legislation was posted in a non-machine-readable format two hours before a vote. Now, he said, legislative texts would be searchable by keyword and available for developers. "Government is pretty lousy at interfaces, but pretty good at data," he said, and added that the site could become a useful source for media outlets or activist groups.
While the site currently posts mainly bills to be considered by the House, the new standards also direct the inclusion of committee documents in the future.
"Committee video of hearings and markups will be stored by the House to meet requirements for archiving, access, searchability, and authenticity," the standards also note.
Daniel Schuman from the Sunlight Foundation wrote that "the ongoing process of releasing documents online, in real-time, and in machine-readable manner is a tremendous sea change from the slow and ponderous paper publications that are often late, fairly difficult to use, and unfriendly to computers."
But J.H. Snider, president of iSolon.org and a network fellow at Harvard University’s Edmond J. Safra Center for Ethics, wrote in an e-mail that he still saw room for improvement.
"My basic complaint is that the data is machine-readable only internal to the U.S. House of Representatives, not across the federal government," he wrote. "Until the data can be automatically linked to related databases, the value of machine-readable information is significantly diminished. For example, I’d like the bill text to be automatically linked to the statutes they modify."
That might require some forward movement from places like the Government Printing Office, which maintains a current digital system for accessing federal regulations that relies heavily on PDFs.
Snider called the new site at least a small step in the right direction. He suggested that, in addition to sharing and RSS tools, an e-mail subscription option should be available.
Lira said that the offices involved in the site would look at analytics, and at feedback from the transparency community, to shape the site going forward.
In a statement, Rules Committee Chairman David Dreier (R-Calif.) praised the launch of the new website.
"When Republicans took the majority last year, we promised to change the way the House of Representatives conducts the people’s business," he said. "That’s why we adopted rules that for the first time promote the use of electronic files rather than paper printed at taxpayer expense, cutting costs and increasing public access."
This post has been updated.