
The Library of Congress is Archiving 170 Billion Tweets — on Tape

BY Julia Wetherell | Monday, January 7 2013

The Library of Congress announces an update on the Twitter archive (on Twitter).

When the Library of Congress teamed up with Twitter in 2010 to archive four years’ worth of activity on the microblogging platform, the aim was to preserve a slice of early-millennial life for future researchers. Now the two-hundred-year-old institution is grappling with the resulting 133 terabytes of data, a bundle that includes every 140-character message sent in Twitter’s six-year history, from its inception in spring 2006 to December 2012.

While the initial agreement called for archiving tweets up to 2010, the LOC subsequently decided to extend the project indefinitely, keeping up with the nearly half-billion tweets dispatched every day. There will be a six-month delay before new tweets enter the archive, an interesting statement on our contemporary definition of “history.” The struggle now is for the LOC to create a keyword-searchable catalog for the vast amount of metadata associated with the archive, including the time and location data that tie tweets to certain events, such as election night in 2008. However, as the LOC’s recent report confirms, its agreement with Twitter states that the “Library cannot provide a substantial portion of the collection on its web site in a form that can be easily downloaded.” Therefore, it goes on to say, the archive will exist primarily in the physical realm – on tape.

“The technical infrastructure for the Library’s Twitter archive follows the same general practices for monitoring and managing other digital collection data at the Library. Tape archives are the Library’s standard for preservation and long-term storage. Files are copied to two tape archives in geographically different locations as a preservation and security measure.”

While this storage method seems oddly similar to building nuclear-winter-proof seed vaults, it does acknowledge that, for all of our cloud-tending tech, we don’t exactly have a permanent record of our online life. For that, future historians may have the Library of Congress to thank.

The headline of this post has been corrected. An earlier version implied the Library of Congress report said Twitter posts were being stored on an analog medium, which it does not say.
