The Library of Congress is Archiving 170 Billion Tweets — on Tape

BY Julia Wetherell | Monday, January 7 2013

The Library of Congress announces an update on the Twitter archive (on Twitter).

When the Library of Congress teamed up with Twitter in 2010 to archive four years’ worth of activity on the microblogging platform, the aim was to preserve a slice of early-millennial life for future researchers. Now the two-hundred-year-old institution is grappling with the resulting 133 terabytes of data, a bundle that includes every 140-character message sent over Twitter’s six-year history, from its inception in spring 2006 to December 2012.

While the initial agreement called for archiving tweets through 2010, the LOC subsequently decided to extend the project indefinitely, keeping up with the nearly half-billion tweets dispatched every day. There will be a six-month delay before new tweets enter the archive, an interesting statement on our contemporary definition of “history.” The struggle now is for the LOC to create a keyword-searchable catalog for the vast amount of metadata associated with the archive, including the time and location data that tie tweets to certain events, such as election night in 2008. However, as the LOC’s recent report confirms, its agreement with Twitter states that the “Library cannot provide a substantial portion of the collection on its web site in a form that can be easily downloaded.” Therefore, it goes on to say, the archive will exist primarily in the physical realm: on tape.

The technical infrastructure for the Library’s Twitter archive follows the same general practices for monitoring and managing other digital collection data at the Library. Tape archives are the Library’s standard for preservation and long-term storage. Files are copied to two tape archives in geographically different locations as a preservation and security measure.

While this storage method seems oddly similar to creating nuclear-winter-proof seed vaults, it does anticipate the fact that, with all of our cloud-tending tech, we don’t exactly have a permanent record for our online life. For that, future historians may have the Library of Congress to thank.

The headline of this post has been corrected. An earlier version implied the Library of Congress report said Twitter posts were being stored on an analog medium, which it does not say.