Library of Congress to Archive Twitter

All Tweets Submitted to the Web Site Will Become Searchable by Google


Published: April 22, 2010

Twitter is donating its entire archive of tweets to the United States Library of Congress, the social networking site announced April 14 via its blog. The contribution includes all tweets made public since the service’s March 2006 inception. Twitter, a microblog famous for restricting messages to 140 characters or fewer, receives more than 50 million posts a day on average, according to a press release from the Library of Congress.

The Web service has recently gained prominence as a practical tool for communication; it was used by dissidents protesting the 2009 Iranian elections and, more recently, as a forum to debate Supreme Court nominations. Other memorable tweets, the Library of Congress revealed, include the very first, by Twitter’s co-founder Jack Dorsey, and Barack Obama’s election-night victory message.

“Over the years, tweets have become part of significant global events around the world—from historic elections to devastating disasters,” Twitter’s cofounder, Biz Stone, wrote on the company’s blog. “The open exchange of information can have a positive global impact.  This is something we firmly believe and it has driven many of our decisions.”

The Library will oversee the data as part of a congressionally mandated program to archive digital information. An article from the New York Times reports that over 167 terabytes of data have already been captured, equivalent to the institution’s entire collection of books, around 21 million in all. Alongside preservation, the Library will allow access to the cache for academic research, primarily as a source of social, political and cultural material. As it states on its Web site, the donation follows a Library tradition of collecting accounts from the “man on the street.”

Yet, as Paul Saffo, a Stanford scholar specializing in technology’s effect on society, told the New York Times, “Your indiscretions will be able to be seen by generations and generations of graduate students.”

That prospect has raised anxiety among Web users. “It seems like a serious breach of trust on the part of Twitter,” Frank Romano, FCLC ’13, said. “No one who posted thought their words would enter the historical record.”

Included in Twitter’s announcement, however, were several conditions to preempt privacy concerns. The announcement states that tweets will be stored only after a six-month delay and will be limited to “internal library use, non-commercial research, public display by the library itself and preservation.”

Others doubt the donation’s significance. Michael Landon, FCLC ’13, considered it “utterly useless.” His disapproval was echoed by Charles Chawalko, FCLC ’10, who believes that “Twitter does not belong anywhere near the [writings of the] Founding Fathers or the works of Abraham Lincoln.”

Nevertheless, all public Twitter posts will be made searchable by Google, which has introduced ‘Replay,’ a search tool able to pinpoint tweets down to the minute they were posted. Google, which revealed the tool on its official blog, currently allows searches as far back as February 11, 2010, but will eventually extend this to “the very first Tweet.”

According to Google, Twitter “creates a history of commentary that can provide valuable insights into … how people react. We want to give you a way … to make it useful.”