The monumental task of archiving the Trump internet

The record, fragile and unwieldy, is at great risk of disappearing.

Celeste Kaufman

The past four years have been an inflection point in American history, and President Donald Trump will be studied extensively by experts seeking to look back on an era defined by the internet. Yet, this is the president whose historical record is most at risk of disappearing.

Partly because of how technology happened to be evolving, regardless of who was in the White House, and partly because of fans who bragged about memeing their way to the presidency, the most crucial materials of this historical record are online, where they are easily lost.

In order to understand the Trump presidency, one must understand Twitter fights and bots; fake news and deep fakes; 4chan and the #resistance; and Pepe the Frog and “Pantsuit Nation.” But these records, if they exist at all, are ephemeral, fragile, and unwieldy. The average lifespan of a website is not very long, and social media content is even more fleeting and often private. 

Archiving Trump’s social media presence—and the online context around his presidency—will help historians catalog years shaped by social media’s dark side. As Trump himself tweeted in 2017, his “use of social media is not presidential—it’s MODERN DAY PRESIDENTIAL.”

A nation that can’t agree on underpinning facts must make judgment calls on how to frame a dizzying four years. And so, groups of government organizations, nonprofits, and independent archivists have been taking on the Sisyphean task of recording the collective American experience of the past four years so historians can study genuine artifacts unfiltered by opinion or agenda. 

To start with the official government record, a group of organizations including the Library of Congress and the Internet Archive is undertaking its “End of Term Crawl,” the process by which web crawlers digitally archive web pages from government entities over the course of a presidency.

Beyond that, what’s considered part of the public record is a gray area. For a president whose every tweet made news, the question of whether or not Trump’s tweets fall under the purview of the Presidential Records Act of 1978, and therefore must be preserved, is on the table.

The Trump administration has stated that it’s following all laws and regulations relating to the preservation of its social media presence, but like most social media archiving, the work has largely fallen to independent projects like the Trump Twitter Archive and Factba.se. These records include everything up to Trump’s permanent suspension earlier this month; however, the videos of his remarks about the insurrection that led to his suspension cannot be played.

Meanwhile, the U.S. National Archives also said last week that it will “receive, preserve, and provide access to all official Trump Administration social media content” from Trump’s now-suspended @realDonaldTrump account and the @POTUS account.

The National Archives said the White House has used “an archiving tool that captures and preserves all content” and that the information will be transferred to them this week and put on a “Trump Library” website.

Preserving Trump’s own social media content has been challenging enough on its own, but the task of archiving all social media content related to him has proved impossible. The Library of Congress admitted defeat in maintaining a complete Twitter archive less than a year into Trump’s presidency, and since then the go-to method within and outside of the Library has been indexing tweets under particular topics deemed worthy of study.

DocNow, a social media archiving organization, hosts what’s known simply as the Catalog, a collection of Twitter datasets organized under categories like “#metoo digital media collection” or “The 1619 Project.” It contains 113 datasets totaling more than 2.5 billion tweets, but an archive that is not comprehensive is inherently subjective. There are many uphill battles when it comes to preserving social media, and deciding which conversations and points of view deserve to be archived is just one of them.

Social media platforms’ business interests and terms of service often conflict with the goals of an archivist, and independent researchers have to overcome not only technical challenges but also ethical concerns about rounding up materials that their owners might consider private, even when posted publicly.

Projects tend to focus on Twitter, the modern first draft of history, but the content behind locked profiles on sites like Facebook or alternative platforms like Parler is largely going uncataloged. Message boards, another crucial forum for conversation related to the Trump presidency and one that relies on anonymity to thrive, are similarly inaccessible. 4chan and its offspring are designed to disappear, deleting all posts in a thread past 15 pages. A few archiving projects, like 4plebs, try to counteract that, but inevitably primary sources are falling through the cracks.

Memes are being archived surprisingly well, though.

All across the political spectrum, memes have been a tool to entertain, lionize, humiliate, and enrage Trump, his supporters, and his opponents. What used to be the currency of only those who are extremely online started making headlines as the president engaged with them on Twitter for all to dissect. 

Memes are a critical tool for understanding how the American people processed the Trump presidency. The website Know Your Meme is diligently organizing the history of the last five years through a meme lens.

“It might sound silly, but I can totally see future historians using Know Your Meme as a source for understanding what was happening during this period in time,” Don Caldwell, editor-in-chief of the site, said. “Memes communicate ideas in a language that goes beyond what can be captured with traditional media, and they should be documented.”

His team not only traces the origin of individual memes, but also creates encyclopedic entries of memes relating to a particular topic, including Trump and the elections of 2016 and 2020. The site is then regularly archived in the Library of Congress’ Web Culture Collection and the Internet Archive.

In the process, Know Your Meme is also helping to archive social media content and 4chan posts, using tools like Archive.IS to create a more permanent snapshot of meme examples to authenticate their work.

The archives of social media may be largely incomplete, but this organizing of data into thematic, narrative collections will make a historian’s work easy… compared to making sense of the incomprehensible amount of materials created by attempts to preserve the internet as a whole, at least.

The Web Culture Collection alone has over 18 billion digital documents, and the Internet Archive—the most comprehensive project—processes tens of millions of web pages a day. There is a tremendous amount of work dedicated to safeguarding these fragile materials from becoming obsolete in the face of evolving technology, like the Internet Archive’s recent success in creating a Flash emulator to enable people to play defunct Flash animations in archived materials that would have otherwise been lost for good.

But the job of encapsulating the entire internet leaves little time for focusing on making the massive amount of data user-friendly. Historians will have to already know exactly what they’re looking for in order to even dip a toe into the oceanic archives.

There will also be another hurdle when studying the Trump era: determining which of those materials are real, and which are fake.

“Everyone is trying to figure out what the right balance is here,” said Mark Graham, director of the Wayback Machine at the Internet Archive. “We have a sense of responsibility to history so we want the material to be made available, but we are also working to provide context for it.” 

They recently introduced context banners, notices at the top of archived pages that alert the viewer to things like updates and retractions, but wouldn’t exactly label a media story as fake news. The editors at Know Your Meme include sections on Twitter bot activity and astroturfing where applicable, but other projects focused solely on data mining have noticed Twitter’s new disinformation tags are not included in a tweet’s metadata.

But the very act of preserving these materials is also a preemptive strike against future claims that these past four years did not actually happen the way they did. 

“This is exactly why archiving is such an important practice,” Caldwell said. “We need to preserve authentic, verified evidence of what went down.”

Archiving the internet of the Trump era is a messy, imperfect project largely undertaken by scrappy organizations and determined amateurs, but its importance cannot be overstated. 

“The vast majority of human conversations take place online,” Graham said. “If we care about our history and our future then we should care about archiving, we should care about access to information.”


The Daily Dot