The past is now digital

Historians grapple with the scale and scope of the online world.

Interview by Joanna Dawson

Posted September 21, 2020

In his recent book History in the Age of Abundance? How the Web is Transforming Historical Research, Ian Milligan explores how the historian’s practice is profoundly affected by Internet-based sources such as web pages, digitized materials, and social media. Canada’s History Society director of programs Joanna Dawson spoke with the University of Waterloo associate professor about the challenges and opportunities facing historians today and in the future.

Can you paint a picture of the world we’re living in as it relates to historical research? How will a future historian studying, for example, Canada’s Idle No More movement need to approach sources differently than someone studying a social movement of the 1960s?

In some respects, a historian studying a contemporary social movement in, say, ten or twenty years and a historian studying Canada’s 1960s (the topic of my first book) are not going to ask terribly different questions of their sources. Historians will still ask big questions: What historical processes were unfolding? What was the context of given documents and individuals? Why were certain events happening? What elements of dramatic change or surprising continuity might we learn about?

The difference, however, comes in how we approach and make sense of the sources before us, notably web archives and other born-digital records. Web archives are preserved copies of old web pages that you might find at places such as the Internet Archive; born-digital records, more broadly, are documents that began life as a digital object — such as a tweet, a Word file, or a website. These sources are different for two main reasons: their scale and their scope.

By scale, I refer to their sheer size: The Internet Archive alone currently has somewhere around nine hundred billion web pages and more than sixty petabytes of unique data (a petabyte being one thousand terabytes). Compared to a traditional library or archive, the Internet Archive represents a sheer accumulation of information the likes of which we have never before seen as a society. And, of course, alongside the Internet Archive are many other national libraries (including Library and Archives Canada, the British Library, and the U.S. Library of Congress) that are collecting terabytes of their own.

Related to this is the issue of scope. Imagine who is publishing these websites or tweets. Corporations and governments are part of this new record, but, more importantly to me, so are everyday people sharing their thoughts, loves, passions, hatreds, commentary, and beyond.

Your example of Idle No More bears this out. Alongside traditional sources of historical knowledge — from government records, to media, to interviews — researchers in the future will be able to draw on hundreds of thousands of tweets, blogs, activist websites, and camera video footage uploaded to the cloud. Not all of it will be preserved; but, at the scale we are looking at, even a small minority will dwarf any previous record.

What are the main opportunities of studying history in the digital age, and what are the main challenges?

The opportunities and challenges are two sides of the same coin. Scale and scope, noted above, represent the key opportunities before us. Just imagine: By being able to draw on all of these voices who have been traditionally left out of the archival record — everyday people, activists, people tweeting about their classes or meals, and beyond — we will all benefit from this more democratic source base.

I don’t want to seem unduly utopian; the source base is still biased. For example, people are more or less represented within this new digital record due to digital divides centred around race, ethnicity, class, and beyond. This was true in the 1990s, and it is true today. Yet bias is present in the production, preservation, and access of all historical records. Good historians always think about how their sources might be influenced.

The challenges are considerable. Most historians don’t think about gigabytes of data, let alone terabytes or petabytes. Doing this requires a few things: first, high-performance computing — think servers, cloud computing, code that is optimized to run quickly — and, secondly, the training and skills to effectively use this sort of infrastructure.

Part of your book explores the ethical concerns around using web archives for historical research. How are these concerns unique to born-digital sources?

I really grapple with this throughout the book, because the ethical concerns are so complicated. The expansive scope of web archives means that they need to be handled with care. Web archives are full of material created by everyday people who for the most part have no idea that the web page they wrote in 1998 or the discussion-board post they uploaded in 2003 might still remain for anybody to read and to study.

One example that puts this into perspective for me is a post, which I found in a web archive dating back to December 1995, by an eleven-year-old boy writing about one of his favourite video games. I know it was an eleven-year-old boy, because it was me! Before web archives, the odds of the random musings of a child ending up in a library or archives were, for very good reason, very low. But now this kind of ephemeral material is collected as a matter of course, which is fantastic for a social or childhood historian but also raises all sorts of interesting and thorny ethical questions.

Even more interestingly, thanks to the expansive legal-deposit powers of national libraries in many European countries like England, France, or Denmark, an eleven-year-old kid writing about games in those countries would see their digital posts preserved by law in perpetuity in these national memory institutions. This is uncharted territory for the historical record.

I never gave consent as an eleven-year-old child, however; nor did I think about the future life of this material. In 1995, the Internet Archive didn’t yet exist, and neither did modern search engines. The world of discoverability has changed over the last two decades. Now I can google to find personal information on the web within seconds.

There are no easy answers for how historians need to approach this new world. With traditional archives, historians rely on the hard work of archivists in assembling these collections. There are donor agreements, access restrictions if need be, and the donor individual, family, or organization should realize that if they gave their material to an archive it might be used. The scale and scope of the Internet Archive and other web archives make that more or less impossible, although there are opt-out mechanisms. (If you find something you want to have removed from an archive, you can write to them.)

I therefore approach working with web archives a bit like oral historians consider their ethical approaches: The onus is on historians. I’m not sure if formalizing the process with an institutional review board, like oral historians at Canadian universities need to do, is the right approach; but historians need to consider factors such as whether material was posted in public at the time as well as the identifiability of the people we write about in our archives.

What changes do you think are needed within universities to ensure that future historians are trained sufficiently to work with web archives?

Let me be provocative — historians love content, and, if you look at the way university courses are arranged, they are very content-heavy. Think post-Confederation Canadian history, not close reading of government memos. Universities offer a few methods courses, but they’re rare.

We provide so much flexibility for students to follow their passions that I don’t think we always give them enough structure to learn how to be great historians. In other words, I worry that we don’t scaffold knowledge so as to sufficiently train historians in many aspects of the trade. That’s especially challenging when it comes to web archives. All historians need technical knowledge now, whether they want to study the 1990s or are studying nineteenth-century European or Ontario history.

To deal with this digital age, we need specific skills training. Personally, I would have mandatory courses — at least one in each of the second, third, and fourth years — to teach some of these key components of being a historian.

What kind of new digital tools are available to historians today? For example, we’ve digitized all past issues of The Beaver and Canada’s History magazine using optical character recognition, so that the text can be searched. How could a historian approach this material differently when using new research methods?

That’s a perfect question, and I think it dovetails with what we were just discussing. From your flipbook interface, you can imagine doing some pretty good needle-in-a-haystack-type searches. What really excites me, however, is the potential of downloading the PDFs of hundreds of issues.

Scholars could use digital methods to pull out all of the images, for example, and then see how they have changed over the last hundred years; or they could use large-scale text analysis to see how understandings of history have changed here in Canada.

When do we see social-historical issues mirrored within the pages of The Beaver, or increasingly diverse authors, and so forth? It’s really exciting!
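
To make the idea of large-scale text analysis a little more concrete, here is a minimal sketch in Python of one possible approach: counting how often a single word appears per thousand words in each decade of a magazine's run. It assumes the text of each issue has already been extracted from the PDFs. The function name term_rate_by_decade and the sample issues below are hypothetical and invented purely for illustration; they are not real excerpts from The Beaver, and this is not a method described in the book.

```python
import re
from collections import Counter

# Hypothetical stand-in data: (publication year, extracted issue text).
# These snippets are invented for illustration, not real magazine content.
SAMPLE_ISSUES = [
    (1925, "fur trade fur posts"),
    (1928, "trade routes north"),
    (1995, "treaty rights and fur"),
]

WORD_PATTERN = re.compile(r"[a-z']+")


def term_rate_by_decade(issues, term):
    """Return {decade: occurrences of `term` per 1,000 words}.

    `issues` is an iterable of (year, text) pairs. Matching is
    case-insensitive and on whole words only.
    """
    term = term.lower()
    hits = Counter()    # decade -> number of times the term appears
    totals = Counter()  # decade -> total number of words seen
    for year, text in issues:
        words = WORD_PATTERN.findall(text.lower())
        decade = (year // 10) * 10
        totals[decade] += len(words)
        hits[decade] += sum(1 for word in words if word == term)
    # Skip decades with no words to avoid dividing by zero.
    return {
        decade: 1000 * hits[decade] / totals[decade]
        for decade in sorted(totals)
        if totals[decade] > 0
    }


if __name__ == "__main__":
    for decade, rate in term_rate_by_decade(SAMPLE_ISSUES, "fur").items():
        print(f"{decade}s: {rate:.1f} per 1,000 words")
```

Normalizing by the total word count in each decade, rather than reporting raw counts, keeps a decade with more or longer issues from looking like a spike in interest. A real project would also need to account for errors introduced by optical character recognition.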

The title of the book is History in the Age of Abundance? Why does it end with a question mark?

The question mark suggests that we are not in a settled place. My book is an early contribution to this ongoing reshaping of the historical profession, and the conversation isn’t over. A book published in 2019 isn’t going to be a decisive settling of what history will look like in the age of abundance but, rather, the opening of a conversation.

I’m thrilled that through this interview, and through other opportunities to talk about the work with both academic historians and those in the broader historical community, the conversation seems to be accelerating.
