Wednesday, March 25, 2015

I tried to use the Internet to do historical research. It was nearly impossible.

February 17, 2015.
How do you organize so much information? So far, the Internet Archive has archived more than 430,000,000,000 web pages. It’s a rich and fantastic resource for historians of the near-past. Never before has humanity produced so much data about public and private lives – and never before have we been able to get at it in one place. In the past it was just a theoretical possibility, but now we have the computing power and a deep enough archive to try to use it.

But it’s a lot more difficult to understand than we thought. "The ways in which we attack this archive, then, are not the same as they would be for, say, the Library of Congress. There (and elsewhere), professional archivists have sorted and cataloged the material. We know roughly what the documents are talking about. We also know there are a finite number. And if the archive has chosen to keep them, they’re probably of interest to us. With the internet, we have everything. Nobody has – or can – read through it. And so what is “relevant” is completely in the eye of the beholder."

Historians must take new approaches to the data. No one can read everything, or even know what is in the archive. Better samples, chosen specifically for their historical importance, can give us a much better understanding. We need to ask better questions about how sites are constructed and what links exist between them, and we need to run more focused searches. And we need to know what questions to ask.
