Sunday, April 29, 2012

An Overview of Web Archiving.

An Overview of Web Archiving. Jinfang Niu. D-Lib Magazine. March/April 2012.
An article on the methods that a variety of universities and other institutions use to select, acquire, describe, and provide access to web resources for their archives. Some notes from the article:
  • Web archiving is the process of gathering up data that has been recorded on the World Wide Web, storing it, ensuring the data is preserved in an archive, and making the collected data available for future research.
  • The workflow of web archiving includes appraisal and selection, acquisition, organization and storage, description and access. This workflow is the core of web archiving.
  • Creating a web archive presents many challenges.
  • When archiving web content through web crawling programs, selection decisions are the basis for compiling the list of sites to crawl and for configuring crawler parameters. For some materials, crawling may replace deposit.
  • In acquiring web resources, the decision of whether to seek permission from copyright owners depends on the legal environment of the web archive, the scale of the web archive, and the nature of archived content and the archiving organization.
  • Web archives need to preserve the authenticity and integrity of archived web content. The concept of provenance is important.
  • The library must decide how it will generate, store, and use metadata, and how it will make that metadata available to others.
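
The notes on integrity, provenance, and metadata can be made concrete with a small sketch. The Python below is not taken from the article; it is one possible illustration, assuming a hypothetical `CaptureRecord` structure that stores where and when a resource was captured together with a SHA-256 digest of the stored bytes. The field names, the `make_record` and `verify` helpers, and the crawler label are all invented for this example.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class CaptureRecord:
    """Hypothetical provenance record for one archived web resource."""
    url: str          # address the content was captured from
    captured_at: str  # capture time in ISO 8601 format (UTC)
    crawler: str      # label for the tool that made the capture (illustrative)
    sha256: str       # digest of the payload exactly as stored


def make_record(url: str, payload: bytes, crawler: str) -> CaptureRecord:
    """Build a record at capture time, including a digest for later checks."""
    digest = hashlib.sha256(payload).hexdigest()
    now = datetime.now(timezone.utc).isoformat()
    return CaptureRecord(url=url, captured_at=now, crawler=crawler, sha256=digest)


def verify(record: CaptureRecord, payload: bytes) -> bool:
    """Return True if the stored payload still matches the recorded digest."""
    return hashlib.sha256(payload).hexdigest() == record.sha256


if __name__ == "__main__":
    body = b"<html><body>Example page</body></html>"
    rec = make_record("http://example.org/", body, crawler="example-crawler/0.1")
    print(json.dumps(asdict(rec), indent=2))  # metadata that could be shared
    print(verify(rec, body))                  # unchanged payload
    print(verify(rec, body + b" "))           # altered payload
```

Recomputing the digest later and comparing it with the recorded one is a common way to detect that stored content has changed. It supports integrity checking only; it does not by itself establish that the capture was authentic in the first place.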
