Monday, April 30, 2012

University of Utah Selects Ex Libris Rosetta for Long-Term Digital Preservation

University of Utah Selects Ex Libris Rosetta for Long-Term Digital Preservation. Press Release. April 30, 2012.
Ex Libris is pleased to announce that the University of Utah has opted for Ex Libris Rosetta to preserve the school's extensive cultural heritage collections, which include newspapers and other historical textual documents, photographs, rare books, oral history interviews (including transcripts and audio), motion picture collections, and more. In addition to cultural heritage collections, Rosetta will enable the University of Utah to preserve faculty publications and research data. The J. Willard Marriott Library hosts the collections of many campus departments and, as a member of the Mountain West Digital Library network, hosts collections belonging to other Utah institutions.

Sunday, April 29, 2012

An Overview of Web Archiving.

An Overview of Web Archiving. Jinfang Niu. D-Lib Magazine. March/April 2012.
An article on the methods that universities and other institutions use to select, acquire, describe, and provide access to web resources in their archives. Some notes from the article:
  • Web archiving is the process of gathering up data that has been recorded on the World Wide Web, storing it, ensuring the data is preserved in an archive, and making the collected data available for future research.
  • The core workflow of web archiving includes appraisal and selection, acquisition, organization and storage, description, and access.
  • Creating a web archive presents many challenges.
  • When archiving web content through web crawling programs, selection decisions are the basis for compiling a site list to crawl and configuring crawler parameters. Crawling may replace deposit for some things.
  • In acquiring web resources, the decision of whether to seek permission from copyright owners depends on the legal environment of the web archive, the scale of the web archive, and the nature of archived content and the archiving organization.
  • Web archives need to preserve the authenticity and integrity of archived web content. The concept of provenance is important.
  • The library must decide how it will generate, store, and use metadata, and how it will make that metadata available to others.
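The workflow above can be sketched in code. This is a minimal illustration, not any institution's actual pipeline: the `appraise` and `acquire` functions, the scope rule, and the stubbed fetcher are all hypothetical. The shape, though — selection against a collecting policy, then capture with fixity and provenance metadata — follows the steps the article describes.

```python
import hashlib
import time

def appraise(candidates, in_scope):
    # Selection: keep only candidate seeds that match the collecting policy.
    return [url for url in candidates if in_scope(url)]

def acquire(url, fetch):
    # Acquisition: capture the resource and record provenance metadata
    # (capture time) plus a fixity checksum for integrity checking.
    body = fetch(url)
    return {
        "url": url,
        "captured_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "sha256": hashlib.sha256(body).hexdigest(),
        "length": len(body),
    }

# Usage with a stubbed fetcher (no network access needed for the demo).
seeds = appraise(
    ["http://example.org/", "http://spam.example/"],
    in_scope=lambda url: "example.org" in url,
)
records = [acquire(u, fetch=lambda u: b"<html>demo</html>") for u in seeds]
```

A real crawler would of course fetch over HTTP and write standard container formats such as WARC, but the selection-then-capture structure is the same.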

Web Archives for Researchers: Representations, Expectations and Potential Uses.

Web Archives for Researchers: Representations, Expectations and Potential Uses. Peter Stirling, et al. D-Lib Magazine. March/April 2012.
Web archiving is one of the missions of the Bibliothèque nationale de France. This study looks at content and selection policy, services and promotion, and the role of communities and cooperation.  While the interest of maintaining the "memory" of the web is obvious to the researchers, they are faced with the difficulty of defining, in what is a seemingly limitless space, meaningful collections of documents. Cultural heritage institutions such as national libraries are perceived as trusted third parties capable of creating rationally-constructed and well-documented collections, but such archives raise certain ethical and methodological questions.

To find source material on the web, some researchers look for non-traditional sources, such as blogs and social networks.  Researchers recognize the value of web archives, especially because websites disappear or change quickly.  The Internet is no longer just a place for publishing things, “but rather the traces left by actions that people could equally perform in the streets or in a shop: talking to people, walking, buying things... It can seem improper to some to archive anything relating to this kind of individual activity. On the other hand, one of the researchers acknowledges that archiving this material would provide a rich source for research in the future, and thus compares archiving it to archaeology.”  Some ask, "How do you archive the flow of time?" New models may be needed. And when selecting material for an archive, the selection criteria should also be archived, as they may change over time.

The secrets of Digitalkoot: Lessons learned crowdsourcing data entry to 50,000 people (for free).

The secrets of Digitalkoot: Lessons learned crowdsourcing data entry to 50,000 people (for free). Tommaso De Benetti.  Microtask. June 16, 2011.
The National Library of Finland launched a project called Digitalkoot, a test of crowdsourcing with 50,000 volunteers.  The aim was to digitize the National Library's archives and make them searchable over the internet. The volunteers input data that Optical Character Recognition (OCR) software struggles with (for example, documents that are handwritten or printed in old fonts). Digitalkoot relies on machines, humans, and a gaming twist.
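The human-fallback idea can be sketched as follows. This is a hypothetical illustration of routing low-confidence OCR tokens to volunteers and resolving them by majority vote; the confidence threshold, function names, and voting rule are assumptions, not Digitalkoot's actual design.

```python
from collections import Counter

CONFIDENCE_THRESHOLD = 0.80  # assumed cutoff, not from the project

def needs_human(token):
    # Route a token to volunteers when the OCR engine's confidence is low.
    return token["confidence"] < CONFIDENCE_THRESHOLD

def resolve(token, volunteer_answers):
    # Accept the OCR output directly when confidence is high; otherwise
    # take a majority vote over independent volunteer transcriptions.
    if not needs_human(token):
        return token["text"]
    winner, _count = Counter(volunteer_answers).most_common(1)[0]
    return winner

# A garbled token gets corrected by volunteer consensus:
corrected = resolve({"text": "m0le", "confidence": 0.42},
                    ["mole", "mole", "mo1e"])
```

Redundant independent answers with a vote is a common quality-control pattern in crowdsourced transcription, since any single volunteer can err too.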

Tuesday, April 03, 2012

Dream of perpetual access comes true!

Dream of perpetual access comes true! Jeffrey van der Hoeven. Open Planets Foundation. 
The KEEP project released the final version of its open source Emulation Framework software. The project has brought emulation in the digital preservation context to the next level: user friendliness.  The easy-to-install package runs on all major computer platforms.  It automates several steps:

  1. identify what kind of digital file you want to render;
  2. find the required software and computer platform you need;
  3. match the requirements with available software and emulators;
  4. install the emulator;
  5. configure the emulator and prepare software environment;
  6. inject the digital file you selected into the emulated environment;
  7. give you control over the emulated environment.
The software supports six computer platforms out of the box (x86, Commodore 64, Amiga, BBC Micro, Amstrad, and Thomson) by using seven open source emulators that are distributed with the Emulation Framework.
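The pathway-matching steps (1–3) above might look roughly like this in code. The format-to-platform and platform-to-emulator tables below are illustrative assumptions: VICE, UAE, and Dioscuri are real open source emulators, but the mapping here is a sketch, not the Emulation Framework's actual registry.

```python
# Hypothetical registries; a real framework would identify formats by
# content signatures (magic bytes), not just file extensions.
FORMAT_TO_PLATFORM = {
    "prg": "Commodore 64",  # C64 program file
    "adf": "Amiga",         # Amiga disk image
    "exe": "x86",           # DOS/Windows executable
}
PLATFORM_TO_EMULATOR = {
    "Commodore 64": "VICE",
    "Amiga": "UAE",
    "x86": "Dioscuri",
}

def plan_rendering(filename):
    # Step 1: identify the kind of digital file (crudely, by extension).
    ext = filename.rsplit(".", 1)[-1].lower()
    # Step 2: find the platform the file requires.
    platform = FORMAT_TO_PLATFORM.get(ext)
    if platform is None:
        raise ValueError(f"no emulation pathway for {filename!r}")
    # Step 3: match the requirement to an available emulator.
    return {"platform": platform,
            "emulator": PLATFORM_TO_EMULATOR[platform],
            "payload": filename}
```

The remaining steps (installing and configuring the emulator, injecting the file, and handing control to the user) are operational rather than lookup logic, which is why the automation of those steps is the framework's main contribution.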

With tech breakthrough, Seagate promises 60TB drives this decade

With tech breakthrough, Seagate promises 60TB drives this decade.  Lucas Mearian. Computerworld. March 20, 2012.
Seagate said it has achieved a density of 1 terabit (1 trillion bits) per square inch on a disk drive platter. The technology could lead to the production this decade of 3.5-in. hard drives with up to 60TB of capacity.

As drive manufacturers store more bits per square inch on the surface of a disk, they also tighten the data tracks. The challenge as those tracks tighten is overcoming magnetic disruption between the bits of data, which causes bits to flip their magnetic poles resulting in data errors.
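As a rough back-of-the-envelope check of how areal density translates into capacity (the platter dimensions and surface count below are assumed for illustration, not Seagate's figures):

```python
import math

def platter_capacity_tb(density_tbit_per_in2,
                        outer_d_in=3.7, inner_d_in=1.0, surfaces=2):
    # Assumed usable recording annulus on a 3.5-in-class platter:
    # outer diameter ~3.7 in, inner (spindle/landing) diameter ~1.0 in.
    area_in2 = math.pi * ((outer_d_in / 2) ** 2 - (inner_d_in / 2) ** 2)
    total_tbit = density_tbit_per_in2 * area_in2 * surfaces
    return total_tbit / 8  # terabits -> terabytes

capacity = platter_capacity_tb(1.0)  # at the 1 Tbit/in^2 milestone
```

Under these assumptions, 1 Tbit/in² yields on the order of 2–3 TB per platter, so a 60TB 3.5-in. drive implies either many platters or areal densities well beyond this first milestone, which is consistent with Seagate framing 60TB as an end-of-decade target.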