Friday, April 02, 2010

Digital Preservation Matters - April 2, 2010

Avoiding a Digital Dark Age. Kurt D. Bollacker. American Scientist. March-April 2010.

Data longevity depends on both the storage medium and the ability to decipher the information

The general problem of data preservation is twofold. The first matter is preservation of the data itself: The physical media on which data are written must be preserved, and these media must continue to accurately hold the data that are entrusted to them. This problem is the same for analog and digital media, but unless we are careful, digital media can be more fragile.

The second part of the equation is the comprehensibility of the data. Even if the storage medium survives perfectly, it will be of no use unless we can read and understand the data on it. Unlike in the analog world, digital data representations do not inherently degrade gracefully, because digital encoding methods represent data as a string of binary digits (“bits”). Because any single piece of digital media tends to have a relatively short lifetime, we will have to make copies far more often than has been historically required of analog media. Like species in nature, a copy of data that is more easily “reproduced” before it dies makes the data more likely to survive.

In order to survive, digital data must be understandable by both the machine reading them and the software interpreting them. There are at least two effective approaches: choosing data representation technologies wisely and creating mechanisms to reach backward in time from the future.


A Survey of the Scholarly Journals Using Open Journal Systems. Brian D. Edgar, John Willinsky. Educause Resources. March 4, 2010. [40 p. PDF]

Open Journal Systems (OJS) is an open source, online journal management and publishing platform. This study looks at scholarly communication using open source software, drawing on a survey to which 998 editors or staff members responded. The results point to how these journals – largely independent, scholar-published titles with roughly half originating in the developing world – are not otherwise represented. Of the journals surveyed, 40 percent published research in the sciences, technology and medicine, 30 percent were social science journals, and 11 percent were in the humanities. The remaining 19 percent were interdisciplinary.

The number of journals using OJS has been growing at an average rate of 81% per year, and new journals that are starting up are using OJS at a rate of 47%. About half the journals using OJS are born digital. The study looks at the effect that open source tools can have on journal publishing, and adds to the case for rethinking scholarly communication.


Ensuring Perpetual Access: establishing a federated strategy on perpetual access and hosting of electronic resources for Germany. The Alliance of German Science Organisations. Final Report in English. March 30, 2010. [177p. PDF.]

The growing volume of digital content is a challenge for scientific institutions. This study is a basis for a national hosting strategy to “establish and finance sustainable structures for perpetual access as well as long-term preservation for electronic resources.” Research is critical to the economy, and the large investments made in research need to be safeguarded and maintained. Any loss can impair research, and ensuring future access is an important challenge. One of the largest gaps is the “provision for perpetual access for e-journals.” Library access via hosting on publishers’ servers is not “sufficiently robust as a single perpetual access solution long-term,” though it may be the immediate approach. Independent perpetual access through partners such as Portico is needed. There needs to be a “strategy to create an infrastructure for the storage and long-term preservation of digital documents, and which can guarantee perpetual access to licensed commercial publications and retro-digitised library materials.” PDF and XML with the NLM-DTD are becoming a metadata standard for published material.


Jhove2-0.6.0 Download. Website. March 19, 2010.

A new alpha release of JHOVE2 is now available for download and evaluation. Some features include:

  • Format identification, validation, feature extraction, and message digest.
  • Recursive processing of directories, file sets, etc.
  • Integration with DROID for file identification.
  • Results formatted as text and XML.
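
To give a feel for what the features listed above involve, here is a minimal Python sketch of recursive directory processing with signature-based format identification and a message digest for each file. It does not use JHOVE2 or DROID and is not their API; the function names and the tiny signature table are assumptions made up for this illustration, and real tools rely on a far larger signature registry and perform validation as well.

```python
import hashlib
from pathlib import Path

# Hypothetical, deliberately tiny signature table (an assumption for this
# sketch); real identification tools use a much larger registry.
MAGIC_SIGNATURES = {
    b"%PDF-": "PDF",
    b"\x89PNG\r\n\x1a\n": "PNG",
    b"GIF87a": "GIF",
    b"GIF89a": "GIF",
}


def identify_format(header: bytes) -> str:
    """Guess a format name from the leading bytes of a file."""
    for magic, name in MAGIC_SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def characterize(root):
    """Walk root recursively and describe every regular file found."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip directories and other non-file entries
        data = path.read_bytes()
        results.append({
            "path": path.relative_to(root).as_posix(),
            "format": identify_format(data[:16]),
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return results
```

Calling characterize on a folder returns one record per file, which could then be printed as plain text or serialized to XML. Reading each file fully into memory keeps the sketch short; for large collections the digest would normally be computed in chunks.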


