Friday, February 10, 2006

Weekly readings - 10 February 2006

AIHT: Conceptual Issues from Practical Tests. Clay Shirky. D-Lib Magazine. December 2005.

The Archive Ingest and Handling Test (AIHT) is a project of the National Digital Information Infrastructure and Preservation Program (NDIIPP), sponsored by the Library of Congress. The idea is that giving the same complete digital archive to a variety of participants shows which aspects of digital preservation are general and which are institution-specific. The test was also meant to:

· test and assess the feasibility of transferring digital archives from one institution to another,
· document useful practices,
· discover which parts of the handling of digital material can be automated, and
· identify areas that require further research or development.

George Mason University's collection of 9/11 materials was selected as the test archive. Some of the lessons learned are:

· During ingest, identifiers must be independently verified, for example by determining whether files labeled .jpg are really in JPEG format.
· The desirability of a digital object will be more closely related to the value of its contents than to the quality of its metadata.
· Donors are likely to force preserving institutions to choose between accepting imperfectly described data or receiving no data at all.
· Even a small percentage of exceptions in operations on a large archive can create a large problem.
· Maintaining digital materials for the short term is relatively simple; maintaining them for the long term is more difficult.
· A preserved bit stream can become unusable because the complex system around it changes over time: hardware, software, OS, etc.
· Multiple preservation strategies provide the best hedge against unforeseen systemic failure.
· Using a variety of strategies to preserve data may be a better strategy than putting all efforts toward one single preservation system.
· There is a need for continual comparative testing of preservation tools and technologies.
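
The first lesson above, checking that a file labeled .jpg really is a JPEG, can be sketched in a few lines of Python. This is only a first-pass check under stated assumptions: the function names and the small signature table are illustrative and are not taken from the AIHT report. The table compares a file's leading bytes against well-known format signatures, and a match does not prove the file is fully valid.

```python
from pathlib import Path
from typing import Iterable, Iterator, Optional

# Well-known leading "magic" bytes for a few common formats.
# The table is deliberately small and illustrative, not exhaustive.
SIGNATURES = {
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".gif": b"GIF8",
    ".pdf": b"%PDF-",
}


def extension_matches_content(name: str, head: bytes) -> Optional[bool]:
    """Compare a file's extension with its leading bytes.

    Returns True or False when the extension is in SIGNATURES,
    and None when the extension is unknown and cannot be judged.
    """
    expected = SIGNATURES.get(Path(name).suffix.lower())
    if expected is None:
        return None
    return head.startswith(expected)


def find_mislabeled(paths: Iterable[str]) -> Iterator[str]:
    """Yield the paths whose leading bytes contradict their extension."""
    for path in paths:
        with open(path, "rb") as fh:
            head = fh.read(16)  # longest signature above is 8 bytes
        if extension_matches_content(path, head) is False:
            yield path
```

Files with unknown extensions are reported as "cannot judge" rather than as failures, so that they can be routed to manual review. In an archive with many files, even a small share of such exceptions adds up to a large review queue.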


Harvard's Perspective on the Archive Ingest and Handling Test. Stephen Abrams, et al. D-Lib Magazine. December 2005.

Harvard has operated a preservation repository for over 5 years. It contains over 3 million objects and takes up over 12 TB of space. The repository is intended for “highly ‘curated’ digital assets; that is, those that are owned and submitted by known users, created according to well-known workflows and meeting well-known technical specifications, and in a small set of approved formats.” They intend to eliminate these restrictions and make it an institutional repository. Some of the issues they will face in the future include:

· Automated extraction of technical metadata from digital objects
· Automated generation of Submission Information Packages (SIPs)
· Systematic preservation migrations
· Post-migration quality assurance testing
· Metadata models for capturing provenance information

The input method used the JHOVE program to validate file types and to generate some of the technical metadata. They also chose to migrate image files to JPEG 2000. The data model did not provide a way to capture provenance metadata, but they decided they would look at PREMIS for this purpose. The format migration process was fully automated once the appropriate specifications were developed. In general, the test showed that in spite of scaling problems, “digital content can be transferred without loss between institutions utilizing radically different preservation architectures and technologies.” The limiting factor in transferring large amounts of data appears to be the number of objects, rather than their individual or total size.


Research books its place in the library of the future. IST Results. 1 Feb 2006.

Digital preservation is one of the three major research areas of the European Digital Library. Audiovisual material is particularly vulnerable to loss, mostly because of technological obsolescence. The PrestoSpace project found 60 different video formats, which compounds the problem. This project, with Media Matters, has created a method of transferring the obsolete media into digital data. Other parts of the project include a database listing all the known characteristics of video tapes by type and year, and an algorithm for the restoration of video and audio materials.


Technology victim: Western Union sends its last telegram. Todd R. Weiss. Computerworld. February 03, 2006.

Western Union has delivered its last telegram. This means of communication began over 155 years ago but has since been replaced by others. Over 200 million telegrams were delivered in 1929, and only 20,000 were delivered last year.


Interview as learning tool
… Michael Yunkin. digitize everything. February 3rd, 2006.

An interview about digitization and preservation with some comments that are worth reading. Excerpts include: Digitization is not preservation, except possibly when the source material is fragile and using a digital surrogate can help avoid overuse. "The increased access that comes with digitization IS added value."
