Thursday, November 13, 2008

Digital Preservation Matters - 14 November 2008

Library of Congress Digital Preservation Newsletter. Library of Congress. November 2008.

There are three interesting items in the November newsletter:

1. The NDIIPP Preserving Digital Public Television Project is building infrastructure, creating standards and obtaining resources. The project is trying to create a consistent approach to digital curation among those who produce PBS programs. Their metadata schema includes four elements: PBCore (a standard developed by and for public media organizations), METS rights, MODS and PREMIS. The goal is to put the content in the Library’s National Audio-Visual Conservation Center where it will be preserved on servers and data tapes. This will support digital archiving and access for public television and radio programs in the US. Many stations are unsure about what to do with their programs for the long term and the American Archive is seen as a solution.

2. Digitization Guidelines: An audiovisual working group will set standards and guidelines for digitizing audiovisual materials. The guidelines will cover criteria such as evaluating image characteristics and establishing metadata elements. The recommendations will be posted on two Web sites.

3. Data Archive Technology Alliance: A meeting was held to establish a network of data archives to help develop shared technologies for the future. They hope to set standards for shared, open-source and community-developed technologies for data curation, preservation, and data sharing. It is critical to clearly define the purpose and outcome of the effort. Those involved will develop a shared inventory of their tools and services, and will also list new developments to enhance data stewardship.

JHOVE2 project underway. Stephen Abrams. Email. November 6, 2008.

The JHOVE tool has been an important part of digital repository and preservation workflows. It has a number of limitations, however, and a group is starting a two-year project to develop a next-generation JHOVE2 architecture for format-aware characterization. Among the enhancements planned for JHOVE2 are:

· Support for signature-based identification, extraction, validation, and rules-based assessment

· A data model supporting complex multi-file objects and arbitrarily-nested container objects

· Streamlined APIs for integrating JHOVE2 in systems, services, and workflows

· Increased performance

· Standardized error handling

· A generic plug-in mechanism supporting stateful multi-module processing

· Availability under the BSD open source license
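As a rough illustration of what "signature-based identification" means in general, the sketch below matches the leading bytes of a file against a small table of well-known format signatures. It is not JHOVE2 code and does not reflect its API; the table, the function names `identify` and `identify_file`, and the choice of formats are assumptions made only for this example.

```python
# Hypothetical signature table: format name -> leading "magic" bytes.
# The byte sequences are the publicly documented signatures for each format;
# the table itself and its contents are chosen only for illustration.
SIGNATURES = {
    "PDF": b"%PDF-",
    "PNG": b"\x89PNG\r\n\x1a\n",
    "GIF": b"GIF8",          # covers both GIF87a and GIF89a
    "JPEG": b"\xff\xd8\xff",
    "ZIP": b"PK\x03\x04",
}


def identify(data: bytes) -> str:
    """Return the first format whose signature prefixes the data, else 'unknown'."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Identify a file by reading only its first few bytes."""
    with open(path, "rb") as handle:
        return identify(handle.read(16))
```

Identification of this kind says only what a file claims to be; validation and assessment, listed separately above, would check whether the rest of the file actually conforms.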

Planetarium - Planets Newsletter Issue 5. 22 October 2008 [PDF]

The newsletter includes several items about Planets (Preservation and Long-term Access through Networked Services), a European project to address digital preservation challenges. Here are a few items from the newsletter:

The Planets project will provide the technology component of the British Library's digital preservation solution.

The preservation planning tool Plato implements the Planets Preservation Planning approach. It guides users through four steps:

  1. define context and requirements;
  2. select potential actions and evaluate them on sample content;
  3. analyze outcomes; and
  4. define a preservation plan based on this empirical evidence.
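The four steps can be pictured as a small decision exercise. The sketch below is not Plato itself, and the newsletter summary does not say how Plato scores alternatives; it simply assumes, for illustration, that requirements carry weights and that each candidate action receives a score per requirement from tests on sample content. Every name and number here is a hypothetical placeholder.

```python
from typing import Dict, List

# Step 1 (assumed form): requirements expressed as weighted criteria.
requirements: Dict[str, float] = {
    "content_fidelity": 0.5,
    "file_size": 0.2,
    "tool_cost": 0.3,
}

# Step 2 (assumed form): scores on a 0-5 scale from trying each candidate
# action on sample content. These values are invented placeholders.
results: Dict[str, Dict[str, float]] = {
    "migrate_to_format_a": {"content_fidelity": 4.0, "file_size": 3.0, "tool_cost": 5.0},
    "migrate_to_format_b": {"content_fidelity": 5.0, "file_size": 2.0, "tool_cost": 3.0},
    "keep_original": {"content_fidelity": 2.0, "file_size": 4.0, "tool_cost": 5.0},
}


def weighted_score(weights: Dict[str, float], scores: Dict[str, float]) -> float:
    """Step 3: combine per-criterion scores into one weighted average."""
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


def rank_actions(weights: Dict[str, float],
                 measured: Dict[str, Dict[str, float]]) -> List[str]:
    """Step 4: order candidate actions from best to worst weighted score."""
    return sorted(measured,
                  key=lambda action: weighted_score(weights, measured[action]),
                  reverse=True)


if __name__ == "__main__":
    for action in rank_actions(requirements, results):
        print(action, round(weighted_score(requirements, results[action]), 2))
```

The point of the exercise is that the recommended plan follows from recorded requirements and measured evidence, so the decision can be revisited when either changes.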

Digital preservation activities can only succeed if they consider the wider strategy, policy, goals, and constraints of the institution that undertakes them. For digital preservation solutions to succeed it is essential to go beyond the technical properties of the digital objects to be preserved, and to understand the institutional framework in which data, documents and records are preserved. The biggest barriers to preservation are:

  1. lack of expertise;
  2. lack of funding; and
  3. lack of buy-in at senior level.

Cisco unveils a router for the 'Zettabyte Era'. Matt Hamblen. Computerworld. November 11, 2008.

Cisco introduced the "Zettabyte Era" and announced the Aggregation Services Router (ASR) 9000, the next generation of extreme networking. The company believes service providers need to prepare for petabytes or even exabytes of data from video applications, which need faster routing. “Instead of needing switching for petabytes or even exabytes of data, the zettabyte will soon be the preferred term.” A zettabyte is 10 to the power of 21 bytes; 10 to the power of 18 bytes is an exabyte.
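For a sense of scale, the decimal (SI) prefixes involved can be worked out directly. The short sketch below uses the standard SI exponents; the names `PREFIX_EXPONENTS`, `bytes_in` and `ratio` are made up for this example.

```python
# Standard SI decimal prefixes and their powers of ten.
PREFIX_EXPONENTS = {
    "kilo": 3,
    "mega": 6,
    "giga": 9,
    "tera": 12,
    "peta": 15,
    "exa": 18,
    "zetta": 21,
}


def bytes_in(prefix: str) -> int:
    """Number of bytes in one unit with the given SI prefix."""
    return 10 ** PREFIX_EXPONENTS[prefix]


def ratio(larger: str, smaller: str) -> int:
    """How many of the smaller unit fit in one of the larger unit."""
    return bytes_in(larger) // bytes_in(smaller)


if __name__ == "__main__":
    print("exabytes per zettabyte:", ratio("zetta", "exa"))
    print("petabytes per zettabyte:", ratio("zetta", "peta"))
```

One zettabyte is therefore a thousand exabytes, or a million petabytes.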

In praise of ... preserving digital memories. Editorial. The Guardian. September 30, 2008.

Some people are thinking centuries ahead. The British Library hosted the iPres conference to work out ways to preserve data for future generations. Since almost everything is now in digital form, this is difficult to do. By 2011 “it is expected that half of all content created online will fall by the wayside.” There is no Rosetta Stone for digital content, but progress is being made.

Skills, Role & Career Structure of Data Scientists & Curators: Assessment of Current Practice & Future Needs. Alma Swan, Sheridan Brown. JISC. 31 July 2008.

This is the report of a study that looks at those who work with data. It identifies four roles, which may overlap:

  • Data Creator: Researchers who produce and are experts in handling, manipulating and using data
  • Data Scientist: Those who work where the research is carried out and may be involved in creative enquiry and analysis
  • Data Manager: Those who take responsibility for computing facilities, storage, continuing access and preservation of data
  • Data Librarian: Librarians trained and specializing in the curation, preservation and archiving of data

There is a continuing challenge to make sure people have the skills needed. The report sees three main potential roles for the library:

  1. Training researchers to be more data-aware
  2. Adopting a data archiving and preservation role and providing services through institutional repositories
  3. Training of data librarians

Having someone else care for the data frees data scientists from the task and allows them to focus on other priorities. Data issues are moving so fast that periodic updating is much more effective than early, intensive training with no follow-up. Some institutions offer training courses and workshops on data-related topics.
