Friday, November 30, 2007

Digital Preservation Matters - 30 November 2007

Council Conclusions on scientific information in the digital age: access, dissemination and preservation. The Council Of The European Union. November 2007.

The Council of the European Union presents conclusions on digital preservation and recommendations for the next few years:

  • access to and dissemination of scientific information is crucial and can help accelerate innovation;
  • effective digital preservation of scientific information is fundamental for current and future development of research;
  • it is important to ensure the long term preservation of scientific information, publications and data, and include scientific information in preservation strategies;
  • monitor good practices for open access to scientific information and develop new models;
  • experiment with open access to scientific data and publications to understand contractual needs;
  • encourage research on digital preservation, and experiments on deploying scientific data as widely as possible, for open access to and preservation of scientific information.


Shifting Gears: Gearing Up to Get Into the Flow. Ricky Erway. OCLC. October 2007.

Efforts to digitize special collections mean we need to take another look at what we are doing. Do we digitize for access or preservation, or both? How do our selection criteria affect the digitizing efforts? Access is important. We should preserve the unique items to the best of our ability, but that doesn’t mean we only have one chance to do it right. We may want to re-digitize when the technology improves. Scan items as part of the initial accessioning process; create a single unified process. Metadata can be improved as needed; it can be an iterative approach. Move to a program approach, not just special projects. It should be part of the regular budget. To do a better job we need to “integrate digitization into all workflows and user services”.


Digital library surpasses initial goal of 1 million books. International Herald Tribune. November 27, 2007.

The Universal Library project has surpassed its latest target, having scanned more than 1.5 million books. At least half the books are out of copyright or scanned with the permission of copyright holders. The library's mission is to make information freely available and to preserve rare and decaying texts. It is the largest university-based digital library of free books and its purpose is noncommercial. The library has books published in 20 languages, including 970,000 in Chinese, 360,000 in English, 50,000 in the southern Indian language of Telugu and 40,000 in Arabic.


Presentations from iPRES - 2007 International Conference on Preservation of Digital Objects. National Science Library. November 2007.

This site contains many PDF files of the presentations given at the October iPRES conference in China. These are interesting to review. Some that I found particularly useful include:

  • Exploring and Charting the Digital Preservation Research Landscape, Seamus Ross
  • Chinese Digital Archival Network of Foreign STM Material, Xiaolin Zhang
  • A Practical Approach to Digital Preservation: Update from PLANETS, Helen Hockx-Yu
  • Challenges of Digital Preservation: Early Lessons from the Portico Archive, Eileen Fenton
  • Developing a CAS E-Journal Archiving System, Zhixiong Zhang
  • Comparative Evaluation of Major IR Systems for Preservation, Ting Zeng
  • New Partnerships for Scientific Data Preservation and Publication Systems, Zhongming Zhu


Towards the Australian Data Commons: A proposal for an Australian National Data Service. The ANDS Technical Working Group. October 2007.

This paper, among other topics, discusses the reasons to focus on data management, the issues, and the programs to deliver the data. While the paper looks specifically at a national data service, there are aspects that are useful for local digital preservation. Here are some interesting notes from it.

  • Important activities include identifying and deploying policies and technologies to allow users to gain seamless access to data collected within multiple institutionally operated repositories.
  • The intent is to provide common services that support researchers, making it easier for them to discover, access, use, analyze, and combine digital resources as part of their activities. The services should also support and advise researchers and research data managers about appropriate digital preservation strategies.
  • We are in a data deluge. It can only continue and grow in intensity as the number, frequency and resolution of data sources rises and as information becomes universally ‘born digital’.
  • Data is an increasingly important and expensive ingredient of research activities and needs increasing attention to be managed efficiently and effectively.
  • The sponsors of data capture and care should help determine the accessibility of the data.
  • Not everyone can use the same solution, so there may need to be multiple responses.
  • There should be a registry of repositories with services offered.
  • Provide assistance to others on adopting the plans and getting the service they need.
  • Collecting and managing the metadata is critical. Best to collect early and automatically.

The data service believes it can contribute most effectively by developing services and activities that enable stewardship within multiple federations of data management and data user communities.

In ten years' time, the service will be successful if:

  • A data commons exists in a network of research repositories and the data is discoverable;
  • Researchers and data managers perform well with well-formed data management policies;
  • More research data is routinely deposited into stable, accessible and sustainable environments;
  • More people have relevant expertise in data management.


Stewardship of digital resources involves both preservation and curation. Preservation entails standards-based, active management practices that guide data throughout the research life cycle, as well as ensure the long-term usability of these digital resources. Curation involves ways of organizing, displaying, and repurposing preserved data.


Friday, November 16, 2007

Digital Preservation Matters - 16 November 2007

Electronic Records Management and Digital Preservation: Protecting the Knowledge Assets of the State Government Enterprise. Eric Sweden. NASCIO. October 2007. [pdf]

Electronic records management and digital preservation must be a shared responsibility, with understanding and support from the CIO. Everyone needs to be part of managing digital assets. These initiatives must be managed at the organizational level. The team needs enterprise architects, project managers, electronic records managers, librarians and archivists to ensure the knowledge assets are managed properly. Technology creates both opportunities and challenges. The goal of digital preservation systems is to make sure the information they contain remains accessible to users over a long period of time. A challenge is to keep bit streams intact and usable long term. You need to know what to preserve and how to preserve the records. The strategy must address preservation for the life of the record. There is not a single best way to preserve digital materials. Digital materials do not allow preservation procrastination. If a record needs to be maintained for over 10 years, the original technology will probably be obsolete. Digital preservation must be a routine operation, not a special event.


RSA 2007: long-term data storage presents legal risks. Ian Grant. Computer Weekly. 23 Oct 2007.

Art Coviello, executive vice-president of EMC, stated at a conference that storing every piece of data long term may place organizations at risk of legal liability. An organization needs to know what data it has, who is looking at it, and what they are doing with it. It should classify data and users before it stores data. This is needed to protect the data and to reduce information clutter.


Keep 'Smoking Gun' E-Mails From Backfiring. H. Christopher Boehning, Daniel J. Toal. New York Law Journal. October 25, 2007.

While this is written from a legal and not archival perspective, the article discusses the importance of validating / authenticating electronic documents. It lists the legal rules for authenticating emails and other electronic documents, including:

  • testimony by a witness with knowledge of the object;
  • circumstantial means ("appearance, contents, substance, internal patterns or other distinctive characteristics, taken in conjunction with circumstances," such as the email address);
  • hash values that serve as a digital fingerprint;
  • comparison to existing documents;
  • self-authentication of items with labels, tags, or ownership marks.
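
The idea of a hash value as a digital fingerprint can be illustrated with a small Python sketch. This is not taken from the article; the function names (fingerprint, file_fingerprint, is_unaltered) and the choice of SHA-256 are my own assumptions, used only to show how a recorded hash lets you check later that a document has not changed.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def file_fingerprint(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(data: bytes, recorded: str) -> bool:
    """Compare the current hash with one recorded earlier."""
    return fingerprint(data) == recorded


if __name__ == "__main__":
    message = b"Please retain all records for this matter."
    recorded = fingerprint(message)  # stored when the record is captured
    print(is_unaltered(message, recorded))          # same bytes
    print(is_unaltered(message + b" ", recorded))   # one extra byte
```

Even a one-byte change produces a different hash, which is why a recorded value can support a claim that a file is the same one that was originally captured.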


The Aftermath: Examining the E-Discovery Landscape After the 2006 Rule Changes. Eric Sinrod. FindLaw. October 16, 2007.

Another article emphasizing the importance of records management plans for electronic data. It mentions that “Data can be located live on networks, servers, hard drives, laptops, PDAs and on backup tapes.” Purging according to retention policies is important. Data may be required in ‘native’ format with all metadata intact.


‘Digital curators’ lead cultural IT projects. Shane Schick. ComputerWorld Canada. 8 Nov 2007.

As cultural organizations try to reach new audiences online and integrate their collections into multimedia-friendly exhibits, they are starting to face the same challenges as others who have been moving away from paper-based processes. These challenges include not only figuring out how to digitize content but also what gets preserved first, what can wait and what doesn’t need to be digitized at all. Institutions face the difficulty of trying to preserve something indefinitely, without knowing how formats might change over time. They must collect the right hardware and software along with the content itself. “Archives are now building in budgets for migration strategies for data.”


Friendly Advice Machine. John Cleese. Iron Mountain. October 2007.

On the lighter side: For those with an interest in digital archiving and secure storage, and a ‘British’ sense of humor, these clips may be of interest.



Friday, November 09, 2007

Weekly Readings - 9 November 2007

HD Photo to become JPEG XR. Stephen Shankland. CNet News. November 2, 2007.
The Joint Photographic Experts Group has approved Microsoft's HD Photo format as a standard called JPEG XR. This is an important step toward making the format vendor-neutral. It is designed for the next generation of digital cameras and was based on Microsoft’s Windows Media Photo format. Microsoft has committed to making the patents available without charge. The standardization process typically takes about a year. (See also http://www.jpeg.org/newsrel19.html).

PRONOM and DROID - new versions released. Neil Beagrie. National Archives UK. November 2, 2007.
The National Archives in the UK has released new versions of PRONOM and DROID. PRONOM is an online registry of file formats, software, and other technical information used for digital preservation purposes, available at http://www.nationalarchives.gov.uk/pronom. DROID (Digital Record Object Identification) is open source software at http://droid.sourceforge.net/ that is used to identify file formats in batch mode. They are freely available.
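
To give a feel for what identifying file formats in batch mode involves, here is a toy Python sketch of signature-based identification. It is not DROID's code and does not use the PRONOM signature file; the small SIGNATURES table and the function names are assumptions chosen for illustration, covering only a handful of well-known file headers.

```python
# A few well-known leading byte sequences ("magic numbers").
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def identify(header: bytes) -> str:
    """Return a format name based on the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_batch(paths):
    """Identify many files at once; returns a dict of path -> format name."""
    results = {}
    for path in paths:
        with open(path, "rb") as handle:
            results[path] = identify(handle.read(16))
    return results


if __name__ == "__main__":
    print(identify(b"%PDF-1.4\n"))
    print(identify(b"plain text"))
```

Identifying by content rather than by file extension matters for preservation because extensions can be missing or wrong. A real tool relies on a much larger, maintained signature registry.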

An overview of LOCKSS, how it works, and issues related to it. (LOCKSS, developed at Stanford University, stands for Lots of Copies Keep Stuff Safe.) One of the main issues surrounding it is trust. “Trusting a single provider, a single institution, and a single archive represents the real risk”. LOCKSS is built on the principle of building confidence in the archive. LOCKSS was built to archive electronic journals but has been enhanced to also archive blogs on Google’s Blogger.

Looking Ahead. Lee J. Nelson. Advanced Imaging Magazine. November 9, 2007.
The article looks at some of the industry trends. Included is an announcement on an HD Photo Plug-in for Adobe Photoshop. “HD Photo is geared for end-to-end digital photography, offering better image quality, greater preservation of data and advanced features. Its still image codec for continuous-tone images is underpinned by lossy and lossless compression, multiple colorspaces, wide dynamic range and extensive metadata.”

Government Pledges £25m To Preserve UK's Film Archives. 24 Hour Museum. October 17, 2007.
The British government has taken steps to preserve the country’s film archives. It has given money to the UK Film Council to secure the films in the archives. “It’s absolutely right that they should be safe and accessible for future generations.” The £25 million plus £3 million are to be used to preserve, restore and increase access to the collections, some of which are deteriorating and in danger of being lost.

The Library and Xerox are studying the potential of using the JPEG 2000 format in large repositories of digital materials. The project is designed to help develop guidelines and best practices for digital content. The trial will include up to 1 million TIFF images to be converted to JPEG 2000. Xerox will build and test the system, with a specific aim of creating profiles for the objects. Xerox has already created a profile for using the JPEG 2000 format for newspapers.