Friday, June 30, 2006

Review of the OAIS Reference Model

Announcement of a Comment Period for the Five Year Review of the Reference Model for an Open Archival Information System (OAIS) Standard

OAIS Overview

The “Reference Model for an Open Archival Information System (OAIS)” was developed for use in facilitating a broad, discipline independent, consensus on the requirements for an archive or repository to provide permanent, or indefinite long-term, preservation of digital information. It was also intended to support the development of additional digital preservation standards and to encourage digital preservation support by vendors.

An OAIS is an archive consisting of an organization of people and systems that has accepted the responsibility to preserve information and make it available to a designated community. The standard defines a set of responsibilities that an OAIS archive must fulfill and this allows an OAIS archive to be distinguished from other uses of the term ‘archive’.

Since its adoption as both Consultative Committee for Space Data Systems (CCSDS) and ISO standards, the OAIS Reference Model has been welcomed and widely adopted by virtually all types of digital preservation communities. Most modern digital archiving initiatives reference the OAIS Reference Model standard. It has also been widely used by organizations to inform their implementations of new or upgraded archiving systems.

Five Year Review

In compliance with ISO and CCSDS procedures, a standard must be reviewed every five years and a determination made to reaffirm, modify, or withdraw the existing standard. The “Reference Model for an Open Archival Information System (OAIS)” standard was approved as CCSDS 650.0-B-1 in January 2002 and was approved as ISO standard 14721 in 2003. While the standard can be reaffirmed given its wide usage, it may also be appropriate to begin a revision process. Our view is that any revision must remain backward compatible with regard to major terminology and concepts. Further, we do not plan to expand the general level of detail. A particular interest is to reduce ambiguities and to fill in any missing or weak concepts. To this end, a comment period has been established.

Comment Process

We are soliciting recommendations for updates that will reduce ambiguities or improve missing or weak concepts. We also want to know if you prefer that no changes be made. Please categorize your comments for changes under one of the following:

• Updates needed for clarification
• Updates to add missing concepts or strengthen weak concepts
• Identification of any outdated material

Please be as specific as possible with your suggestions. For this consideration, comments must be received by 30 October, 2006.

Comments may be submitted to:

Should the decision be taken to update the OAIS Reference Model, there should be an opportunity to participate in the process. Please also express your interest in participating and, if known, an indication of your expected level of effort.

Weekly Readings - 30 June 2006

The National Archives and the San Diego Supercomputer Center Sign Landmark Agreement to Preserve Critical Data. Press release. U.S. Newswire. June 28, 2006.

The National Archives and Records Administration (NARA) and the San Diego Supercomputer Center (SDSC), with the National Science Foundation, signed an agreement that provides a way to preserve valuable digital data collections. “Preserving our most valuable digital assets is critical for leadership and competitiveness in research and education.” The agreement will allow SDSC to “expand and formalize its role as a national data repository, and provide a venue for the preservation of valued digital collections from federally sponsored research.”

EMC Plans to Bring Out Data Classification System. Bradley Mitchell. Computerworld. June 26, 2006.

EMC announced plans to ship data classification software later this year that will allow storage administrators to set up policies that automatically categorize data by its importance and store it using a hierarchical storage management system. Critical data will be more accessible and on faster devices; less frequently used data will be stored on slower and less expensive devices. The system will initially work with unstructured information, but will eventually support databases as well, a company executive said last week.

PowerFile Introduces Permanent Storage Appliance. Computer Technology Review. June 19, 2006.

PowerFile, Inc., which develops archive appliances for permanent storage of digital content and assets, has introduced the Permanent Storage Appliance. The appliance is a network-attached storage system that uses a patented, highly scalable DVD-based subsystem with the capability to store files online for many years.

Toshiba to Launch World's First HD DVD Recorder. Kiyoshi Takenaka. eWeek. June 22, 2006.

Toshiba announced that in July it will launch its new high-definition optical disc recorder based on the HD DVD format. It will come with a 1TB hard drive. It will not be available outside of Japan at this time.

Sun, Microsoft answer Mass. call for ODF/Office converter. Eric Lai. Computerworld. June 29, 2006.

Several companies, including Sun and Microsoft, have responded to a call for software plug-ins that would allow Microsoft Office users to read and write files in the OpenDocument format (ODF). ODF is a free, XML-based international standard file format based on the open-source OpenOffice suite. Microsoft is supplying technical documents and intellectual property rights to third-party developers working on such a plug-in.

Wednesday, June 28, 2006

Weekly Readings - 23 June 2006

Foundations for a Successful Digital Preservation Program: Discussions from Digital Preservation in State Government: Best Practices Exchange 2006. Christy E. Allen. RLG DigiNews. June 15, 2006.

The State Library of North Carolina hosted Digital Preservation in State Government: Best Practices Exchange 2006. The sessions focused on nine aspects of digital preservation: identification, selection, and appraisal of digital assets; repository systems; collection of digital assets; authentication; resources and workflows; access; metadata; preservation; and organization.
A successful digital preservation program requires a strong foundation which allows it to remain stable in a highly dynamic environment. Four elements are essential for building a strong digital preservation program within any institutional framework:
  1. support and buy-in from stakeholders;
  2. “good enough” practices implemented now;
  3. collaborations and partnerships; and
  4. documentation for policies, procedures, and standards.
Education is important. Managerial commitment and IT support are critical. If you think digital preservation is expensive, look at the cost of not doing it. Good practices are important. We must work with the tools available to manage and preserve digital information as best we can. “Good enough” does not mean sub-standard; it means using the best solutions available today, knowing they aren’t perfect, but that they will evolve. Solutions must be flexible, modular, and interoperable. Digital preservation programs are most effective in partnerships. Policies should document the value of digital preservation within the institution and be included in the institution’s goals and objectives. Research all options before establishing standards, then document practices clearly; distribute the standards to partners.

Toshiba unveils HD-DVD recorder with 1 TB hard drive space. Humphrey Cheung. TG Daily. June 22, 2006.

Toshiba announced the RDD-A1 HD DVD recorder with one terabyte of hard drive space. Up to 130 hours of high-definition video can be stored and up to 230 minutes of that content can be burned to a single HD DVD disc. It will start selling in Japan on 14 July for around $3500.

Google announces U.S. government search site. China Martens. Computerworld. June 15, 2006.

Google has unveiled a search site to make it easier to locate U.S. government information. The site allows users to search content contained on U.S. federal, state and local government websites, particularly those with the .gov and .mil domains, in addition to some government sites with the .com, .us and .edu domains. This may be expanded to other countries.

MRI systems: from life saver to disk killer. Robert L. Mitchell. Computerworld. June 21, 2006.

Georgia Tech has created a system that uses a magnetic resonance imaging (MRI) magnetic field to remove every trace of data from a disk. The prototype was developed for military use, but they hope to create a unit that companies can use to remove sensitive data. In most cases, current disk erasure tools work fine if used properly.

Thursday, June 22, 2006

Weekly Readings - 16 June 2006

National Archives Sidesteps Obsolescence. Image & Data Manager. June 7, 2006.
The Australian National Archives has completed its digital document management tools. One of the applications, called Xena, converts digital documents into two XML-based formats. This is an attempt to make sure that documents can be read in the future. The two conversion formats are:
  • Bitstream: a metadata-wrapped bitstream version of the record, which is a secure original copy of the record. It contains the original information, but requires access to the original hardware, operating system and application software for performance.
  • Normalized version: the converted record in XML. This is not considered to be an original copy of the record as some information may be lost in the conversion.
Other applications include Quest, which creates and maintains links to other objects in the repository and their metadata; and Digital Preservation Recorder, which captures an audit trail of the digital object through the processes of:
  1. Transfer: Collection of information about the context of the data object, including the objects that were sent to the archives in the same transfer.
  2. Quarantine: Collection of information about the media, which virus definitions were used to check it, and its checksum.
  3. Preservation: Recording conversion details of the bitstream and normalized archival data formats.
  4. Repository: Continued integrity checks and information about access requests for records held in the repository.
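
The checksum and integrity-check steps above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the Digital Preservation Recorder's actual design: the class AuditTrail, the functions quarantine and integrity_check, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class AuditTrail:
    """Hypothetical audit trail for one digital object (illustrative only)."""
    object_id: str
    events: list = field(default_factory=list)

    def record(self, stage: str, detail: str) -> None:
        # Each event keeps a UTC timestamp, the stage name, and a free-text detail.
        stamp = datetime.now(timezone.utc).isoformat()
        self.events.append((stamp, stage, detail))


def quarantine(trail: AuditTrail, data: bytes) -> str:
    """Record the checksum taken while the object is in quarantine."""
    checksum = sha256_of(data)
    trail.record("quarantine", f"sha256={checksum}")
    return checksum


def integrity_check(trail: AuditTrail, data: bytes, expected: str) -> bool:
    """Recompute the checksum in the repository and log whether it still matches."""
    ok = sha256_of(data) == expected
    trail.record("repository", "integrity ok" if ok else "integrity FAILED")
    return ok
```

A repository would call quarantine once when the object arrives and integrity_check on a schedule afterwards; any mismatch means the stored bits have changed since quarantine.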

Preservation's Crumbling Future. Maria Blackburn. The Johns Hopkins Magazine. June 2006.

A large fallacy is that digital will take the place of books, but the number of books published every year is increasing. Paper books will always exist. And as more look to digital materials, traditional paper preservation efforts are being overlooked. "People misunderstood the limits of digital as a form of preservation." Many assume that digital is a way to preserve books, instead of a way to reformat content for access. "All preservation has to be balanced in terms of use." Efforts should be targeted at the materials that are of the most value to the users.

Digital cinematography. Scott Kirsner. Hollywood Reporter. June 13, 2006.

Producers, directors and cinematographers have reservations about digital cinematography, including the question of the long-term preservation of digital files. "With a negative, you have the security of something that is stored on film. What we don't know -- and won't know for some time -- is how well the digital masters survive."

Why keep downloads on the down-low? Victor Keegan. Guardian Unlimited. June 13, 2006.
Digital rights management (DRM) is a problem that threatens to tie up valuable assets and make them inaccessible to the public. The British Library is “deeply worried about the way restrictive digital rights contracts are being imposed”. There are materials in the library that should be made public, “yet less than one percent of its priceless archive has been digitised because of potential conflicts about digital rights and preservation.” The business models that have been used in the past, particularly with music, no longer make sense in the digital age.

PKI Implementation at the University of Wisconsin–Madison. Nicholas Davis. EDUCAUSE Live! June 1, 2006. [PowerPoint] Audio portion.

An audio discussion of how the University of Wisconsin–Madison researched various PKI system models and established a method to issue digital certificates to encrypt and digitally sign e-mail and other sensitive information, and authenticate online identities. It also discusses topics such as whether to build or buy a PKI system, integrating with other systems, success factors and lessons learned. Users like the ease of the system, and the ability to digitally sign emails. The system should be scalable. It is important to listen to the customers and find out what they want, rather than tell them what they should want. Many want to build their own PKI, but vendors, such as GeoTrust, can provide the same benefits. Administrators want to know the extended costs. Using a vendor can give lower upfront costs and a faster implementation. If the costs increase, they may choose to bring the implementation in house. There is enough to do that we don’t have to reinvent the wheel; don’t duplicate efforts, do new things. Keep it simple for users. Motivate users rather than obligate them. What matters most is what you do with PKI after a certificate is issued.

Friday, June 09, 2006

Weekly Readings - 9 June 2006

Networking for Digital Preservation - Current Practice in 15 National Libraries. Ingeborg Verheul. IFLA. 2006.

This is a 269-page book that describes the state of digital preservation in national libraries. Two national libraries have operational digital repositories and several others are developing them. It looks at planning digital preservation activities and issues. It starts with some standard definitions: Digital preservation is the general term for maintenance and care of digital or electronic objects. Long-term is 5 years or more. Digital archiving is the process of backing up and maintaining digital objects with the needed software and hardware. Permanent access indicates that preservation is only half the problem, and is one of the greatest challenges. Preservation strategies, like migration or emulation, are methods or techniques to keep the objects permanently accessible. Born digital means objects which are not intended to have an analogue equivalent.

Digital preservation will have to be a part of the library’s normal workflow activities in the future, and usually involves a number of people. Digital preservation always implies cooperative activity within the library, and IT always has a contribution, usually technical responsibility for the repository. The repository is built to retain digital objects in perpetuity, in a structured, scalable environment. This may be done in a phased approach over time. There are both archiving and access services, and workflows must be as automated as possible. DSpace and Fedora are among the most standard systems, but there is a hesitancy about choosing one system over another. The general solution is not being sought in one single system; most expect to use a combination of systems. The most valuable aspect of the OAIS system is that it provides a shared vocabulary. The libraries currently accept all file formats, but the most common are: TIFF, PDF, XML, HTML, and WAV.

The complexity and volume of digital objects are growing, so there needs to be an emphasis on developing selection methods. Every stored object must have structured metadata, which must contain details on format, structure, and use of the content; history of actions performed; authenticity information and custody history; and rights information. Depositors should submit metadata with the object. Access is often provided through the library catalogue, which is usually separate but linked. Strategies include migration, emulation, and bit level preservation. Digital preservation activities usually start as a project, and as that is finished, all departments get involved. New working structures are being set up to make it as smooth as possible. The distinction between library material and archiving material is fading. Currently the only accepted standard is OAIS. There are also overviews of each national library and the organizational charts.
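
The metadata categories listed above can be pictured as a simple record that is checked when a depositor submits an object. The sketch below is illustrative only; the class name PreservationMetadata and its field names are assumptions for the example and do not come from the book or from any metadata standard.

```python
from dataclasses import dataclass, field


@dataclass
class PreservationMetadata:
    """Hypothetical metadata record; field names are illustrative assumptions."""
    file_format: str       # e.g. "TIFF" or "PDF"
    structure: str         # how the content is organized
    intended_use: str      # how the content is meant to be used
    authenticity: str      # authenticity information
    rights: str            # rights information
    custody_history: list = field(default_factory=list)  # who has held the object
    actions: list = field(default_factory=list)           # history of actions performed

    def missing_fields(self) -> list:
        """Return the names of required text fields that were left empty."""
        required = ("file_format", "structure", "intended_use",
                    "authenticity", "rights")
        return [name for name in required if not getattr(self, name)]


# Example deposit with the rights statement left blank.
record = PreservationMetadata(
    file_format="TIFF",
    structure="single image",
    intended_use="access copy",
    authenticity="checksum supplied by depositor",
    rights="",
)
record.actions.append("received from depositor")
print(record.missing_fields())
```

A check like missing_fields gives the repository a simple way to ask depositors for whatever they left out before the object is accepted.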

Microsoft rings last bell for Windows 98, ME. Jeremy Kirk. Computerworld. June 09, 2006.

Support and security updates for Windows 98, Windows 98 Second Edition and Windows Millennium Edition will end on July 11. Microsoft warned that customers face security risks if they use these after it ends support for them next month. Support for Windows XP Service Pack 1 will end on Oct. 10. More information on Microsoft Support Lifecycle is available at:

Taming the Digital Beast. Andy Patrizio. Campus Technology. June 6, 2006.

Schools are moving to digital media as a means of archiving and accessing their information and putting it in repositories. In order to get faculty and students to participate, you need to convince them that putting collections online is part of scholarship and it is of value to others. Getting faculty to allow their publications in an institutional repository is not easy, because publishers have them convinced not to do it. So you must build trust. Repositories are to share knowledge that others may not be aware of. Most institutions are using only public domain or non-copyrighted information. Proper planning and the use of metadata will make the repository easily searchable as it grows. Librarians are information experts, but they’re not database administrators, so it is important to have technology experts as well. “A repository should have the capability to grow constantly, with only one maintenance concern: ‘We’re running out of storage.’” “If you’re not giving it that organization and metadata capability, it’s just a pile of junk.”

DSpace is in use at 138 universities and institutions worldwide, including at Rice. It is becoming a more mature platform. It is also designed with digital media in mind and structured for an academic environment. It does need programmers to support it, which can be costly. Access and submission can be controlled in a number of ways and levels. The choices of software are not as important as the data formats stored. Use open formats and lots of metadata. Acquisition decisions can be a big task. Some institutions use lifecycle (offline) storage, but their faculty want the data easily available.

Name That Tune. Campus Technology. June 9, 2006.

UC Berkeley has joined a growing group of schools who distribute video and audio recordings of course lectures and other content through Apple’s iTunes Music Store. “Berkeley on iTunes U” is open to the general public, at

DRM causing difficulties for libraries. Amber Maitland. Pocket-lint. 07 June 2006.

The British Library's Chief Executive, Lynne Brindley, warns that digital rights management (DRM) systems are creating unintended consequences that affect how digital material can be stored and disseminated by libraries, which have traditionally been protected by special exceptions under IP law. The digital material usually has a contract that is almost always more restrictive than existing copyright law. It frequently prevents copying, archiving, and access by the visually impaired. Of a small sample of 30 licenses offered to the library, only two allowed privileges available in fair use materials. Only two allowed archiving of the material. None of the licenses permitted copying of the whole work for the visually impaired. If this isn’t resolved, it could affect institutions that have traditionally held archival copies of material. As digital archiving methods become obsolete, DRM could prevent the library from transferring material to another preservation medium. The library also recommends that IP law clarify that fair use applies to digital as well as print items.

Friday, June 02, 2006

Weekly Readings - 2 June 2006

Scholarly Publishing Practice: Academic journal publishers’ policies and practices in online publishing. John Cox, Laura Cox. Press Release. 2 June 2006.
This online study has a number of interesting findings, including:
- More journals are now available online: 90% of journals, compared with 75% in 2003.
- The number of journals continues to grow. 174 publishers have launched 1,048 new journal titles from 2000 to 2005, while discontinuing 185 titles.
- The availability of back issues online has increased to 91% in 2005.
- Access to journal back volumes is becoming an integral part of the online product; 63% of publishers provide active subscribers with access at no extra cost.
- About a fifth of publishers are experimenting with open access journals.
- All categories of publishers are now extending usage rights to be ‘library friendly’.

Library Newspaper Cooperative. Managing Information. 30 May 2006.
NewsArchivePlus announced an initiative to provide online access to every historic (18th, 19th & early 20th Century UK national & regional) newspaper and library archive in the UK. They are asking members to provide digitized content that can be made available to all. Organizations can scan, archive and retrieve their material online which will increase the content available to them.

Microsoft shows off JPEG rival. Joris Evers. CNET News. May 24, 2006.
Microsoft intends to replace the JPEG image format with Windows Media Photo. The new image format will be made available with Vista, and will also be available for XP. Managing "digital memories" is one of the key features of Vista. The format should offer better pictures in half the size of JPEG images. The compression technology will make it possible to take part of a large image to display a smaller version. Images can be rotated without decoding and encoding. There is a different approach to color space and compression. Success will depend on how many use it, and Microsoft has not addressed the licensing issues.

Samsung Readies Hybrid Hard Drive. Martyn Williams. PC World. May 18, 2006.
Samsung plans to sell a hybrid hard drive which includes flash memory storage as well as a regular hard disk. The flash memory acts as a storage buffer to hold data until it is written to disk, which should extend disk life and increase performance.

Why OpenURL? Ann Apps and Ross MacIntyre. D-Lib Magazine. May 2006.
This article looks at the evolution of linking technologies, especially OpenURL, which is now a NISO standard. It looks at how OpenURL works, who will benefit from it, its current status and what is missing, and ways to use it. It has become a part of electronic publishing services provided by both libraries and publishers. It can provide access to more than just the full text article.
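
To give a rough idea of how an OpenURL works, here is a minimal Python sketch that builds a link for a journal article using the key/value form of the NISO standard (Z39.88-2004). It is not taken from the article. The resolver address and the function name journal_article_openurl are hypothetical, and the bibliographic values in the usage example are made up.

```python
from urllib.parse import urlencode

# Hypothetical link resolver address; a real one is run by a library or publisher.
RESOLVER = "https://resolver.example.org/openurl"


def journal_article_openurl(atitle, jtitle, issn, volume, spage, date):
    """Build an OpenURL that describes a journal article as key/value pairs."""
    params = [
        ("url_ver", "Z39.88-2004"),                        # version of the standard
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),   # journal metadata format
        ("rft.genre", "article"),
        ("rft.atitle", atitle),   # article title
        ("rft.jtitle", jtitle),   # journal title
        ("rft.issn", issn),
        ("rft.volume", volume),
        ("rft.spage", spage),     # start page
        ("rft.date", date),
    ]
    # urlencode percent-encodes the values so they are safe inside a URL.
    return RESOLVER + "?" + urlencode(params)


# Made-up values, for illustration only.
link = journal_article_openurl(
    "Example Article", "Example Journal", "1234-5679", "12", "1", "2006"
)
print(link)
```

The link carries a description of the article rather than its location, so each institution's resolver can decide where to send the reader, whether that is the full text, a catalogue record, or another service.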