The article looks at a number of issues with digital preservation that the author feels are fallacies. They are:
1. Digital preservation is very expensive [because]
2. File formats become obsolete very rapidly [which means that]
3. Interventions must occur frequently, ensuring that continuing costs remain high.
4. Digital preservation repositories should have very long timescale aspirations,
5. 'Internet-age' expectations require that the preserved object be easily and instantly accessible in a usable format, and
6. the preserved object must be faithful to the original in all respects.
All preservation, including paper and book preservation, is expensive. Taken as a whole, digital preservation may actually cost less than preserving paper and books. While consumer formats may go out of fashion, very few formats ever become completely obsolete, although recovery of information from old files can be incomplete. Mass access to the internet has stabilized the formats. Part of the key to this is sharing the information. Obsolescence may be more of a problem over extended time frames.
“Investment in digital preservation is important for cultural, scientific, government and commercial bodies. Investments are justified by balancing cost against risk; they are about taking bets on the future. The priorities in those bets should be: first, to make sure that important digital objects are retained with integrity, second to ensure that there is adequate metadata to know what these objects are, and how they must be accessed, and only third to undertake digital preservation interventions.”
It may not be necessary to think about digital preservation on a timescale of hundreds or thousands of years. What institutions have this timescale? It may be more useful to look at digital preservation as a series of events or a relay. Make your decisions on the timescale that you can see and that you have the funding for. Preserve your objects to the best of your ability and hand them on intact to your successor. The right approach may be to keep the original bits and then produce access copies as you can. The high cost of accessing the original may be best passed on to the user who asks for it.
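The first priority above, retaining the original bits with integrity, is often supported by recording checksums and re-checking them before each hand-over. The following is a minimal sketch of that idea, not a method described in the article; the choice of SHA-256 and the function names `sha256_of`, `build_manifest` and `changed_files` are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (as a relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are missing or whose bits have changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

A repository could store the manifest alongside its preservation metadata and run `changed_files` before passing the collection to its successor; an empty list suggests the stored bits are unchanged since the manifest was built.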
A restatement of the original issues would be:
1. Digital preservation is inexpensive compared to preservation in the print world,
2. File formats become obsolete rather more slowly than we thought,
3. Interventions can occur rather infrequently, keeping costs down,
4. Digital preservation repositories should adjust their timescale to meet their funding and business case, but should be prepared for their succession,
5. "Internet-age" expectations cannot be met by most digital repositories, and
6. Only access versions of the preserved object need be easily and instantly accessible, although the original file and good preservation metadata should be available.
The lack of money is the biggest obstacle to effective preservation. Poor decisions will reduce the amount of material that can be preserved. The right choice may be “fewer and better” or “cheaper and more”.
The DCC workshop goal was to provide insight about ensuring ongoing access to web sites over time. This is not just a matter of archiving, but also about how to design and manage a web site so that it is suitable for long-term preservation with minimum intervention. One presentation argued that the key to this is the three R's: Reduce, Replicate and Redirect. Reduce the items to make them easier to preserve, replicate them in multiple formats, and redirect links to the new locations. The approach is more ‘future-improving’ than ‘future-proofing’. There need to be selection criteria and guidelines to collect and preserve web sites as part of an organization’s wider preservation strategy. Standards should be applied preferably at the point of creation rather than at a later time. Persistent identifiers are important, but we should be looking at 15 to 20 years, not longer. Metadata should document the technical dependencies and tools; this is more useful than just descriptive metadata. The method of selecting web sites must also be documented.
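The "Redirect" step can be pictured as a lookup table from old addresses to current ones, consulted whenever a page has moved. The sketch below is illustrative only and was not presented at the workshop; the function name `resolve_redirect`, the `max_hops` limit and the example paths are hypothetical.

```python
def resolve_redirect(path: str, redirects: dict[str, str], max_hops: int = 10) -> str:
    """Follow a chain of old-to-new path mappings to the current location.

    Paths with no entry in the table are returned unchanged.
    Raises ValueError on a redirect loop or an overly long chain.
    """
    visited = [path]
    while path in redirects:
        path = redirects[path]
        if path in visited:
            raise ValueError("redirect loop: " + " -> ".join(visited + [path]))
        visited.append(path)
        if len(visited) > max_hops + 1:
            raise ValueError("too many redirects starting at " + visited[0])
    return path


# Hypothetical example: a report that has been moved twice.
redirects = {
    "/reports/2004.html": "/archive/reports/2004.html",
    "/archive/reports/2004.html": "/collections/reports/2004.html",
}
current = resolve_redirect("/reports/2004.html", redirects)
```

Keeping such a table under the site owner's control means old links keep working after each reorganization, which fits the 15 to 20 year horizon suggested above for persistent identifiers.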
Some record management principles require the documents to be saved but not necessarily the web site itself. An organization can therefore determine what needs to be saved, but it may not have to be the entire site. There should be a clear delineation of tasks and responsibilities. The National Library of Australia introduced PANDAS 3, a software tool for managing the process of gathering, archiving, and publishing web site resources. Authenticity is a key issue for web sites. Preservation management must include three key aspects: passive preservation; active preservation; and managing multiple manifestations. Permission should be obtained before archiving web sites. The main issues were:
· think about the records perspective;
· reduce, replicate and redirect;
· protect your domain;
· be archive-friendly;
· carry out 'not-bad practice';
· experiment; and
· identify unhelpful practice.
This is an updated version of a decision tree, which is a tool to construct or test such a policy for an organization. The questions and choices in the tree will assist with the decision to accept or reject long-term preservation responsibility. An effective policy must also be:
· Endorsed by senior management
· Actively circulated throughout the organization
· Reviewed regularly
· Accompanied by an appropriate resource commitment