Monday, June 27, 2016

A Digital Dark Now? : Digital Information Loss at Three Archives in Sweden

A Digital Dark Now? : Digital Information Loss at Three Archives in Sweden. Anna-Maria Underhill and Arrick Underhill. Master’s thesis. Lund University. 2016. [PDF]
     The purpose of this study is to examine the loss of digital information at three Swedish archives. Digital preservation is a complex issue that most archival institutions struggle with, and focusing on successes to the exclusion of failures risks creating a blind spot for existing problems. The study defines digital information as digital objects and their metadata, and it includes the digital internal work documents that provide contextual support for an archive’s collections. The results are analyzed using four frameworks: the transition between the Records Lifecycle Model and the Records Continuum Model, an ontological understanding of digital information, the SPOT model for risk assessment, and the OAIS Reference Model.

Some of the conclusions reaffirm previous research, such as the need to prioritize organizational issues. Others look at the current state of digital preservation at these archives, which includes the delicate balancing act "between setting up systems for successful future digital preservation while managing existing digital collections which may not have been preserved correctly". Some institutions are unable to undertake a more proactive form of digital preservation because of the nature of the materials they preserve. The study points out that "when discussing digital preservation, the tendency remains to think of digitized material first rather than born digital information". The loss of a file may be only part of the loss: metadata and the connections between pieces of information can also be lost, and this may be more common than the loss of entire digital objects. "Finally, one question has followed this study from the beginning to the end: How can you know that you have lost something you never knew existed".
  • When discussing digital preservation, it is important to clarify that storage is not the same thing as preservation.
  • The survival of information is dependent upon the maintenance of its infrastructure and migrating it to contemporary formats.
  • Authenticity can be a major issue for digital records and is important to their evidentiality.
  • Emulation is another option for digital preservation, which targets the operating environment of the information rather than the file. 
  • Emulation will eventually require migration, and it can become too complicated to be viable in the long run.
  • Sometimes digital preservation fails to preserve what it intends to save, which can be termed information loss.
  • Obsolescence is currently one of the greatest threats to successful digital preservation. If a file cannot be read, the result is nearly the same as if the document had been destroyed.
  • "Without the provenance and the contextual links between records, records cannot be demonstrated to be authentic and reliable, evidentiality is lost and the use of the records for knowledge and understanding about what has happened will be difficult."

One definition of short, medium, and long-term preservation is:
  • Short-term preservation – solutions that are used for a short time, 5 years maximum.
  • Medium-term preservation – solutions that are used during a system’s lifetime, 10 years maximum.
  • Long-term preservation – solutions that are used after the originating system’s lifetime; the number of years varies, but is usually from 10 to 50 years.
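
The three categories above can be sketched as a small classification helper. This is only an illustration, not something taken from the thesis: the names `PreservationTerm` and `classify_term` are hypothetical, and the rule that any solution outliving its originating system counts as long-term (regardless of the number of years) is an assumption made here to resolve the overlap at the 10-year boundary.

```python
from enum import Enum


class PreservationTerm(Enum):
    """Hypothetical labels for the three preservation horizons listed above."""
    SHORT = "short-term"
    MEDIUM = "medium-term"
    LONG = "long-term"


def classify_term(years_in_use: float, outlives_originating_system: bool) -> PreservationTerm:
    """Classify a preservation solution by how long it is meant to be used.

    Assumption (not stated in the source): a solution that is used after the
    originating system's lifetime is treated as long-term, whatever the
    number of years.
    """
    if years_in_use < 0:
        raise ValueError("years_in_use must be non-negative")
    if outlives_originating_system:
        return PreservationTerm.LONG
    if years_in_use <= 5:
        # Used for a short time, 5 years maximum.
        return PreservationTerm.SHORT
    if years_in_use <= 10:
        # Used during the system's lifetime, 10 years maximum.
        return PreservationTerm.MEDIUM
    # More than 10 years falls outside the short and medium definitions,
    # so this sketch treats it as long-term as well.
    return PreservationTerm.LONG


if __name__ == "__main__":
    print(classify_term(3, False).value)
    print(classify_term(8, False).value)
    print(classify_term(25, True).value)
```

An explicit flag is used alongside the year count because the definitions are tied to the originating system's lifetime, not only to a number of years.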
"Dark archives are often used in order to separate the original master copies of a file from the copies that users actually access. These dark archives are generally only accessed when new material is being placed in them, and are otherwise protected in order to maintain the authenticity of the originals by placing them in an environment that is as tamper and error proof as possible".

Six essential properties of digital information that digital preservation must maintain:
  • Availability
  • Identity
  • Persistence
  • Renderability
  • Understandability
  • Authenticity
The study identified the following types of actual and potential information loss:
  • Loss of parts of digital objects, or of whole objects, during migration
  • Loss of the connections between analog and digital information belonging to the same archive
  • Loss of information due to it having been saved in an incorrect format
  • Loss of data in connection with technological changes
  • Loss of digital information when stored together with analog
  • Loss of information due to obsolete hardware
  • Loss of metadata due to databases written in code that is not open source
The reasons behind such actual and potential information loss were:
  • Human error during the production of information
  • An analog understanding and treatment of digital information
  • A lack of organizational structure and strategies for digital preservation
  • Lack of resources
  • Technological limitations
  • Lack of competencies among staff who produce digital information
