Monday, March 30, 2015

Digital Preservation Challenges with an ETD Collection: A Case Study at Texas Tech University

Digital Preservation Challenges with an ETD Collection: A Case Study at Texas Tech University. Joy M. Perrin, Heidi M. Winkler, Le Yang. The Journal of Academic Librarianship. January, 2015.
The potential risk of loss seems distant and theoretical until it actually happens. The "potential impact of that loss increases exponentially" for a university when the loss is part of the research output. This excellent article presents a case study of the challenges one university library encountered with its electronic theses and dissertations (ETDs). Many institutions have been moving from publishing paper theses and dissertations to accepting electronic copies. One challenge that has not received as much attention is preserving these electronic documents for the long term. Electronic documents require more hands-on curation than paper ones.

Texas Tech University encountered difficulties with preserving its ETD collection. The authors hope the lessons learned from these data losses will help other organizations looking to preserve ETDs and other types of digital files and collections. Some of the losses were:
  1. Loss of metadata edits. A corrupted database and corrupted IT backups forced a rebuild of the database, and the metadata edits that had been entered were lost.
  2. Loss of administrative metadata: embargo periods. The ETD-db files imported into DSpace did not include the embargoed files. Plans were not documented, and personnel changed before the problem was discovered. Some items were found accidentally on a personal drive years later.
  3. Loss of scanned files. The scanning server was also used to store files after scanning. Human error beyond the backup window resulted in the deletion of over a thousand scanned ETDs, which were eventually recovered.
  4. Failure of policies: loss of embargo status changes. The embargo statement recorded in the ETD management system did not match what was published in DSpace.
In response, the library began real digital preservation work for the ETD collection. Funds were set aside to increase the storage of the archive space and to provide a second copy of the archived files. A digital resources unit was created to handle the digital files, which finally brought the entire digital workflow, from scanning to preservation, under one supervisor. The library joined the Digital Preservation Network (DPN) in hopes that it would yield a level of preservation far beyond what the university could accomplish alone. The clean-up of the problems has been difficult and will take years to accomplish. Lessons learned:
  1. Systems designed for managing or publishing documents are not preservation solutions.
  2. System backups are not reliable enough to act as a preservation copy. Institutions must make digital preservation plans beyond backups.
  3. Organizations with valuable digital assets should invest in storing their items outside of the display system, rather than keeping them only within it.
  4. Multiple copies of digital items must reside on different servers in order to guarantee that files will not be accidentally deleted or lost through technical difficulties. 
  5. All metadata, including administrative data, should be preserved outside of the display system. The metadata is a crucial part of the digital item.
  6. Digital items are collections of files and metadata.
  7. Maintaining written procedures and documentation for all aspects of digital collections is vital.
  8. The success of digital preservation will require collaboration between curators and the IT people who maintain the software and hardware, as well as consistent terminology (e.g., a shared understanding of what "archived" means).
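
Lessons 4 and 5 above can be made concrete with a small fixity check. The following is only a minimal sketch, not anything described in the article: it assumes each digital item's files and metadata live together in a directory, and that a second copy sits on another server mounted at a different path. The function names (`sha256_of`, `build_manifest`, `compare_copies`) are hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(primary: Path, replica: Path) -> dict[str, list[str]]:
    """Report files that are missing from, or differ in, the replica."""
    expected = build_manifest(primary)
    actual = build_manifest(replica)
    return {
        "missing": sorted(set(expected) - set(actual)),
        "changed": sorted(
            name
            for name in expected.keys() & actual.keys()
            if expected[name] != actual[name]
        ),
    }
```

Run on a schedule, a comparison like this might surface a deleted scan or an altered metadata file while a good copy still exists, rather than years later. Because metadata files are checksummed alongside the documents, a changed embargo record would show up as a difference too.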
 "Even though this case study has primarily been a description of local issues, the grander lessons gleaned from these crises are not specific to this institution. Librarians are learning and re-learning every day that digital collections cannot be managed in the same fashion as their physical counterparts. These digital collections require more active care over the course of their lifecycles and may require assistance from those outside the traditional library sphere...."
