Wednesday, April 13, 2016

Beyond the Binary: Pre-Ingest Preservation of Metadata

Beyond the Binary: Pre-Ingest Preservation of Metadata. Jessica Moran, Jay Gattuso. iPres 2015. Nov. 2015.
     This paper describes some of the challenges the National Library of New Zealand has faced in maintaining the authenticity of born digital collections (objects and metadata) from the time they are first received until they are ingested into the Library's Rosetta digital preservation system. Two specific challenges relate to the contextual metadata of filenames and file dates.

"The digital preservation analyst is responsible for technical assessment of digital content going into the digital preservation system, and troubleshooting digital content that fails validation checks". The digital archivists serve as archival and content subject matter experts; the digital preservation analyst is the subject matter expert for technical concerns. The two perspectives allow for robust workflows that better preserve the content. The authors are especially interested in "file system metadata such as filenames and dates that are not embedded with the objects themselves, but rather are stored externally in the file system". Filename and date metadata have "challenged us to think critically about what constitutes acceptable, reversible, and recordable change and where and how this metadata should be stored for preservation and later for delivery to users".

Proper handling rules mean that, for digital preservation, files need to be treated slightly more sensitively than in everyday use. For example, we might want to know what the original file extension was, as it is an important part of a file’s provenance.

Most born digital objects the Library receives carry three dates: created date, last modified date, and last accessed date. These dates can be used to confirm that an object is what it says it is. The Library has a practice of "touching the original file as little as possible and only as much as needed to get the file into the preservation environment".
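
As an illustration of the file system metadata discussed above, here is a minimal Python sketch that reads a file's name, extension, and dates without opening the file itself. It is not taken from the paper; the function name read_file_dates is hypothetical. One caveat matters for preservation work: a true creation date is only exposed on some platforms (as st_birthtime), and on Unix-like systems st_ctime is the time of the last metadata change, not the creation time, so the sketch reports the created date as missing when the platform does not provide it.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def read_file_dates(path):
    """Return filename, extension, and file system dates for one file.

    Hypothetical helper for illustration. Uses os.stat only, so the file
    content is never opened and the access date is not disturbed by a read.
    """
    st = os.stat(path)

    def iso(timestamp):
        # Express every date in UTC so the record is unambiguous.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    # st_birthtime exists only on some platforms; st_ctime on Unix is the
    # last metadata change, so it is deliberately not used as "created".
    birth = getattr(st, "st_birthtime", None)

    return {
        "filename": Path(path).name,
        "extension": Path(path).suffix,
        "created": iso(birth) if birth is not None else None,
        "modified": iso(st.st_mtime),
        "accessed": iso(st.st_atime),
    }
```

Recording these values before any copy or conversion step keeps a reference point against which later changes can be checked.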

One solution is the creation of forensic disk images as a first step in the transfer process. Another solution would be to create "a tool to help us automate the original and any subsequent transfers of born digital content, ensure the capture of original filename and date metadata and any preconditioning actions we performed, and at the same time create a log of that activity that is auditable and both human and machine readable."  They have been developing a script to accomplish what they need.
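
The kind of tool described in the quotation could be sketched roughly as follows. This is not the authors' script, which the summary does not reproduce; it is a minimal Python illustration under stated assumptions: the function names (sha256_of, transfer_with_log), the log field names, and the choice of SHA-256 and a JSON Lines log are all hypothetical. The sketch records each file's original relative path and dates before copying, copies with shutil.copy2 (which attempts to preserve the modification date), verifies the copy by checksum, and appends one log record per file that is readable by both people and programs.

```python
import hashlib
import json
import shutil
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Compute a SHA-256 checksum by reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_utc(timestamp):
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def transfer_with_log(src_dir, dest_dir, log_path):
    """Copy every file under src_dir to dest_dir and log what was done.

    Hypothetical sketch: log_path is assumed to lie outside src_dir.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    entries = []
    with open(log_path, "a", encoding="utf-8") as log:
        for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
            st = src.stat()  # capture dates before the file is read
            rel = src.relative_to(src_dir)
            dest = dest_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            checksum = sha256_of(src)
            shutil.copy2(src, dest)  # copies content plus file metadata
            entry = {
                "original_path": rel.as_posix(),
                "original_modified": iso_utc(st.st_mtime),
                "original_accessed": iso_utc(st.st_atime),
                "sha256": checksum,
                "copy_verified": sha256_of(dest) == checksum,
                "logged_at": datetime.now(timezone.utc).isoformat(),
            }
            # One JSON object per line: easy to audit by eye or by script.
            log.write(json.dumps(entry, ensure_ascii=False) + "\n")
            entries.append(entry)
    return entries
```

Any preconditioning action, such as renaming a file with problematic characters, could be recorded as an additional field in the same log entry so that the change remains reversible.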

Their ongoing questions concern whether the delivery of objects from the digital preservation system should include proof of the integrity and authenticity of the binary object through delivery of the associated metadata.
