Friday, July 17, 2015

Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase One report - July 2015

Filling the Digital Preservation Gap. A Jisc Research Data Spring project. Phase One report - July 2015. Jenny Mitcham, et al. Jisc Report. 14 July 2015.
Research data is a valuable institutional asset and should be treated accordingly. This data is often unique and irreplaceable. It needs to be kept to validate or verify conclusions recorded in publications. Preservation of the data in a usable form may be required by research funders, publishers, or universities. The research data should be preserved and remain available for others to consult after the project that generated it is complete. This means the research data needs to be actively managed and curated. "Digital preservation is not just about implementing a good archival storage system or ‘preserving the bits’ it is about working within the framework set out by international standards (for example the Open Archival Information System) and taking steps to increase the chances of enabling meaningful re-use in the future."

Accessing research data is clearly already a problem for researchers when formats and media become obsolete. A 2013 survey showed that 25% of respondents had encountered the “Inability to read files in old software formats on old media or because of expired software licences”. A digital preservation program should address these issues. Archivematica, the system recommended in the report, is based on the Open Archival Information System (OAIS) model and uses standards such as PREMIS and METS to store metadata about the objects that are being preserved. A digital preservation system of this kind would consist of a variety of different systems performing different functions within the workflow. "Archivematica should not be seen as a magic bullet. It does not guarantee that data will be preserved in a re-usable state into the future. It can only be as good as digital preservation theory and practice is currently and digital preservation itself is not a fully solved problem."

Research data is particularly challenging from a preservation point of view because of the many data types and formats involved. Digital preservation tools and policies do not yet exist for many of these formats, so files in them will not receive as high a level of curation when ingested into Archivematica.
The rights metadata within Archivematica may not fit the granularity that would be required for research data. This information would need to be held elsewhere within the infrastructure.

The value of research data can be subjective and difficult to assess, and there may be disagreement on the value of the data. However, the bottom line is "in order to comply with funder mandates, publisher requirements and institutional policies, some data will need to be retained even if the researchers do not believe anyone will ever consult it." Knowing which file formats are in use is key to digital archiving and preservation planning; without that knowledge there will be problems later. In the OAIS Reference Model, information about file formats needs to be part of the ‘Representation Information’ that an end user must have to open and view a file.
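
As a rough illustration of what "knowing the types of formats used" can mean in practice, the sketch below tallies the files under a directory by file extension. It is not taken from the report: the function name `format_profile` is hypothetical, and counting extensions is only a first approximation, since an extension can be missing or misleading. Dedicated format identification tools inspect file contents rather than names.

```python
from collections import Counter
from pathlib import Path


def format_profile(root):
    """Count the files under `root` by lower-cased file extension.

    Hypothetical helper for illustration only. Extensions are a weak
    signal of the real format, so treat the result as a starting point.
    """
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under "(none)".
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
    return counts
```

Calling `format_profile("deposits").most_common(10)` (where `deposits` is a placeholder directory name) would list the ten most frequent extensions, which gives a quick view of which formats a preservation plan most needs to cover.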
