Thursday, December 08, 2011

Why don't we already have an Integrated Framework for the Publication and Preservation of all Data Products?

Why don't we already have an Integrated Framework for the Publication and Preservation of all Data Products?  Alberto Accomazzi, et al. Astronomical Data Analysis Software and Systems. 7 Dec 2011.
Astronomy has long had a working network of archives supporting the curation of publications and data. Many websites give access to data sets, but they are sometimes short-lived.  "We can only realistically take implicit promises of long-term data archival as what they are: well-intentioned plans which are contingent on a number of factors, some of which are out of our control." We should take steps to ensure that our system of archiving, sharing and linking resources is as resilient as it can be.  Some ideas are:
  1. future-proof the naming system: assign persistent data IDs to items we want to preserve 
  2. provide the ability to cite complete datasets, just as we can cite websites
  3. include a data reference section in academic papers
Curated datasets need to be preserved indefinitely for scholarly purposes.

A literature review: What exactly should we preserve? How scholars address this question and where is the gap

A literature review: What exactly should we preserve? How scholars address this question and where is the gap.  Jyue Tyan Low. Cornell University Library. 7 Dec 2011.
There are generally two approaches to long-term preservation of digital materials:
  1. preserving the object in its original form as much as possible along with the accompanying systems,
  2. migration or transformation: transforming the object to make it compatible with more current systems but retaining the original “look and feel.”
Migration is the most widely used method, but it can change the original.  If some of the original properties are lost, which properties are essential to maintaining the object's integrity?  Currently there is no formal and objective way to help stakeholders decide what the significant properties of the objects are. Significant properties are defined as:
The characteristics of digital objects that must be preserved over time in
order to ensure the continued accessibility, usability, and meaning of the
objects, and their capacity to be accepted as evidence of what they purport
to record.
An important goal of digital preservation is more than just retrieving the objects; it is to ensure the authenticity of the information.  A digital object can change as long as the final output is what it is expected to be.  The properties to preserve come from the purpose of the object, and at least one purpose for the object needs to be defined. Archivists have created standards that look at records in the context of their creation, intended use and preservation.  It is important to ask which features of the object are important when delivering it to the user.  There may be many uses for many communities that were not intended by the object creator, so we should not let the ideal limit the reasonable.