Tuesday, January 19, 2010

Scholarly Journals Introduce New Data Archiving Policy

An important editorial about data archiving has just appeared online in the February issue of The American Naturalist.

To promote the preservation and fuller use of data, The American Naturalist, Evolution, the Journal of Evolutionary Biology, Molecular Ecology, Heredity, and other key journals in evolution and ecology will soon introduce a new data archiving policy to ensure that data supporting published articles is preserved and made publicly available. The policy has been enacted by the Executive Councils of the societies owning or sponsoring the journals.

For example, the policy of The American Naturalist will state:

This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.

This policy will be introduced approximately a year from now, after a period when authors are encouraged to voluntarily place their data in a public archive. Data that have an established standard repository, such as DNA sequences, should continue to be archived in the appropriate repository, such as GenBank. Data can also be archived in a more flexible, interdisciplinary digital data archive such as the National Science Foundation–sponsored Dryad repository, at http://datadryad.org/.

Dryad is developed by the US National Evolutionary Synthesis Center and the University of North Carolina Metadata Research Center, in collaboration with a consortium of partner journals.

The authors of the editorial, Michael C. Whitlock, Mark A. McPeek, Mark D. Rausher, Loren Rieseberg, and Allen J. Moore, present the case for the importance of data archiving in science. This is the first of several coordinated editorials soon to appear in major journals.

Friday, January 15, 2010

Digital Preservation Matters - January 2010

Insight into digital preservation of research output in Europe. Tom Kuipers, Jeffrey van der Hoeven. PARSE.Insight. January 14, 2010. [83p. PDF]

A report of surveys concerning digital preservation of research data and publications, data sharing, the roles and responsibilities of stakeholders in research, and the funding of research. Some of the findings:

  • The possibility of re-analysis of existing data is the most important driver for preserving research data.
  • The most important threat to preserving digital information is the lack of sustainable hardware, software, or support for the computing environment, which may make the information inaccessible.
  • Legal issues and fear of data being misused are the greatest barriers to sharing data.
  • Publishers believe the most important reason for preservation is that it will stimulate the advancement of science.
  • The majority of publishers fear that data will not be sustainable when the current owner ceases to exist.
  • Most publishers do not have an arrangement in place to preserve underlying research data. They believe the author is responsible for this.


Scholarly Publishing Roundtable Report and Recommendations. Scholarly Publishing Roundtable. January 12, 2010. [31p. PDF]

The Scholarly Publishing Roundtable was convened to examine the current state of scholarly publishing and develop recommendations to expand public access to the journal articles that came from research funded by US government agencies. They developed a set of principles to use going forward:

  1. Peer review is critical to maintain high quality and editorial integrity.
  2. Adaptable business models are necessary.
  3. Scholarly and scientific publications can and should be more broadly accessible with improved functionality to a wider audience.
  4. Sustained archiving and preservation are essential complements to reliable publishing methods.
  5. Research results need to be published and maintained in ways that maximize the possibilities for creative reuse and interoperation among sites that host them.

They provide a number of recommendations and conclude: “We urge publishers, librarians, universities, and scholars to consider these recommendations as creating an appropriate collaborative environment and putting an end to the previous decade of wrangling over access issues. All can then focus efforts on interoperability, reuse, and preservation with the argument that those features of the whole system strongly support public access; on broad, intelligent use of the products of federally funded research; and on future advances in support of both scholarship and public access to its results.”


Library of Congress Digital Preservation Newsletter. Library of Congress. January 2010.

The newsletter contains information on:

  • Memento: Time Travel for the Web. A project looking at adding a time-based dimension to searching and browsing.
  • Voice of America and the expanding role of digital materials at the Library of Congress.


40th Anniversary of Apollo 11 Moonwalk and the loss of data. Richard L. Hess. July 17, 2009.

  • Vigilant migration of data as new storage techniques become available is the only way to assure long-term preservation.
  • We MUST be selective as to what we keep in our archives because if we keep everything we won’t be able to afford it–or find it. This is one of the key jobs that archivists do. However, blindly following retention practices, as was done by NASA for the IRIG Apollo 11 tapes, needs to be tempered by historians as well.
  • In a generation (or less) if we save everything, it will become an overwhelming burden and the high points will be lost if they are not properly indexed.


How to Choose a Digital Preservation Strategy: Evaluating a Preservation Planning Procedure.

Stephan Strodl, et al. Vienna University of Technology. 2007.

A variety of tools support preservation strategies, the most prominent of which are migration and emulation. But different preservation requirements across institutions make it very difficult to decide which to implement. The PLANETS approach offers a standardized way of planning and evaluating preservation strategies. This paper describes the workflow for evaluating and selecting digital preservation solutions following these principles. Preservation planning has become a crucial decision process. It involves evaluating preservation strategies and tools, and choosing what is most appropriate. This is the “most difficult part in digital preservation endeavours”. The workflow includes:

  • Define the basis, the collection, types and numbers of records, legal issues, environment...
  • Choose the records that represent the variety of the collection.
  • Identify the requirements, characteristics, costs, measurements, etc.
  • Define and describe alternatives, different solutions, tools, etc.
  • Perform the experiment, which includes multiple stages.