Friday, July 21, 2017

ePADD 4.0 Released

ePADD 4.0 Final. July 21, 2017.
    This is the latest release of ePADD, a software tool "developed by Stanford University's Special Collections & University Archives that supports archival processes around the appraisal, ingest, processing, discovery, and delivery of email archives."

The software comprises four modules:
  1. Appraisal: Allows users to gather and review email archives.
  2. Processing: Provides tools to arrange and describe email archives.
  3. Discovery: Provides tools to share a view of email archives with users through web discovery.
  4. Delivery: Enables repositories to provide access within a reading room environment.
System Requirements:
  • OS: Windows 7 SP1 / 10, Mac OS X 10.10 / 10.11 
  • Memory: 8 GB RAM (4 GB RAM allocated to the application by default) 
  • Browser: Chrome 50/51, Firefox 47/48 
  • Windows installations: Java Runtime Environment 64-bit, 8u101 or later required
ePADD Installation and User Guide
ePADD Github website

Saturday, July 15, 2017

Email preservation: How hard can it be?

Email preservation: How hard can it be? Edith Halvarsson. Digital Preservation at Oxford and Cambridge. 7 July, 2017.
      The post summarises highlights of the Digital Preservation Coalition’s briefing on email preservation. What is email? It is "an object, several things and a verb", a heavily linked and complex object, like the web. "Retention decisions must be made, not only about text content but also about email attachments and external web links. In addition, supporting features (such as instant messaging and calendars) are increasingly integrated into email services and potential candidates for capture."
Email is also a cultural and social practice; capturing relationships and structures of communication is an additional layer to preserve. 

What is being done, or can be done? Migration is the most common approach to email preservation. EML and Mbox (the latter a family of formats) are the most common migration targets, and they take different approaches to storing content. Others choose to unpack content, which provides a way to display emails and normalise the content within them. The emulation approach provides access to content within the original operating environment. In addition, ePADD, an open source tool, provides functions for processing and appraisal of Mbox files, among other features.
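
As an illustration of the migration approach, here is a minimal sketch of splitting one Mbox file into individual EML files using Python's standard `mailbox` module. It is not taken from the briefing or from ePADD; the function name `mbox_to_eml` and the output file naming are assumptions made for this example, and a real trial migration would also need to check attachments, encodings and metadata before and after.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path, out_dir):
    """Write each message in an Mbox file to its own .eml file.

    Hypothetical helper for illustration only. Returns the list of
    paths written, in mailbox order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # create=False: fail if the source does not exist instead of creating it.
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, message in enumerate(box):
            target = out / f"message_{index:05d}.eml"
            # as_bytes() serialises headers and body without the
            # Mbox "From " separator line that precedes each message.
            target.write_bytes(message.as_bytes())
            written.append(target)
    finally:
        box.close()
    return written
```

Mbox stores many messages in one file, whereas EML stores one message per file, so the sketch shows the difference in storage approach directly. Comparing message counts and checksums of attachments on both sides is one simple way to evaluate such a trial.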

Questions and issues remain to be explored, particularly regarding web links. "Email archives may be more valuable to historians as they acquire critical mass". Some things that institutions can do are:
  • Participate in the Email Preservation Task Force
  • Share your workflows with the Email Preservation Task Force and the community
  • Run trial migrations between different email formats such as PST, Mbox and EML and blog about your findings
  • Support open source tools such as ePADD and make them sustainable! 

Friday, July 14, 2017

Six Priority Digital Preservation Demands

Six Priority Digital Preservation Demands. Somaya Langley. Digital Preservation at Oxford and Cambridge. 13 July, 2017.
     The post discusses the gap between the activities that need to be done as part of an end-to-end digital stewardship workflow and the maturity level of digital preservation systems. It presents a list of "my six top ‘digital preservation demands’ (aka user requirements)":
  • Integration with  other systems: A digital preservation ‘system’ is only one piece of a much larger puzzle. In the ‘digital ecosystem’  end-to-end digital stewardship workflows are of primary importance. Metadata and/or files should flow from one system to another.  
  • Standards-based:  Libraries rely on standards. "If we don’t use (or fully implement) existing standards, this means we risk mangling data, context or meaning; potentially losing or not capturing parts of the data; or just wasting a whole lot of time".
  • Error Handling: With more work and fewer people, we "have to be smart about how we work. This requires prioritisation." Preservation workflows need smarter systems to aid the processes, especially in understanding and resolving errors from the many third-party tools.
  • Reporting: The types of reports needed include: 
    • High-level reporting (annual reports, monthly reports, reports to managers, projections, costings, etc.)
    • Collection and preservation management reporting 
    • Reporting for preservation planning purposes, based on preservation plans
  • Provenance: Support for identifying where a file has come from. This is often handled by metadata and documenting changes as Provenance Notes. The essential metadata (administrative, preservation, structural, technical) needs to be captured and retained.
  • Managing Access Rights: We must ensure that we can provide access to content in a variety of ways that support both the content and its users, particularly the new ways in which users want to work with it.
"It’s imperative to keep in mind the whole purpose of preserving digital materials is to be able to access them...." Addressing these six concerns may not be easy, but we need to "make iterative improvements, one step at a time."

Thursday, July 13, 2017

Integrating Research Data management and digital preservation systems at the University of Sheffield

Integrating Research Data management and digital preservation systems at the University of Sheffield. Chris Loftus. Digital Preservation Coalition. 31 May 2017.
     The University Library is leading the active management and curation of research data within the institution. This includes implementing a research data catalogue and repository powered by figshare. The library safeguards its collections and other University assets using Rosetta, a digital preservation platform from Ex Libris. "We are now working with figshare and Ex Libris to integrate both services to provide seamless preservation of published research data across the research lifecycle." The integration is intended to:

  • provide a complete lifecycle data management service for the university’s research community; 
  • identify, understand and act on risks associated with preserving data sets; 
  • better inform advice and guidance around use of data formats for sharing and preservation purposes; and 
  • encourage researchers to share their data more openly with others by guaranteeing the long term sustainability of that data.
Initial integration work uses the OAI-PMH protocol and METS packages to transfer content efficiently. Rosetta will be the dark archive, with figshare the interface for researchers and external users.
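
To give a sense of what an OAI-PMH transfer involves, the following is a minimal sketch that builds a `ListRecords` request URL and reads record identifiers and the resumption token from a response. It is not Sheffield's implementation: the endpoint URL in the usage note, the `mets` metadata prefix (prefixes are repository-specific) and both function names are assumptions for illustration. Only the verb, argument names and XML namespace come from the OAI-PMH 2.0 protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="mets", resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper).

    The default prefix "mets" is an assumption; a repository advertises
    its supported prefixes through the ListMetadataFormats verb.
    """
    if resumption_token:
        # resumptionToken is an exclusive argument: no metadataPrefix with it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(
            "./oai:ListRecords/oai:record/oai:header/oai:identifier", OAI_NS
        )
    ]
    token_element = root.find("./oai:ListRecords/oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)
```

A harvester would call `list_records_url("https://repository.example.org/oai")` (a placeholder address), fetch the result, and keep requesting with the returned token until `parse_list_records` yields `None` for it.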

File format issues: Research data is often in niche and proprietary formats. Of the material currently deposited in the archive, only a small percentage was recognised by a DROID survey. They will need to invest some time to identify and plan for these formats, and hopefully the work will be of use to the wider digital preservation community.
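
The share of recognised files in such a survey can be estimated from a CSV export of the identification results. The sketch below is a hypothetical helper, not part of DROID. It assumes the export has a `PUID` column that is empty when no format was identified and a `TYPE` column that marks folders; both column names should be checked against the actual export before use.

```python
import csv
import io


def identification_rate(csv_text, puid_column="PUID", type_column="TYPE"):
    """Return (identified, total) file counts from a survey CSV export.

    Column names are assumptions about the export layout. Rows whose
    type is "Folder" are skipped, since only files receive a format
    identifier.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    identified = 0
    total = 0
    for row in reader:
        if (row.get(type_column) or "").strip().lower() == "folder":
            continue
        total += 1
        # A non-empty identifier means the file format was recognised.
        if (row.get(puid_column) or "").strip():
            identified += 1
    return identified, total
```

Dividing the first value by the second gives the recognition rate, and listing the file extensions of the unidentified rows is a natural next step when planning for niche formats.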

Metadata: They plan to improve the quality and volume of metadata accompanying research data. Material from researchers often lacks needed metadata, which can cause future data access issues. They are investigating solutions.