Showing posts with label Preservation model.

Friday, June 21, 2019

A new maturity model for digital preservation

A new maturity model for digital preservation. Jenny Mitcham. Digital Preservation Coalition blog. 20 June 2019.
     This blog post discusses a new digital preservation maturity model, not yet publicly available, that the DPC has been developing in a project with the UK Nuclear Decommissioning Authority (NDA). They wanted to "measure the NDA’s digital preservation maturity now. This is helpful to do at the start of any digital preservation journey, both to see where you are now, and to consider where you would like to be. The benchmarking tool could then be applied at the end of the project and at regular intervals further down the line to measure progress and review goals." Digital preservation is usually implemented incrementally, so being able to map progress is incredibly valuable. The effort started with the maturity model created by Adrian Brown of the UK Parliamentary Archives, then made some substantial changes to it, such as reworking the roadmaps and promoting the community element. "Digital preservation is not a one-off activity and in an evolving field like this it is important to keep one eye on the horizon to see what is coming up and consider how to react."
The model, called the DPC Rapid Assessment Model, should be (a rough self-assessment sketch follows the list):
  • Applicable for all organizations
  • Applicable for all content of long-term value
  • Preservation strategy and solution agnostic
  • Based on existing good practice
  • Simple to understand and quick to apply
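Since the model itself had not been published at the time of writing, the following is only a rough sketch of the benchmarking idea: score yourself per section now, set target levels, and re-assess later to measure the gap. The section names and the 0-4 scale are assumptions for illustration, not the DPC's actual categories.

# A minimal sketch of incremental maturity benchmarking. The sections and
# the 0-4 scale below are hypothetical, not the real DPC model's levels.

CURRENT = {"organisational viability": 2, "metadata management": 1,
           "content preservation": 1, "community engagement": 3}
TARGET  = {"organisational viability": 3, "metadata management": 3,
           "content preservation": 2, "community engagement": 3}

def gap_report(current, target):
    """Return the sections where the target maturity level is not yet met."""
    return {section: target[section] - level
            for section, level in current.items()
            if target[section] > level}

if __name__ == "__main__":
    for section, gap in gap_report(CURRENT, TARGET).items():
        print(f"{section}: {gap} level(s) to go")

Re-running the same assessment at regular intervals and comparing the gap report over time is the "map progress" step the post describes.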


Tuesday, March 29, 2016

Exploring appraisal, quality assurance and risk assessment in the data continuum

Exploring appraisal, quality assurance and risk assessment in the data continuum.   Linda Ligios. Pericles blog. 8 March 2016.
     PERICLES presented a workshop on "appraisal, quality assurance and risk assessment in relation to the lives of complex digital objects." It introduced these key concepts:
  • model-driven preservation in a continually evolving environment
  • appraisal processes that lend themselves to being automated,
  • development plans for tools on appraisal, risk assessment and quality assurance 
Three main dimensions (a toy scoring sketch follows the list):
  1. Risk – probability of an entity being non-usable
  2. Proximity – time frame in which we consider risk/impact
  3. Impact – potential loss of functionality and cost of mitigating actions
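As a purely illustrative sketch, the three dimensions could be combined into a single triage score. The weighting and discounting below are assumptions, not the PERICLES project's actual risk model.

# Illustrative only: a toy way to combine risk, proximity and impact
# into one triage score for prioritising mitigating actions.

def triage_score(risk: float, proximity_years: float, impact: float) -> float:
    """risk: probability (0-1) the entity becomes non-usable;
    proximity_years: how soon the risk is expected to materialise;
    impact: estimated loss/cost (arbitrary units) if it does."""
    # Nearer risks are discounted less, so they score higher.
    urgency = 1.0 / (1.0 + proximity_years)
    return risk * impact * urgency

# Example: a 40% risk of losing a rendering environment within 2 years,
# with an estimated impact of 100 units.
print(triage_score(risk=0.4, proximity_years=2, impact=100))  # ~13.3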
"Policies should always reflect the vision of the institution and therefore contain principles that are more aspirational in nature."  The  PERICLES model-driven preservation approach:

Model-driven preservation


Friday, October 09, 2015

Benchmarks for Digital Preservation tools

Benchmarks for Digital Preservation tools. Kresimir Duretec, et al. Vienna University of Technology and University of Toronto. October 2015.
     "Creation and improvement of tools for digital preservation is a di cult task without an established way to assess any progress in their quality." Software benchmarking is used to "provide objective evidence about the quality of software tools" but the digital preservation field is "still missing a proper adoption of that method." This paper looks at benchmarks and proposes a model for digital preservation and a standardized way to "objectively compare various software tools relevant to the digital preservation community."

Tuesday, August 11, 2015

Digital Preservation Tools on Github.

Digital Preservation Tools on Github. Chris Erickson. Blog. August 2015.
     While looking for a particular tool I came across several others that look interesting. I have not yet tried them, but this is a reminder that I need to check into them. 
  • epubcheck: a tool to validate EPUB files. It can detect many types of errors in EPUB. OCF container structure, OPF and OPS mark-up, and internal reference consistency are checked. EpubCheck can be run as a standalone command-line tool or used as a Java library (a batch-validation sketch follows this list).
  • preservation-tools: Bundles a number of preservation tools for various file types in a modular way. Includes:
    • PdfHeaderChecker (detects the software used to create a PDF),
    • PdfAValidator (checks via PDFBox whether a PDF/A file is valid; runs through a folder and picks out only the PDF/A files),
    • iTextRepairPdf (takes a PDF file and copies its content page by page to a new, PDF/A-1-conformant PDF file),
    • PdfToImageConverter (converts the PDF files in a given folder to JPEGs, page by page),
    • PdfTwinTest (compares two PDFs line by line and outputs the differences; handy for post-migration quality checking)
  • wail: Web Archiving Integration Layer (WAIL). A graphical user interface (GUI) atop multiple web archiving tools intended to be used as an easy way for anyone to preserve and replay web pages.
  • db-preservation-toolkit: The Database Preservation Toolkit allows conversion between database formats, including connection to live systems, for purposes of digitally preserving databases. The toolkit allows conversion of live or backed-up databases into preservation formats such as DBML or SIARD, XML-based formats created for the purpose of database preservation. The toolkit also allows conversion of the preservation formats back into live systems to restore the full functionality of the databases. For example, it supports a specialized export into MySQL, optimized for phpMyAdmin, so the database can be fully explored through a web interface.
  • DPFManager. DPF Manager is an open source modular TIFF conformance checker that is extremely easy to use, to integrate with existing and new projects, and to deploy in a multitude of different scenarios. It is designed to help archivists and digital content producers ensure that TIFF files are fit for long term preservation, and is able to automatically suggest improvements and correct preservation issues. The team developing it has decades of experience working with image formats and digital preservation, and has leveraged the support of 60+ memory institutions to draft a new ISO standard proposal (TIFF/A) specifically designed for long term preservation of still-images. An open source community will be created and grown through the project lifetime to ensure its continuous development and success. Additional commercial services will be offered to make DPF Manager self-sustainable and increase its adoption.
  • PreservationSimulation. This project is to provide baseline data for librarians and researchers about long-term survival rates of document collections. We have developed computer simulations to estimate document failure rates over a wide variety of conditions. The data from these simulations should be useful to stewards of such collections in planning and budgeting for storage and bandwidth needs to protect their collections.
  • flint: Facilitates configurable file/format validation. Its underlying architecture is based on the idea that file/format validation almost always has a specific use case with concrete requirements that may differ from validation against the official industry standard of a given format.
  • excel: Addresses how best to retain formulas and other essential components of spreadsheets such as Excel. John McGrory, a data curator at the University of Minnesota, published a tool on GitHub that can help; in their data repository the tool is run each time a dataset is submitted, and the resulting files are zipped as the "Archival Version of the Data." Download the software at http://z.umn.edu/exceltool. See also a description of what the tool does: http://hdl.handle.net/11299/171966
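As a small example of putting one of these tools to work, here is a minimal sketch of batch-validating EPUB files with epubcheck's command-line interface from Python. The location of epubcheck.jar and the "incoming" folder are assumptions for your own setup.

# A minimal sketch of batch-validating EPUBs with the epubcheck CLI.
# Adjust EPUBCHECK_JAR to wherever the jar lives on your system.

import subprocess
from pathlib import Path

EPUBCHECK_JAR = "epubcheck.jar"  # assumed path

def validate_epub(path: Path) -> bool:
    """Return True if epubcheck reports no errors for the given file."""
    result = subprocess.run(
        ["java", "-jar", EPUBCHECK_JAR, str(path)],
        capture_output=True, text=True)
    return result.returncode == 0  # epubcheck exits non-zero when errors are found

if __name__ == "__main__":
    for epub in Path("incoming").glob("*.epub"):
        status = "OK" if validate_epub(epub) else "FAILED"
        print(f"{epub.name}: {status}")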

Another successful work meeting on ontologies


Another successful work meeting on ontologies. Johannes Biermann. Pericles Blog. 10 August 2015.
The Centre for Research & Technology, Hellas (CERTH) hosted a technical workshop on the Linked Resource Model (LRM), the LRM service, domain ontologies, ecosystem model and policies.
The workshop discussed progress on the domain ontologies and how the concept of “resource” has been adapted. Other topics included:
  • Art & Media domain ontologies
    Art & Media domain ontologies, to make them useable for the concrete scenario and for similar scenarios. 
  • The Science domain ontology uses Topic Maps to organise the knowledge with an index and thesaurus
  • The Ecosystem ontology has a simplified graphical view, with agents of different types, simplified dependency types, the significance of the resources, and refinement of the process and policy elements
  • Linked Resource Model (LRM), including changes and new features (a toy dependency example follows this list)
  • Linked Resource Model service that includes dynamic parts that can be used to express time, activities, actions, versions and rules.
  • Roles of policies and how they are handled, the tools that drive the policies and how they can be tested. 
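To give a flavour of the kind of dependency modelling discussed, here is a toy example of recording a dependency between two resources as RDF triples with rdflib. The namespace and property names are placeholders, not the actual LRM vocabulary.

# A toy illustration of expressing a dependency between two resources as
# RDF triples. The "ex" namespace and its properties are hypothetical.

from rdflib import Graph, Namespace, URIRef, Literal

EX = Namespace("http://example.org/lrm-sketch#")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)

artwork = URIRef("http://example.org/objects/video-installation-42")
player  = URIRef("http://example.org/software/legacy-player-1.3")

# The artwork depends on a specific rendering environment, and that
# dependency is marked as highly significant.
g.add((artwork, EX.dependsOn, player))
g.add((artwork, EX.significance, Literal("high")))

print(g.serialize(format="turtle"))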


Wednesday, July 22, 2015

Information Governance: Why Digital Preservation Should Be a Part of Your IG Strategy

Information Governance: Why Digital Preservation Should Be a Part of Your IG Strategy. Robert Smallwood. AIIM Community. July 6, 2015.
     The post looks at information governance (IG) and digital preservation. The author wrote the first textbook on information governance and drew on key models such as the Information Governance Reference Model (IGRM), the E-Discovery Reference Model, and the OAIS model. The question to answer is whether long-term digital preservation should be part of an information governance strategy.

Information Governance is defined as: 
a set of multi-disciplinary structures, policies, procedures, processes and controls to manage information at an enterprise level that supports an organization's current and future regulatory, legal, risk, environmental and operational requirements. 
  • "Long term digital preservation applies to digital records that organizations need to retain for more than 10 years."
  • Digital preservation decisions need to be made early in the records lifecycle, ideally before creation.
  • Digital preservation becomes more important as repositories grow and age.

"The decisions governing these long term records - such as digital preservation budget allocation, file formats, metadata retained, storage medium and storage environment - need to be made well in advance of the process of archiving and preserving."

"All this data - these records - cannot simply be stored with existing magnetic disk drives. They have moving parts that wear out. The disk drives will eventually corrupt data fields, destroy sectors, break down, and fail. You can continue to replace these disk drives or move to more durable media to properly maintain a trusted repository of digital information over the long term."

If you move to a cloud provider that makes preservation decisions for you, then "you must have a strategy for testing and auditing, and refreshing media to reduce error rates, and, in the future, migrating to newer, more reliable and technologically-advanced media."
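A simple form of such testing and auditing is periodic fixity checking: record checksums at ingest and re-verify them on a schedule to catch silent corruption. The sketch below uses SHA-256 and a JSON manifest; the folder and manifest paths are placeholders.

# A minimal fixity-audit sketch: record checksums once, re-verify later.

import hashlib, json
from pathlib import Path

MANIFEST = Path("manifest.json")  # placeholder location

def sha256(path: Path) -> str:
    """Compute the SHA-256 digest of a file, reading it in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record(folder: Path) -> None:
    """Write a checksum manifest covering every file in the folder."""
    digests = {p.name: sha256(p) for p in folder.iterdir() if p.is_file()}
    MANIFEST.write_text(json.dumps(digests, indent=2))

def audit(folder: Path) -> list:
    """Return the names of files whose current checksum no longer matches."""
    expected = json.loads(MANIFEST.read_text())
    return [name for name, digest in expected.items()
            if sha256(folder / name) != digest]

# record(Path("archive"))          # run once at ingest
# print(audit(Path("archive")))    # run periodically; an empty list means all files verified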

Your information governance strategy is incomplete if you do not have a digital preservation strategy as well; your organization "will not be well-prepared to meet future business challenges".

Information Governance Reference Model (IGRM) v3.0