Friday, June 21, 2019

A new maturity model for digital preservation

A new maturity model for digital preservation. Jenny Mitcham. Digital Preservation Coalition blog. 20 June 2019.
     This blog discusses a new digital preservation maturity model, not yet publicly available, that the DPC has been developing in a project with the UK Nuclear Decommissioning Authority (NDA). They wanted to "measure the NDA’s digital preservation maturity now. This is helpful to do at the start of any digital preservation journey, both to see where you are now, and to consider where you would like to be. The benchmarking tool could then be applied at the end of the project and at regular intervals further down the line to measure progress and review goals." Digital preservation is usually implemented incrementally, so being able to map progress is incredibly valuable. The effort started with the maturity model created by Adrian Brown of the UK Parliamentary Archives, then made some substantial changes to it, such as changing the roadmaps, promoting the community element, and others. "Digital preservation is not a one-off activity and in an evolving field like this it is important to keep one eye on the horizon to see what is coming up and consider how to react."
The model, called the DPC Rapid Assessment Model, should be:
  • Applicable for all organizations
  • Applicable for all content of long-term value
  • Preservation strategy and solution agnostic
  • Based on existing good practice
  • Simple to understand and quick to apply


Thursday, May 30, 2019

How Archivists Saved Damaged WWII Film

How Archivists Saved Damaged WWII Film for 'The Cold Blue'. Chuck Thompson. Popular Mechanics. May 23, 2019.
    Shrinkage is the biggest problem with old film, according to the article. To use original footage for a new movie, the archivist transferred 15 hours of 16mm film to 4K for the World War II documentary The Cold Blue. The film stock, shot in 1943, has shrunk since it was created. Kodachrome maintains its vibrancy, but tends to lose pliability and moisture over time. All of the outtakes had shrunk by an average of 1.4 percent, which is "considered an immediate preservation risk. Once the film reaches that stage, it’s difficult to preserve the film photochemically because the pitch of the sprocket holes won’t seat accurately on the sprocket teeth of the printers, causing registration and stability issues on the new copy". “Photochemical preservation” means preservation of a film by printing a new copy on new film stock and then developing and fixing the image using traditional photographic processes.

The largest outtake reel had 36,880 frames, at 922 feet long, generating 2.6 TB of data for 25 minutes of run time. The entire project generated just over 80 TB of material. The preservation DPX files were wrapped with Bagger and written to LTO-6 tape. There are three copies of the tapes: one in near-line storage, another offline, and the last sent offsite to maintain geographical separation. The original film was returned to its 25-degree Fahrenheit vault, which slows down any deterioration that may continue.
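
As a rough illustration of that packaging step, here is a minimal sketch using the Library of Congress bagit-python library (rather than the Bagger GUI named above) to wrap a reel of DPX scans in a BagIt bag and validate it; the directory path and bag-info values are hypothetical.

import bagit

# Hypothetical directory holding the DPX frames for one outtake reel
reel_dir = "/data/outtakes/reel_042"

# make_bag() moves the payload into data/ and writes checksum manifests
bag = bagit.make_bag(
    reel_dir,
    bag_info={"Source-Organization": "Example Film Archive"},
    checksums=["md5", "sha256"],
)

# Re-open and validate: re-computes the checksums against the manifests,
# the same kind of check that protects the copies written to LTO tape
bag = bagit.Bag(reel_dir)
print("bag valid:", bag.is_valid())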


Wednesday, May 29, 2019

Digital Data Storage Outlook 2019

Digital Data Storage Outlook 2019. SpectraLogic, May 2019. [Download]
    The fourth annual Data Storage Outlook report from Spectra Logic looks at the management, usage and storage of data. Some notes on data:
  • A 2018 IDC report predicts that the Global Datasphere will grow to 175 zettabytes (ZB) of digital data by 2025, though this report projects that much of this data will never be stored or will be retained for only a brief time. The amount of “stored” digital data is a smaller subset.
  • While there will be great demand for storage, a lack of advances in a particular technology, such as magnetic disk, means a greater use of other storage mediums such as flash and tape.
  • Increasing scale, level of collaboration, and diverse workflows are moving users from traditional file-based storage to object / web storage. Rather than attempting to force all storage into a single model, a sensible combination of both is the best approach.
  • There is a need for project assets to be shared across a team so they can be the basis for future work. An example is video footage that needs to be used by teams of editors, visual effects, audio editing, music scoring, color grading, and more. 
  • The lifetime of raw assets is effectively forever and they may be migrated across storage technologies many times

Tuesday, April 16, 2019

International Federation of Film Archives: Survey on Long-term Digital Storage and Preservation

Digital Statement Part V: Survey on Long-term Digital Storage and Preservation. Céline Ruivo and Anne Gant. FIAF Technical Commission, International Federation of Film Archives. April 2019.
    The sustainability of digital files and formats for long-term preservation has been a major concern in this field for almost two decades now. The FIAF Digital Preservation Principles, published in 2016, use the OAIS (Open Archival Information System) Model. Increasingly, film archives are publishing their own technical specifications online. Digitizing a film includes not only archiving a final result (the master), but also archiving the “raw files” which are uncompressed. Some of the results of the survey of 16 institutions who responded:
  • DPX is the main format used for preservation: 14 archives
  • TIFF is used as a second preservation format: 4 archives
  • Most use 4K resolution when they scan 35mm negatives for preservation 
  • Few have written technical specifications for the deposit of new digital acquisitions, which are mostly born-digital films.
  • Some archives use lossless compression for long-term preservation of a master to reduce storage space 
  • Some archives are considering implementing the FFV1 format this year for storing files. 
  • A frame-level checksum, framemd5, is integrated with the MKV/FFV1 files (see the sketch after this list).
  • Recording restorations back onto film is practiced by 8 archives
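
For context on the framemd5 checksums mentioned above, here is a minimal sketch, assuming FFmpeg is installed, of generating a per-frame MD5 report for an FFV1/MKV master and comparing it against a later migration; the file names are illustrative and the code is not drawn from the survey itself.

import subprocess

def write_framemd5(video_path, report_path):
    # Decode every frame and write one MD5 line per frame to the report
    subprocess.run(
        ["ffmpeg", "-i", video_path, "-f", "framemd5", "-y", report_path],
        check=True,
    )

def frame_lines(report_path):
    # Keep only the per-frame hash lines, skipping the '#' header comments
    with open(report_path) as f:
        return [line for line in f if not line.startswith("#")]

write_framemd5("master_ffv1.mkv", "master.framemd5")
write_framemd5("migrated_ffv1.mkv", "migrated.framemd5")

# Identical frame hashes mean the picture content survived migration unchanged
print("frames identical:",
      frame_lines("master.framemd5") == frame_lines("migrated.framemd5"))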

Sound
    In terms of sound, digital formats are more variable than image formats, depending upon their final distribution (cinema or TV broadcast). RAW formats are usually the same as the restored files.
  • Most of the archives use a tape system for long-term conservation. 
  • They generally wait for 2 generations to migrate their data to reduce the cost 
  • Access storage is by a server that allows direct access to the files.
  • Most of the archives store and manage their files in their own facility. 

Conclusion
     This initial survey of the current digital landscape shows there is much more work to be done to get a global view of digital film archiving, and to hear from more archives at all stages in the development of digital workflows. Some conclusions that can be drawn from the current set of responses:
  1. There is a stabilization in language and a conceptual clarity emerging about the stages of a digital workflow within archives. The terms are becoming clear and are recognized as necessary parts of daily archival practice. This will allow for better information exchange and better comparison of workflows.
  2. There are some choices which seem to be predominant, such as 10 bit Log DPX, for example, or the use of ProRes, LTO, .wav files, etc. It is helpful to detail the reasons why certain archives chose uncommon formats or processes.
  3. There are reasons behind each archive’s choices, which make sense at the given moment. But it would be useful to revisit this survey in 5-10 years (or sooner), and see how digital film practices and archiving are progressing.


Thursday, March 28, 2019

A Public Record at Risk: The Dire State of News Archiving in the Digital Age

A Public Record at Risk: The Dire State of News Archiving in the Digital Age. Sharon Ringel and Angela Woodall. Columbia Journalism Review. March 28, 2019.
     This research report looks at archiving practices and policies across newspapers, magazines, wire services, and digital-only news producers, to identify the current state of preserving content in an age of digital distribution. The majority of news outlets had not given any thought to even basic strategies for preserving their digital content, and not one was properly saving a holistic record of what it produces. Digitization and storage in a database are not alone adequate for long-term preservation. True archiving requires forethought and custodianship.

Staff equate digital backup and storage in Google Docs or content management systems with archiving, but they are not the same; many staff were unable to distinguish between backups and an archive. Backups are temporary copies for data recovery in case of damage or loss, while archiving refers to long-term preservation to ensure records will still be available even as technologies change in the future. They expect that other third-party organizations will have copies, such as the Internet Archive, Google, Twitter, Facebook, etc. Even if the IA has captured a website, what it collects may be limited to the first level of content and could exclude links, comments, personalized content, and different versions of a story.

There are news archiving technologies being developed; preserving digital content is not a technical challenge, but a matter of priority and a decision that demonstrates intent. The findings should be a wake-up call to an industry which claims that democracy cannot be sustained without journalism serving as a truth and accountability watchdog. "In an era where journalism is already under attack, managing its record and future are as important as ever."

The news organizations are interested in the present: “Who cares what existed 10 years ago? I need my thing now. And so, for better, for worse, if there was some value in [archiving], I probably got a better value out of the new thing.” In short, newsrooms are doing very little to nothing to preserve digital news. And none of the content creators interviewed made an effort to download and preserve the stories they produced.

Deletion is the opposite side of preservation and "news organizations, in certain cases, actively remove content from the public record", which raises questions about the role of journalism in society.

Some key findings of the news organizations participating in the research:

  • 19 of the 21 news organizations had no policies or practices for preserving their content and were not taking any protective steps to archive their web output; the remaining two lacked formal preservation strategies.
  • Of the 21 news organizations, only six employed news archivists or librarians, and their other responsibilities took the focus away from the work required for preservation.
  • None of the digital-only outlets had a news librarian or archivist on staff. 
  • None of the news organizations were preserving their social media publications. Only one was attempting to address the problem.
  • Digital-only news organizations were less aware than print publications of the importance of preservation. Very little is currently being done to preserve news.
  • Journalism’s primary focus is on “what is new”; staff are more concerned with preserving the documentation behind their reporting, and what makes it accurate, than with preserving what ultimately gets published.
  • News apps are at high risk of being lost because these new technologies become obsolete before anyone thinks to save them. 
  • Partnerships among archivists, technologists, memory institutions, and news organizations will be vital to ensure future access to digitally distributed news content. Two questions to start with: What should be preserved? Who should preserve it?
  • To enact lasting change, opinion leaders in the field must show staff and management that archiving makes sense, has advantages, and is compatible with their priorities.

News organizations should care about preserving news for the future just as they care about integrity, reliability, and informing the public in the present.


Wednesday, March 27, 2019

Next Phase OAIS Review

Next Phase OAIS Review. Barbara Sierman. Digital Preservation Seeds. March 24, 2019.
     The OAIS standard had its 5-year review in 2017, which resulted in over 200 suggestions for change. All the changes were discussed in the DAI Working Group and most of them were accepted. The next step is for the updated OAIS standard to go to CCSDS and ISO for final approval. The main part of the changes concerned terminology, both clarification and consistency.

Some concepts got a more extensive description, while others have changed. The new OAIS standard shows that a transparent process can lead to a standard that reflects the current practices. The standards group will now have a final opportunity to decide whether all suggested changes are clear and implementable.

Friday, March 22, 2019

Datanomics Costs, Benefits, and Value of Research Data

Datanomics: the value of research data. Neil Beagrie. Jisc Invitational Workshop, Glasgow, February 2019.
 Slides from a presentation on Datanomics: Costs, Benefits, and Value of Research Data. His description of the slides:

Twenty years ago format obsolescence was seen as the greatest long-term threat to digital information.  Arguably, experience to date has shown that funding and organisational challenges are perhaps more significant threats. I hope this presentation helps those grappling with these challenges and shows some key advances in how to use knowledge of costs, benefits and value to support long-term sustainability of digital data and services.

These are the slides from my keynote presentation to the joint Digital Preservation Coalition / Jisc workshop on Digital Assets and Digital Liabilities - the Value of Data held in Glasgow in February 2018. The slides summarise work over the last decade in the key areas of exploring costs, benefits and value for data. The slides posted here have additional slide notes and references to new publications since the workshop and some modifications such as removal of animations.

Some notes from the slides:
Costs. Keeping Research Data Safe (KRDS) rules of thumb:
  1. Getting data in takes about half of the lifetime costs, preservation about a sixth, access about a third.
  2. Preservation costs decline over time.
  3. Fixed costs are significant for most data archives.
  4. Staff are the most significant proportion of archive costs.

The KRDS Benefits Framework: benefits from the curation of research data, arranged on three dimensions.
The anatomy of a benefit triangle:
  1. What is the outcome?
  2. When is it received?
  3. Who benefits?
Valuing Intangible Assets: Measuring the value of intangible assets is much harder than for physical assets. We measure the value of data services, not just the data alone.

Economic Metrics Used
  • Investment value: amount spent on producing the good or service
  • Use value: amount spent by users to obtain the good or service
  • Contingent value: the amount users are “willing to accept” in return for giving up access
  • Efficiency gain: user estimates of time saved by using the Data Service resources
  • Return on investment: the estimated increase in return on investment due to the additional use
Must also look at the Costs of Inaction
  • Rate of loss of research data sets: 17% per annum
  • Partial information loss: 7% per annum
  • Rate of loss for web-links to data: c. 5.5% per annum
  • Access / Data requests fulfilled
  • Delay in elapsed time to fulfill data requests. Up to 6 months

Recommendations: Investigate the relative costs and benefits of curation levels, storage, or appraisal for what to keep.

“Five or six decades since the beginning of the Information Age, the namesake of this age, and the major asset driving today’s economy, is still not considered an accounting asset”

“Corporations typically exhibit greater discipline in tracking and accounting for their office furniture than their data”

Conclusions:
Use cost data to look for trends and to leverage our efforts; investigate the relative costs and benefits of curation levels and storage; and look towards hierarchical curation management.


Monday, March 11, 2019

Arctic World Archive receives more world treasures

Arctic World Archive receives more world treasures. Press release. 21 February 2019.
     Institutions and companies from around the world, including Utah Valley University, have deposited their digital content in the Arctic World Archive in Svalbard, Norway.  The Archive is a repository for world memory where the data will last for centuries.  The Archive is a collaboration between Piql, digital preservation specialists, and Store Norske Spitsbergen Kulkompani (SNSK), a state-owned Norwegian mining company based on Svalbard with vast experience and resources to build and maintain mountain vaults.

The top 10 items of cultural heritage, as nominated by the public, were also stored away for the future. These items include famous religious texts, paintings, architectural designs, science breakthroughs and popular contemporary music.


Saturday, March 09, 2019

What to Keep: A Jisc research data study

What to Keep: A Jisc research data study. Neil Beagrie. Jisc. February 2019.  [PDF]
     This study is about research data management and also appraisal and selection. This is an issue that has become more significant in recent years as volumes of data have grown. "The purpose is to provide new insights that will be useful to institutions, research funders, researchers, publishers, and Jisc on what research data to keep and why, the current position, and suggestions for improvement."

"Not all research data is the same: it is highly varied in terms of data level; data type; and origin. In addition, not all disciplines are in the same place or have identical needs."

"It is essential to consider not only What and Why to keep data, but for How Long to keep it, Where to keep it, and increasingly How to keep it in ways that reflects its potential value, cost, and available funding."

The study lists ten recommendations:
  1. Consider what is transferable between disciplines. Support adoption of effective practice via training, technologies, case studies, and guidance checklists.
  2. Bring communities together with workshops to evolve disciplinary norms 
  3. Harmonise funder requirements for research data where relevant
  4. Investigate the costs and benefits of curation levels, storage, or appraisal for what to keep
  5. Implement the FAIR principles as appropriate for kept data.  
  6. Enhance data discoverability and identification by recording and identifying data generated by research projects in existing research databases.
  7. Require Data Access Statements in all published research articles where data is used as evidence, and encourage adoption of the Transparency and Openness Promotion (TOP) guidelines 
  8. Improve incentives and lower the barriers for data sharing.
  9. Increase publisher and funder collaborations around research data. 
  10. Improve communication on what research data management costs can be funded and by whom
Definition of research: "a process of investigation leading to new insights, effectively shared. It includes work of direct relevance to the needs of commerce, industry, and to the public and voluntary sectors; scholarship ...; the invention and generation of ideas, images, performances, artefacts including design, where these lead to new or substantially improved insights; and the use of existing knowledge in experimental development to produce new or substantially improved materials, devices, products and processes, including design and construction.”

Other notes from the study:
  • Costs of research data management seen as too high
  • Obsolescence of data format or software

The volume of research data and the number of new research data services and repositories is increasing.

"The high-level principles for research data management may be established but the everyday practice and procedures for the full-range of research data, what and why to keep, for how long, and where and how to keep it, are still evolving."

“All those engaged with research have a responsibility to ensure the data they gather and generate is properly managed, and made accessible, intelligible, assessable and usable by others unless there are legitimate reasons to the contrary. Access to research data therefore carries implications for cost and there will need to be trade-offs that reflect value for money and use.”

The Core Trustworthy Data Repositories Requirements notes four curation levels that can be performed by trusted repositories:
a. As deposited
b. Basic curation, e.g., brief checking, addition of basic metadata or documentation
c. Enhanced curation, e.g., conversion to new formats, enhancement of documentation
d. Data level curation (as in c above, with additional editing of data for accuracy)


UVU Cylinder Project

UVU Cylinder Project. Website. Utah Valley University. March 8, 2019.
     This website and the cylinder project were showcased at a digital preservation symposium. The site has an extensive searchable library of cylinders and a Cylinder Player. The un-archived cylinders are in the process of being transcribed, having metadata added, and having their recordings cleaned before being posted to the site.


Wednesday, March 06, 2019

Texas Digital Library Digital Preservation Services

Texas Digital Library Digital Preservation Services. Press release. Texas Digital Library, 5 March 2019. [PDF]
     The organization now offers Digital Preservation Services to its members to help Texas cultural heritage and scholarship stewards provide access for the long term through direct consulting, training, and workflow support, including the right combination of technologies for each member's content needs. The content can be stored in multiple geographically dispersed locations, with fixity checking, in Chronopolis and Amazon through the DuraCloud interface.

Tuesday, March 05, 2019

Accessible Archives Inc. Partners with Portico

Accessible Archives Inc. Partners with Portico. Press release. Accessible Archives, Inc. Mar 05, 2019.
     Accessible Archives Inc., an electronic publisher of full-text primary source historical databases, has partnered with Portico in order to fully support the digital preservation of their content. With more content in a digital-only format, this will help the archival collections remain accessible. Preservation will help to ensure the long-term availability of these resources for future scholars.

Monday, January 28, 2019

Introduction to Digital Preservation: What is Digital Preservation?

Introduction to Digital Preservation: What is Digital Preservation? Bodleian Libraries, Oxford LibGuides. Aug 28, 2018.    
     Digital preservation at Bodleian Libraries is defined as: "The formal activity of ensuring access to digital information for as long as necessary. It requires policies, planning, resource allocation (funds, time, people) and appropriate technologies and actions to ensure accessibility, accurate rendering and authenticity of digital objects. A “lifecycle management” approach to digital preservation is taken, where action is done at regular intervals and future activity is planned. This includes policies and recommendations for appraising and selecting digital information to preserve, acknowledging resources are finite."

There are two different kinds of digital preservation:
  1.  Bit-level Preservation: a "very basic level of preservation of the digital object as it was submitted (literally preserving the bits forming a digital object)." It is a beginning step to the more complete set of digital preservation practices and processes that ensure the survival and usability of digital materials over time (see the fixity sketch after this list).
  2. Logical Preservation: The part of preservation management that ensures the continued usability of content by ensuring the existence of a usable manifestation of the digital object. Sometimes referred to as format preservation or active preservation, it includes:
  • Understanding what digital materials are in the repository.
  • Identifying threats to the materials and planning actions to be taken for at-risk digital materials
  • Putting things into action 
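
As a rough illustration of the bit-level preservation described in point 1, the sketch below records a checksum for every file in a folder and re-verifies those checksums later; the manifest format and paths are hypothetical, not the Bodleian's actual tooling.

import hashlib
from pathlib import Path

def sha256_of(path):
    # Hash the file in 1 MiB chunks so large objects do not exhaust memory
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def record_fixity(folder, manifest):
    # Write "<digest>  <path>" for every file under the folder
    with manifest.open("w") as out:
        for p in sorted(folder.rglob("*")):
            if p.is_file():
                out.write(f"{sha256_of(p)}  {p}\n")

def verify_fixity(manifest):
    # Re-compute each digest and report any file whose bits have changed
    ok = True
    for line in manifest.open():
        digest, path = line.rstrip("\n").split("  ", 1)
        if sha256_of(Path(path)) != digest:
            print("FIXITY FAILURE:", path)
            ok = False
    return ok

# Example (hypothetical paths):
# record_fixity(Path("/repository/collection01"), Path("fixity_manifest.txt"))
# print(verify_fixity(Path("fixity_manifest.txt")))
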
Defining other terms:
  • "Digital curation involves maintaining, preserving and adding value to digital files throughout their lifecycle—not just at the end of their active lives. This active management of digital files reduces threats to their long-term value and mitigates the risk of digital obsolescence. Digital curation includes digital preservation, but the term adds the curatorial aspects of: selection, appraisal and ongoing enhancement of the object for reuse."It is commonly used in the science and social sciences for research data and is often being replaced with research data management, especially when referring to active digital files.
  • Digital archiving is often used interchangeably with digital preservation in archives. It has two main definitions used by computing personnel and archivists and librarians respectively. Recognize both definitions of the term and be aware of the audiences that use this term differently.
    1. The process of storage, backup and ongoing maintenance as opposed to strategies for long-term digital preservation
    2. the long-term storage, preservation and access to information that is "born digital" or for which the digital version is considered to be the primary archive 
  • Digital Stewardship, more commonly used in the US, "combines both curation and preservation—the active life of a digital asset and its continual preservation afterwards for long-term use. But this school of thought splits digital curation & digital preservation into two separate categories and then uses digital stewardship as the umbrella term."
The Bodleian Libraries consider digital preservation to be a "holistic term that includes aspects of digital curation and stewardship". They work with creators to organise and manage their digital objects to preserve them, and to follow best practice for creating and managing active files so that they will be easier to manage and provide access to in the long term.


Wednesday, December 19, 2018

Nothing succeeds like success: An approach for evaluating digital preservation efficacy

Nothing succeeds like success: An approach for evaluating digital preservation efficacy. Stephen Abrams. Paper, iPres 2018. [PDF]
     Digital preservation encompasses the theory and practice of ensuring purposeful future use of digital resources. But how can one tell whether it has been effective or not? The evaluation of the effectiveness of preservation actions has two dimensions: trustworthiness of managerial programs and systems; and successful use of managed resources.
  • Preservation should be viewed as facilitating meaningful communication across time and cultural distance.
  • The preservation field has not yet matured to a point of having established metrics for evaluating the success or failure of its outcomes
  • We should be asking what measures can be used to evaluate the success of the digital preservation efforts
  • A proper model views digital preservation as human communication rather than data management, and evaluates success through operational, not just descriptive, evidence. 
  • The goal of that communication is to transfer an intangible but intentional unit of meaning from the producer to a consumer across temporal, technical, and cultural distance
  • Like any formal discipline, digital preservation should be viewed as a complex of actors, policies, technologies, and practices; its maturity is dependent on its capacity for reflective self-evaluation
  • There are two primary measures of preservation efficacy: trustworthiness of managerial systems and programs; and successful use of preserved resources.
  • Because of the open-ended time horizon of preservation commitments, preservation success should be understood as a provisional, rather than absolute value. One can’t make categorical assertions beyond the ever-forward-moving point of now, since the consequences of the future cannot be fully anticipated
  • A model of the digital preservation enterprise provides a way to analyze, explicate, and understand the domain. It can lead to new criteria and metrics for evaluating success. It also will form the basis for rational prioritization of strategic goals, allocation of programmatic resources, and transparent accountability to stakeholder communities.

Friday, December 14, 2018

In-House Digitization with the Lossless FFV1 Codec At the University of Notre Dame Archives. AMIA Poster

In-House Digitization with the Lossless FFV1 Codec At the University of Notre Dame Archives. Erik Dix and Angela Fritz, University of Notre Dame Archives. AMIA 2018. Poster. [pptx].
     An interesting poster at AMIA which shows their digitization workflow and processing steps from accessioning to preservation system.
WHY FFV1 as a codec for Digital Preservation Masters?
1. Lossless compression (no quality loss)
2. A Standard Definition FFV1 file is ca. 46 % of the size of the uncompressed file.
    A High Definition FFV1 file is ca. 57 % of the size of the uncompressed file.
3. FFV1 is part of the FFmpeg project and open source
4. It is safe for long term preservation.
5. Encoding into FFV1 can be done with low cost Windows PCs.
6. The video is captured in FFV1 in real time.
7. Standard definition FFV1 files can be played with the VLC media player
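
As an illustration of FFV1 encoding with FFmpeg (points 3 and 5 above), here is a minimal sketch, assuming FFmpeg is installed, of encoding a captured file into a lossless FFV1/Matroska preservation master; the input file name and the exact flags are assumptions rather than the settings used at Notre Dame.

import subprocess

subprocess.run(
    [
        "ffmpeg",
        "-i", "capture_uncompressed.mov",  # hypothetical SD capture file
        "-c:v", "ffv1",       # lossless FFV1 video codec
        "-level", "3",        # FFV1 version 3
        "-g", "1",            # intra-only: every frame is a keyframe
        "-slicecrc", "1",     # per-slice CRCs for error detection
        "-c:a", "copy",       # keep the captured audio stream as-is
        "preservation_master.mkv",
    ],
    check=True,
)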

Digitization Workflow

Accessioning as Processing:
  • Archives conducts a preliminary inventory, assigns collection code,  creates CMS record 
  • AV materials transferred to AV Archivist for a preservation and digitization assessment
  • Descriptive and technical metadata gathered
  • Analog materials reorganized and stabilized for long-term storage.
Basic Metadata Creation:
  • AV Archivist creates item–level metadata
  • Descriptive and technical metadata promotes access and discoverability
  • Descriptive metadata added to finding aid and uploaded to the Archives, the IR, and ILS
Inspection & Prep of A-V materials:
  • Only requested AV items or at-risk items will be digitized 
  • Videotapes often require baking or splicing
VCRs without SDI output:
  • The digitization capture card uses SDI [Serial Digital Interface] connections. 
  • VHS, Betamax, and older professional formats, e.g. 1” type C, U-matic don’t have SDI outputs. 
  • A DPS-575 frame synchronizer is used to create an SDI signal from the S-video or other output of these items
  • Basic color correction is done at this step if necessary. 
  • The SDI signal from the frame synchronizer is then split in two to feed a Windows PC for the creation of the FFV1 preservation file and to feed a Mac computer to create an Apple ProRes 422 mezzanine file.
VCRs with SDI output:
  • They have VCRs for the DV tape family from Mini DV up to DVCPro HD, DVCam, and HDV, as well as the Betacam tape family from Betacam to HDCAM, that can output an SDI or HD-SDI signal. 
  • The signal is also split in two to simultaneously create an FFV1 file on a Windows PC and an Apple ProRes 422 file on a Mac.
Digital Preservation System:
  • Use an LTO tape library for the storage of our digitized files. 
  • Currently, the Archives is evaluating digital preservation systems for implementation. 
  • Archives capabilities will be expanded to provide digital preservation micro-services to ensure continued access to its digital collections.


Thursday, December 13, 2018

Why Is the Digital Preservation Network Disbanding? Lessons from organizational challenges

Why Is the Digital Preservation Network Disbanding? Roger C. Schonfeld. The Scholarly Kitchen; Society for Scholarly Publishing. Dec 13, 2018.
     "The long term stewardship of digital objects and collections through digital preservation is an essential imperative for scholarship and society. Yet its value is intangible and its rewards are deferred. It falls on organizations to invest in preservation, often less out of a sense of anticipated exclusive returns and more out of a sense of contributing to a community mission." It is essential that we discuss the lessons we can learn from organizational challenges.

DPN was a commitment to replicate the data of research and scholarship across diverse environments and to enable existing preservation capacity. It offered an elegant technical solution but the product offering was never as clear as it could have been, and ultimately could not be sustained. Most DPN members did not use the network services and membership declined. Some patterns emerged: 
  • Not every storage need requires a preservation solution, and the members were "in some cases, unsuccessful in distinguishing the added value of a preservation solution from cloud storage."
  • Many library systems were not originally prepared to support DPN’s ingest workflow. For a number of members, the content to be preserved was spread across servers and systems, often with limited curatorial control. 
  • The product definition took too long to emerge and the value proposition was not uniformly understood.
  • DPN’s pricing model did not generate the revenue that had been anticipated. 
  • Some libraries signed up more out of courtesy or community citizenship than commitment.
  • Membership models are ill-suited to product organizations and marketplace competition.  
There are broader implications in the disbandment of DPN. The article states that  DPN will not be the last closure, merger, or other reorganization. "It seems clear that we are in a period of instability for collaborative library community efforts and more major changes are surely on the horizon."

Wednesday, December 12, 2018

Preservation of AV Materials in Manuscript Collections. Training for AV format identification and risk assessment. Actions to take


Preservation of AV Materials in Manuscript Collections; Internal Training.  Ben Harry. Brigham Young University. November 2018.
     The presentation is not yet available on the internet. Some notes from the training:
“There is now consensus among audiovisual archives internationally that we will not be able to support large-scale digitisation of magnetic media in the very near future. Tape that is not digitised by 2025 will in most cases be lost.”  -NFSA.gov.au, Oct. 2018

The problem with AV is Fragility:
  • Playback equipment is disappearing
  • Knowledgeable experts are disappearing
  • Materials breaking down
  • Untrained handling easily destroys materials
The solution to the fragility is to address materials in a timely manner:
  • Priority and Speed and Efficiency
  • Train transfer operators
  • Untrained handling easily destroys materials
A Challenge of AV is Neglect:
  • Unable to describe AV Content adequately in finding aids or catalogs. 
  • Requires certain level of specific knowledge of formats and physical carriers.
  • Requires machine to read information that may not be available
  • Time-consuming process for little reward
  • Expensive, unstructured, uncoordinated
To overcome the challenge:
  • Digitize material for description in basic processing
  • Time-consuming process for little reward
  • MUST be a lean process to minimize the effect upon processing
Audio-video preservation requires a certain level of specific knowledge. Staff must be trained to recognize and report AV formats. Also, it is important to have risk assessment guidelines to help make informed decisions. Coordinate efforts and resources to reduce confusion, prioritize and set goals, and unify our proposals for equipment and manpower.


Actions to take:
  • Prioritize Formats for Migration / Reformatting
  • Maintain Transparent Records on Preservation and Access
  • Link Preservation and Access (one does not happen without the other)
  • Provide Curators with AV Assessment tools
  • Organize a Queue System to keep things equitable (what about 12 items per month, per curator? Adjust as Necessary)
  • Create Digital File Naming guidelines
  • Establish Access and Preservation format standards for AV materials:
 For Access and Preservation, the following standards will be used:

Audio Preservation
  • Preservation Format: PCM / wav, 96 kHz sampling, 24-bit depth. 1 GB/Hour
  • Access Copy: mp3. Music: 256 kbps. Voice: 192 kbps.

Video Preservation: Standard Def
  • Preservation Format: ffv1 / mkv 720 x 486. 33 GB/Hour  
  • Access Copy: H.264 / mp4

Video Preservation: Hi Def
  • Preservation Format: ffv1 / mkv Native: 1080i / 1080p. 100 GB/Hour?  
  • Access Copy: H.264 / mp4

Film Preservation
  • Preservation Format: RGB ffv1 / mkv 1080i scan (MPS capability ceiling). 100 GB/Hour?  
  • Access Copy: H.264 / mp4
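
As an illustration of deriving the access copies listed above from the preservation masters, here is a minimal sketch assuming FFmpeg is installed; the file names, and any encoder settings beyond the stated bitrates, are assumptions rather than BYU's actual commands.

import subprocess

# FFV1/MKV preservation master -> H.264/MP4 access copy
subprocess.run(
    ["ffmpeg", "-i", "master_ffv1.mkv",
     "-c:v", "libx264", "-pix_fmt", "yuv420p",
     "-c:a", "aac",
     "access_copy.mp4"],
    check=True,
)

# 96 kHz / 24-bit WAV preservation master -> 192 kbps MP3 access copy (voice)
subprocess.run(
    ["ffmpeg", "-i", "master_96k_24bit.wav",
     "-codec:a", "libmp3lame", "-b:a", "192k",
     "access_copy.mp3"],
    check=True,
)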

Archive and delivery methods:
  • Preservation: Rosetta
  • Access: various options are available. 


Monday, December 10, 2018

A Preservationist’s Guide to the DMCA Exemption for Software Preservation

A Preservationist’s Guide to the DMCA Exemption for Software Preservation. Kee Young Lee and Kendra Albert. Software Preservation Network and the Cyberlaw Clinic @ the Berkman Klein Center. December 10, 2018.  [PDF]
     "The Library of Congress recently adopted several exemptions to the Digital Millennium Copyright Act (DMCA) provision prohibiting circumvention of technological measures that control access to copyrighted works. The exemptions went into effect on October 28, 2018 and last until October 28th, 2021. This guide is intended to help preservationists determine whether their activities fall under the new exemption."  The Software Preservation Network has obtained temporary exemptions which remove the legal liability for circumventing technological protection measures for preserving the software or resulting files, provided that certain conditions are met. These exemptions do not remove legal liability for copyright infringement of the underlying software itself.

The guide provides excellent information on the issues and the exemptions. The exemptions are  generally directed to preservation activities  by libraries, archives, and museums, but there are five criteria required in order to claim the exemption. The library, archive, or museum must:
  1. Make its collections open to the public or routinely available to unaffiliated outside researchers.
  2. Ensure that its collections are composed of lawfully acquired or licensed materials.
  3. Implement reasonable digital security measures for preservation activities.
  4. Have a public service mission.
  5. Have trained staff or volunteers that provide services normally provided by libraries, archives, or museums
In addition, there are requirements for using the preserved software:
  • The computer program must have been lawfully acquired.
  • The software must no longer be reasonably available in the commercial marketplace.
  • The sole purpose of the circumvention activity must be for lawful preservation of the computer program or digital materials that are dependent on a computer program.
  • The computer programs cannot be used for commercial advantage.
  • Use of the exemptions can only be for non-infringing uses of the software.
  • Copies of the computer programs cannot be made available outside of the physical premises of the library, archive, or museum.
These exemptions are only for three years, so evidence of software preservation activities will help to renew the exemptions.

The Guide also includes a DMCA Exemption for Software Preservation Checklist.


Saturday, December 08, 2018

Make The Case for Digital Preservation in Your Organisation

Make The Case for Digital Preservation in Your Organisation. Digital Preservation Coalition. 2018.
     This page provides some useful guides, examples and other resources that can help with building a business case and more broadly making the case for digital preservation within your organisation.
When preparing a business case or briefing, these resources provide an array of helpful information to assist in the construction of a business case, from planning and preparation all the way through to polishing and communicating the finished case for digital preservation in your organisation.
Communication is critical for understanding your stakeholders and creating a foundation for establishing digital preservation within your organisation. These resources provide guidance on engaging with particular audiences.

Thursday, December 06, 2018

3 Principles for Selecting a Digital Preservation Solution


3 Principles for Selecting a Digital Preservation Solution. Daniel Greenberg. Ex Libris. November 29, 2018.
   This post was in honor of World Digital Preservation Day and lists some important elements to remember when reviewing digital preservation systems:

1. Interoperability: handling different types of data and integrating with other systems
  • Support common protocols for harvesting, publishing and searching, e.g. Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and SRU (Search/Retrieve via URL); see the sketch after this list.
  • Ingest content with multiple methods and structures; e.g., BagIt, METS, CSV, and XML.
  • Provide well-documented external APIs
  • Integrate with other information systems
2. Follow Industry Standards, particularly standard metadata schemas and communication protocols. Benefits of doing this:
  • Interoperability between new and existing services and applications.
  • Compliance with policies and regulations.
  • Introduction of innovative features.
  • Enable a robust exit strategy, in case the vendor goes out of business.
3. Scalability:
  • Architectural scalability: Start small and grow big. Ability to expand the throughput over time without compromising performance.
  • Operational scalability: Ability to customize the system to the institutions’ needs.
  • Informational scalability: Keep up with latest strategies, practices, tools and policies by an active user community.
  • Organizational scalability: Administer multiple institutions with a single installation; support a flexible consortium model.
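
To make the interoperability point concrete (see the protocol reference under point 1), here is a minimal sketch of an OAI-PMH ListRecords request for Dublin Core records using only the Python standard library; the endpoint URL is a placeholder, not a real repository.

from urllib.parse import urlencode
from urllib.request import urlopen
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # hypothetical OAI-PMH endpoint

# A single ListRecords request asking for simple Dublin Core metadata
params = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
with urlopen(f"{BASE_URL}?{params}") as response:
    tree = ET.parse(response)

# Print the OAI identifier of each harvested record
ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
for record in tree.findall(".//oai:record", ns):
    identifier = record.find(".//oai:identifier", ns)
    print(identifier.text if identifier is not None else "(no identifier)")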

Wednesday, December 05, 2018

Digital Preservation Network (DPN) Sunset

Community Announcement - DPN Sunset. Digital Preservation Network. December 4, 2018.
     The Digital Preservation Network’s Board of Trustees is ending DPN. The DPN Board "determined that it is not feasible to design and implement changes that would ensure sustainability." 
"The landscape of digital preservation services has changed considerably in the past six years, as have the community’s preservation needs. Our highest priority is to affect an orderly sunset for the organization’s operations and for the disposition of its deposits."

The ending of a community-based organization that provided long-term digital preservation storage highlights the numerous challenges of maintaining digital resources long-term.


Thursday, November 29, 2018

The File Discovery Tool - A simple tool to gather file and filepath information, and ingest into our Rosetta Digital Archive


The File Discovery Tool. Chris Erickson. Brigham Young University. November 29, 2018.
     We have created a File Discovery Tool that analyzes directories of objects and prepares a spreadsheet of all the files it discovers for preservation/ingest. This file allows the curators to discover and work with the materials, select those that need to be preserved, and then add collection and other metadata information. The tool fits our workflow, but the source code may be useful for others trying to accomplish a similar task.

A sample command to run the tool:
>> java -jar FileDiscovery.jar [path name of files to check] [output path name for saving the report]
>> java -jar C:\FileDiscovery\FileDiscovery.jar "R:\test\objects"  C:\output\files
 The commands and syntax are outlined in a brief document: File Discovery Outline
  
The spreadsheet that is created has the following column headings:
 FILENAME, ITEM ID, FILEPATH, BYTESIZE, SIZE, COLLECTION, IE_LEVEL, DATE_CREATED, DATE_MODIFIED, TITLE, CREATOR, DESCRIPTION, RIGHTS_POLICY

Metadata can be added as needed before ingesting the content into Rosetta.
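
As a rough illustration of the general idea (not the BYU File Discovery Tool itself), the sketch below walks a directory tree and writes one CSV row per file with a subset of the columns listed above, leaving the descriptive fields blank for curators to fill in; the paths are hypothetical.

import csv
from datetime import datetime
from pathlib import Path

def discover(root, report):
    columns = ["FILENAME", "FILEPATH", "BYTESIZE",
               "DATE_CREATED", "DATE_MODIFIED", "TITLE", "DESCRIPTION"]
    with open(report, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(columns)
        for path in sorted(Path(root).rglob("*")):
            if path.is_file():
                stat = path.stat()
                writer.writerow([
                    path.name,
                    str(path),
                    stat.st_size,
                    # st_ctime: creation time on Windows, inode change time on Unix
                    datetime.fromtimestamp(stat.st_ctime).isoformat(),
                    datetime.fromtimestamp(stat.st_mtime).isoformat(),
                    "", "",  # TITLE and DESCRIPTION added later by curators
                ])

discover("R:/test/objects", "files_report.csv")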

The files and the metadata can then be submitted to Rosetta using the csv option in the Rosetta File Harvester tool by adding a second row of Dublin Core names in order to map the columns. A standard template has been created to help in preparing the file for ingest: Rosetta File Ingest template (PDF)
The source is available at https://bitbucket.org/byuhbll/filediscovery


The File Harvester tool - Our tool for ingesting content to our Rosetta Digital Archive


The File Harvester tool. Chris Erickson. Brigham Young University. November 29, 2018. 
     We have created a harvester tool for harvesting, processing, and submitting content to Rosetta. Our Library IT department has made this open source. The tool fits our workflow, but the source code may be useful for others trying to accomplish a similar task.

The File Harvester tool gathers content from several different sources:
  • Our hosted CONTENTdm (cdm)
  • Open Journal System (ojs)
  • Internet Archive (ia)
  •  Unstructured files in a folder with metadata in a spreadsheet (csv)
The tool creates SIPs by adding objects and metadata from the specified source, creating a Rosetta METS XML file and a Dublin Core XML file, and placing them in the structure required by our Rosetta system. The objects can either be on the hosted system or in a source folder. The harvest tool can also submit the content to Rosetta for ingest.

The structure is:
  1. Folder: collection-itemid and it contains the dc.xml and subfolder content 
  2. Sub-Folder: content and it contains the mets.xml and the folder streams 
  3. Sub-Folder: streams which contains the file objects
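
As a rough illustration of that SIP layout (not the File Harvester's own code), the sketch below creates the folder structure described above; the dc.xml and mets.xml contents are passed in as placeholder strings rather than real Rosetta METS or Dublin Core records.

import shutil
from pathlib import Path

def build_sip(collection, item_id, source_files, dc_xml, mets_xml, dest="."):
    # 1. collection-itemid folder with the Dublin Core record
    sip = Path(dest) / f"{collection}-{item_id}"
    streams = sip / "content" / "streams"
    streams.mkdir(parents=True, exist_ok=True)
    (sip / "dc.xml").write_text(dc_xml)

    # 2. content sub-folder with the METS record
    (sip / "content" / "mets.xml").write_text(mets_xml)

    # 3. streams sub-folder holding the file objects
    for f in source_files:
        shutil.copy(f, streams / Path(f).name)
    return sip

# Example (hypothetical values):
# build_sip("photos", "0042", ["img_0001.tif"], "<dc/>", "<mets/>")
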
The commands and syntax are outlined in a brief document:
RosettaFile Harvester outline

The source is available at: https://bitbucket.org/byuhbll/rosetta-tools