Saturday, January 31, 2015

Phase Two of POWRR: Extending the Reach of Digital Preservation Workshops

Phase Two of POWRR: Extending the Reach of Digital Preservation Workshops. Danielle Spalenka. January 27, 2015.
     The Digital POWRR Project (Preserving digital Objects with Restricted Resources) will continue the POWRR workshops for two years.

Project team members realized that many information professionals feel overwhelmed by the scope of the digital preservation problem, which prevents them from implementing digital preservation activities. They found that digital preservation is best thought of as an incremental, ongoing, and ever-shifting set of actions, reactions, workflows, and policies. Digital preservation activities can be started by taking small steps to prioritize and triage digital collections, while working to build awareness and advocate for resources.

Some of the resources on the site include: 

Friday, January 30, 2015

Memorial University of Newfoundland selects Ex Libris Solutions, including Rosetta

Memorial University of Newfoundland selects Ex Libris Solutions, including Rosetta. Press Release. Ex Libris. January 28, 2015.
Memorial University of Newfoundland, in Newfoundland and Labrador, Canada, has adopted a suite of Ex Libris solutions comprising the Alma library management solution, the Primo discovery and delivery solution, and the Rosetta digital asset management and preservation system. These solutions replace multiple disparate legacy systems used by the Library.

Rosetta will enable Memorial University to manage and preserve its important collections of Newfoundland's history, including a huge collection of digitized newspapers. Using the Primo search interface for physical collections, digital and digitized assets, and electronic resources, Memorial will provide a seamless discovery experience to users, whatever their learning and teaching needs. 

Thursday, January 29, 2015

University of Arizona selects Ex Libris Rosetta.

University of Arizona selects Ex Libris Rosetta. Press Release. Ex Libris. January 27, 2015.
The University of Arizona has adopted the Rosetta digital management and preservation solution. Rosetta will help the university provide sustained access to scholarly digital content and research to both university members and the broader academic community.

"After evaluating a number of commercial digital preservation systems, we found that Rosetta had the unique capabilities that Arizona requires. Our priorities for 2015 led us to seek a preservation solution that could be used collaboratively by a number of campuses. Rosetta's ability to provide end-to-end digital asset management and preservation for the vast array of assets and research data that the university possesses, its consortial architecture that allows participating institutions to maintain a degree of autonomy, and its ability to act as a transitional component between multiple display layers, made it the clear choice for Arizona."

Monday, January 26, 2015


ForgetIT. Website. January 23, 2015. 

While preservation of digital content is now well established in memory institutions such as national libraries and archives, it is still in its infancy in most other organizations, and even more so for personal content. ForgetIT combines three new concepts to ease the adoption of preservation in the personal and organizational context: 
  1. Managed Forgetting: resource selection as a function of attention and significance dynamics. Focuses on characteristic signal reduction. It relies on information assessment and offers options such as full preservation, removing redundancy, resource condensation, and complete digital forgetting. 
  2. Synergetic Preservation: making intelligent preservation processes a part of the content life cycle and developing solutions for smooth transitions.
  3. Contextualized Remembering: keeping preserved content meaningful by combining context extraction and re-contextualization.
The main expected outcomes are the flexible Preserve-or-Forget Framework for intelligent preservation management and, on top of it, two application pilots: one for personal preservation and one for organizational preservation. It is an important step for managing and preserving new forms of community memory and cultural history. It is also an alternative to the “keep it all” approach in our digital society.

Digital Curation Foundations

Digital Curation Foundations. Stephen Abrams. California Digital Library. January 20, 2015. (PDF).
Digital curation is a complex of actors, policies, practices, and technologies that enables meaningful consumer engagement with authentic content of interest across space and time. The UC Curation Center defines its mission in terms of digital curation rather than digital preservation, because that better expresses the need for coordinated activities of preservation of, and access to, managed assets. It also reflects the idea of ongoing enrichment of managed content rather than just maintaining the content over time. This should ideally start before the assets are created. The approach focuses more on services than on systems, and those services should be delivered at the place they are needed.

The laws of library science deal with use, service to the users, and ongoing change. Every asset should be curated in order to be used. Assets should be usable when and where the user needs them and in accordance with the user's expectations. Digital curation activities must not only be sustainable, but also capable of evolving to meet ever-changing needs and risks. This kind of service requires "administrative, financial, and professional support."

Curation decisions should be made with respect to an underlying theory or conceptual domain model based on first principles. The ultimate goal of digital curation is to deliver content. The digital curation field has reached a stage of maturity where it can usefully draw upon a rich body of theoretical research and practical experience. 

  • The curation imperative: providing highly available, responsive, comprehensive, and sustainable services for access to, and use and enhancement of, authentic digital assets over time.
  • The primary unit of curation management is the digital object
  • The true focus of curation is the underlying information meaning of the objects. "In other words, bits are the means, content is the ends."
Curation services include:
  • Creation / acquisition.
  • Appraisal / selection.
  • Preservation planning.
  • Preservation intervention.
  • Selection of appropriate curation service providers
  • Appropriate micro services

Sunday, January 25, 2015

X-ray technique reads burnt Vesuvius scroll

X-ray technique reads burnt Vesuvius scroll. Jonathan Webb. BBC News. 20 January 2015.
Scientists are using a 3D X-ray imaging technique that can distinguish the ink from the papyrus to read rolled-up scrolls buried by Mount Vesuvius. The technique has identified a handful of Greek letters within a rolled-up scroll. [BYU has used multi-spectral imaging to read the blackened unrolled scroll fragments.] The X-ray phase-contrast tomography technique that yielded the long-hidden letters looks at the bumps on the papyrus rather than the chemicals in the ink. The letters are slightly raised because the ink never penetrated into the fibres of the papyrus, but sat on top of them. Curved letters that stand out from the papyrus fibres are easier to identify than square ones.

Saturday, January 24, 2015

Video Games and the Curse of Retro

Video Games and the Curse of Retro. Simon Parkin. New Yorker. January 11, 2015.
 Almost two and a half thousand MS-DOS computer games have been added to the Internet Archive game collection (which says that "Through the use of the EM-DOSBOX in-browser emulator, these programs are bootable and playable.") The archive has rescued historical games which are unplayable unless you also have the original hardware.

Video games are more prone to obsolescence than other digital products. When hardware and software change, many games become unplayable. Unlike other digital media, video games rely on audiovisual reproduction and on a computer’s ability to execute the coded rules and instructions. Game publishers may not have an incentive to maintain older games, so they become obsolete.

Britain’s National Media Museum established the National Videogame Archive, which aims to “preserve, analyse and display the products of the global videogame industry by placing games in their historical, social, political and cultural contexts.” The Internet Archive, by contrast, makes games playable online. The games are part of our social, political, and cultural context. “We risk ending up in a ‘digital dark age’ because so much material that defines our current era is immaterial and ephemeral.” This is the motivation for many video-game preservationists: save everything before it’s lost, and let the future decide what matters in the long run.

Friday, January 23, 2015

The Dataverse Network

The Dataverse Network. Harvard Dataverse Network. 2014.
The Dataverse Network is an open source application to publish, share, reference, extract and analyze research data. It facilitates making data available to others and replicating the work of other researchers. The network hosts multiple studies or collections of studies, and each study contains cataloging information that describes the data plus the actual data and complementary files.

The Dataverse Network project develops software, protocols, and community connections for creating research data repositories that automate professional archival practices, guarantee long term preservation, and enable researchers to share, retain control of, and receive web visibility and formal academic citations for their data contributions.

Thursday, January 22, 2015

Fighting entropy and ISIL, one image at a time

Fighting entropy and ISIL, one image at a time. Whitney Blair Wyckoff. FedScoop. December 10, 2014.
United States security is generating so much data that traditional disk media is being pushed to its limits, requiring new technologies to safely store all that information. Hitachi Data Systems has a new technology to preserve information on disks in an infinitely expandable array. This platform uses Blu-ray XL M-DISCs that resist environmental conditions and can last for more than 1,000 years. The M-DISC optical solutions have proven survivability and durability. This system represents both "the highest reliability as well as the lowest overall cost of ownership representing superior savings in power, footprint and data reliability."
IT departments can supplement magnetic storage with optical media to create a preservation tier that enables IT managers to migrate data when they want, not when the technology or media forces them. This saves money and allows for more strategic long term planning. Flash media, magnetic tape storage, and regular optical discs are all subject to deterioration and have short life spans. With additional storage servers, the amount of data that can be accessed is unlimited.

The system can preserve data for as long as necessary and provide access to it whenever needed. Benefits include lower operating costs through lower media migration costs, wider environmental storage requirements, migration-free technology upgrades, and high media longevity and durability.

"The cost savings is stark while the possibility of data loss is virtually eliminated."

Wednesday, January 21, 2015

How one of the world’s largest archives is managing the move from parchment to pixels

How one of the world’s largest archives is managing the move from parchment to pixels. David Clipsham. Blog. January 16, 2015.
The role of the UK National Archives is to permanently preserve the records of the UK government that have been selected for their historic value. Because there was no authoritative source of information regarding file formats, they developed PRONOM, a registry of file formats and the applications required to open and read them, and DROID, a freely available open source tool to manage that data and information. Their approach to digital preservation, which they call parsimonious preservation, rests on essentially two principles:
  1. Understand what you have got
  2. Keep it safe
In addition, they have built an infrastructure with programs for virus scanning, file fixity checking to ensure that any digital object received has not been altered or corrupted, identifying the file type, and recording of metadata.
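
As an illustration of the fixity-checking step described above, here is a minimal sketch in Python. It is not the National Archives' own tooling: the function names, the choice of SHA-256, and the manifest layout are all assumptions made for this example. The idea is simply to record a checksum for each file at ingest and to recompute it later to detect alteration or corruption.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum at ingest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose checksum changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Hypothetical walk-through: ingest a file, then alter it and re-check."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "record.txt").write_bytes(b"original content")
        manifest = build_manifest(root)
        before = verify_manifest(root, manifest)
        (root / "record.txt").write_bytes(b"altered content")
        after = verify_manifest(root, manifest)
        return before, after
```

Calling `demo()` verifies the file once untouched and once after it has been overwritten. In practice the manifest would be stored separately from the files it describes, so that damage to one does not silently affect the other.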

Creating and Archiving Born Digital Video

Creating and Archiving Born Digital Video. Library of Congress. December 2, 2014.
Four PDF documents from the Library of Congress / The FADGI Audio-Visual Working Group. They provide practical technical information for both file creators and file archivists to help them make informed decisions when creating or archiving born digital video files and to understand the long term consequences of those decisions.
  •  Part 1. Introduction. Explanatory document.
    These recommended practices are intended to support informed decision-making and guide file creators and archivists as they seek out processes, file characteristics, and other practices that will yield files with the greatest preservation potential.
    The documents and case histories show that there is no one answer to the question “what format should I use to ensure sustainable long term access for my born digital video files?” Instead, there is "a range of solutions based on the fitness for purpose concept where the workflows and deliverables achieve the specific goals set out for the project within the existing constraints and circumstances."
  • Part 2. Eight Federal Case Histories. This report presents eight case histories documenting the current state of practice in six federal agencies working with born digital video, divided into three creating cases and three archiving cases. The goal of the three Creating case histories is to encourage a thoughtful approach from the very beginning of the video production project, which takes sustainability and interoperability into account. The three Archiving case histories show issues of moving the files into repositories, and explore the issues of long term retention and access. The report contains recommended practices, requirements, advice, examples of when following recommended practices is not practical, costs, and lessons learned. At the end are helpful File Characteristic Comparison Tables summarizing the specifications of the creating and archiving case histories, both video and audio data.
  • Part 3. High Level Recommended Practice. This document outlines a set of high level recommended practices for creating and archiving born digital video, with advice for file creators, advice for archivists, and advice for both that transcends life cycle points. Some important general points:
    • Born digital video files should be the highest quality that the institution can afford to make and maintain over the long term.
    • Project planning should include capabilities to create high quality digital video files and metadata from the outset
    • One of the most important functions of archival repositories is to document their holdings.
    • Identify the file characteristics at the most granular level possible, including the wrapper and video stream encoding 
    • It's essential in an archival environment to understand why changes to the technical characteristics of the file are needed and the impacts of these changes on the data.
    • Equally important is documenting all of the changes in order to establish provenance.
    • Create metadata to support life cycle management
    • Plan for access: high quality born digital video files may need additional processing to be made widely available
  • Part 4. Resource Guide. This document includes links to resources including those referred to in the case histories and recommended practices. Contains an excellent resource list to websites, documents, white papers, tools; they cover the areas of storage; transcoding / editing and other technical tools; inventorying and processing; digitizing, capture, preservation & quality control; authenticity, fixity & integrity; file naming; metadata; formats; standards; video creation; equipment and capture devices. 
[The report mentions the program ImgBurn for creating ISO images from files. Be aware the install files include OpenCandy (considered malware/ nuisance ware) which will be flagged by Symantec, AVG, and other anti-virus programs.]

Monday, January 19, 2015

Ensuring long-term access: PDF validation with JHOVE?

Ensuring long-term access: PDF validation with JHOVE? Yvonne Friese. ZBW - Leibniz Information Centre for Economics.  PDF Association. December 17, 2014.
JHOVE is an open source tool for identifying, characterizing and validating twelve common formats such as PDF, TIFF, JPEG, AIFF and WAVE. Pages within a PDF file are usually stored as a page tree, allowing the user to reach a given page as quickly as possible. Common advice for long-term archiving is to preferentially use the PDF/A format. However, this no longer matches the day-to-day reality of many workflows, which use JHOVE for validation tests. The differences between PDF and PDF/A mean that there can be validation errors. JHOVE’s PDF module does offer PDF/A validation, but the feature does not work well: the process does not analyze the content of the data streams, meaning that it cannot validate PDF/A compliance in line with ISO standards. JHOVE is not suited to PDF/A validation, but there are currently no alternatives to JHOVE for validating standard PDFs.

JHOVE can still be useful, provided users understand its error reports and are aware of ways to resolve them. Even with the problems JHOVE remains an excellent option for providing initial guidance.

[In our own institution, we have found JHOVE to be useful in identifying PDF files that have potential problems. Each problem for each source needs to be examined to decide if there is a preservation risk.]

Digital Audio Preservation at MIT: an NDSR Project Update.

Digital Audio Preservation at MIT: an NDSR Project Update. Susan Manus, Tricia Patterson. Library of Congress; The Signal. January 16, 2015.
Report of the residency position, in which Tricia is primarily tasked with: completing a gap analysis of the digital preservation workflows currently in place for audio streaming and preservation, and developing lower-level diagrammatic and narrative workflows. [Workflow images are in the article.] Workflow documentation is receiving increased acknowledgement and appreciation in the preservation environment. The reasons:
  • tested, repeatable road map allows processing of larger projects with efficiency and security
  • detailed workflows show redundancies and deficiencies in processes across departments
  • workflow documents clarify roles and accountability within the chain of custody.
The benefits include getting a better idea of what digitization project documentation is generated and recognizing that the documentation needs to be preserved as well. It has also helped identify steps that would benefit from automation. The process started with itemizing 50-60 delivery requirements, including relevant TRAC requirements (PDF), covering display and interface, search and discovery, accessibility, ingest and export, metadata, content management, permissions, documentation and other considerations. From there, requirements were prioritized on a scale from “might be nice” to “must-have.” The next step is to measure options against the prioritized requirements to determine the needs of the Libraries now. An important part is to provide meaningful access to the audio treasures in the library.
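
Measuring options against prioritized requirements can be sketched as a simple weighted checklist. The following Python example is only an illustration, not the method used at MIT: the numeric weights, the intermediate "should have" level, the requirement names chosen from the categories above, and the two candidate systems are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights; the article only names the two ends of the scale.
PRIORITY_WEIGHTS = {"might be nice": 1, "should have": 2, "must-have": 3}


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: str  # one of the keys in PRIORITY_WEIGHTS


def evaluate(requirements: list[Requirement], supported: set[str]) -> tuple[int, list[str]]:
    """Score one option and list the must-have requirements it fails to meet."""
    score = 0
    missing_must_haves = []
    for req in requirements:
        if req.name in supported:
            score += PRIORITY_WEIGHTS[req.priority]
        elif req.priority == "must-have":
            missing_must_haves.append(req.name)
    return score, missing_must_haves


# Illustrative requirements drawn from the categories named in the article.
REQUIREMENTS = [
    Requirement("search and discovery", "must-have"),
    Requirement("ingest and export", "must-have"),
    Requirement("metadata", "should have"),
    Requirement("accessibility", "might be nice"),
]

# Two made-up candidate systems and the requirements each one meets.
OPTION_A = {"search and discovery", "ingest and export", "accessibility"}
OPTION_B = {"search and discovery", "metadata", "accessibility"}

result_a = evaluate(REQUIREMENTS, OPTION_A)
result_b = evaluate(REQUIREMENTS, OPTION_B)
```

Reporting unmet must-haves separately from the score keeps an option from ranking well on many minor features while lacking an essential one.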

Tuesday, January 13, 2015

Preserving Write-Once DVDs

Preserving Write-Once DVDs: Producing Disk Images, Extracting Content, and Addressing Flaws and Errors.  (PDF). An Analytic Report by George Blood Audio Video Film for the Library of Congress. April 2014.
Report on technical issues in reformatting projects for the Library of Congress with an overview of the range and extent of the issues.
  • Most specialists agree that optical disc media, although inexpensive and easy to use, does not support long-term data management.
  • The estimated shelf life of a CD-R or CD-RW is between five and ten years.
  • The lifespan can be lengthened or shortened by environmental and technological factors.
  • Of the 500 discs reformatted, 10% were problematic
  • The report provides brief reviews of all of the tools used to clone discs.
  • Shows the Structure of a VIDEO_TS folder

Thursday, January 08, 2015

GPO Prepares To Become First Federal Agency Named As Trustworthy Digital Repository For Government Information

GPO Prepares To Become First Federal Agency Named As Trustworthy Digital Repository For Government Information. U.S. Government Publishing Office. Press Release. December 18, 2014.
The GPO is preparing to become the first Federal agency to be named as a Trustworthy Digital Repository for Government information through certification under ISO 16363, which defines a recommended practice for assessing the trustworthiness of digital repositories. The Audit and Certification checklist will be used by an accredited outside organization. This would be the first Federal agency to be certified.

To begin the audit process, GPO will be one of 5 institutions to receive a resident through the National Digital Stewardship Residency program to work for one year on preparation for the audit and certification of FDsys as an ISO 16363 Trustworthy Digital Repository.

The GPO has also recently changed its name to the Government Publishing Office.

Tuesday, January 06, 2015

SanDisk vows: We'll have a 16TB SSD WHOPPER by 2016.

SanDisk vows: We'll have a 16TB SSD WHOPPER by 2016. Chris Mellor. The Register.
SanDisk plans to offer a 16 TB WORM (Write Once Read Many) flash storage device. It could have a use case for archived content. There have been problems with flash storage, but archived data is not rewritten, so the technology's write endurance limitations won’t matter in that case.

Report Available for the 2014 DPOE Training Needs Assessment Survey

Report Available for the 2014 DPOE Training Needs Assessment Survey. Barrie Howard, Susan Manus. The Signal. Library of Congress. January 6, 2015.
An executive summary (PDF) and full report (PDF) of the survey results are now available. The survey was an effort to get a sense of the state of digital preservation practice and understand more about what capacity exists for organizations and professionals to effectively preserve digital content.
The most significant takeaways are:

  1. an overwhelming expression of concern that respondents ensure their digital content is accessible for 10 or more years (84%), 
  2. evidence of a strong commitment to support employee training opportunities (83%). 
  3. a substantial increase across all organizations in paid full-time or part-time professional staff with practitioner experience (13%)
  4. an increase in staffing for digital preservation (46% FTE, 51% various staff)
  5. increase in organizations providing financial support for training (82%)
The type of digital content held by each institution:
  1. reformatted material digitized from collections already held (83%), 
  2. born-digital content created by and for the organization (76.4%), 
  3. deposited digital materials managed for other individuals or institutions (45%). 
Findings on training:
  1. Online delivery is trending upward across many sectors to meet the constraints of reduced travel and professional development budgets.
  2. The survey shows that small, in-person workshops are the most preferred training option, followed by webinars and self-paced online courses as the next two choices.
  3. Respondents identified a clear need for technical training to assist staff in understanding and applying specific digital preservation techniques in their daily work, followed by training focused on strategic planning, management and administration, project management, and fundamentals.

TIMBUS Project Web Portal: a Gateway to the TIMBUS Tools

TIMBUS Project Web Portal: a Gateway to the TIMBUS Tools. Timbus Project website. December 19, 2014.

The EU-cofunded TIMBUS project focuses on resilient business processes that keep data accessible over long periods. Continued accessibility is often considered as a set of activities carried out in the isolation of a single domain. TIMBUS, however, considers the dependencies on third-party services, information and capabilities that will be necessary to validate digital information in a future usage context. TIMBUS will deliver activities, processes and tools that ensure continued access to the services and software needed to produce the context within which information can be accessed, properly rendered, validated and transformed into knowledge.
This approach extends traditional digital preservation approaches by introducing the need to analyse and sustain accessibility to business processes and the supporting services, and it aligns preservation actions more fully with enterprise risk management (ERM) and business continuity management (BCM).  The complexity and scale of enterprise business processes makes TIMBUS exceptionally relevant.

This website is a gateway to the outputs of the 4-year TIMBUS Project. It focuses on materials that bridge the gap between the complex research carried out and the organizations and industries looking to implement direct, usable approaches to the digital preservation of their business processes. The site contains:
  • Tools to collect relevant information from software and systems to generate a picture of a whole network of processes.
  • Legalities Lifecycle Management tools and training
  • Risk assessment tools and recommendations about collected process data
  • Digital Preservation Expert Suite, which includes tools to gather risk assessments and to propose digital preservation as a possible solution.


Friday, January 02, 2015

The State Library of NSW selects Ex Libris and Axiell Solutions

The State Library of NSW selects Ex Libris and Axiell Solutions. Press release. 4 December 2014.
The State Library has announced it will adopt a suite of Ex Libris solutions which includes Alma, a library management solution, the Primo discovery and delivery solution, and the Rosetta digital asset management and preservation system. It will also adopt the Adlib Archive software to manage its archival collections. These solutions replace multiple legacy systems used by the Library. "Rosetta’s comprehensive management and preservation capabilities for digital and digitized collections will ensure that the vast collections held in trust for the people of NSW will be preserved and available in the future."