Monday, January 26, 2015

ForgetIT

ForgetIT. Website. January 23, 2015. 

While preservation of digital content is now well established in memory institutions such as national libraries and archives, it is still in its infancy in most other organizations, and even more so for personal content. ForgetIT combines three new concepts to ease the adoption of preservation in the personal and organizational context: 
  1. Managed Forgetting: resource selection as a function of attention and significance dynamics. Focuses on characteristic signal reduction. It relies on information assessment and offers options such as full preservation, removing redundancy, resource condensation, and complete digital forgetting. 
  2. Synergetic Preservation: making intelligent preservation processes a part of the content life cycle and developing solutions for smooth transitions.
  3. Contextualized Remembering: keeping preserved content meaningful by combining context extraction and re-contextualization.
The main expected outcomes are the flexible Preserve-or-Forget Framework for intelligent preservation management and, on top of it, two application pilots: one focusing on personal preservation and one on organizational preservation. It is an important step for managing and preserving new forms of community memory and cultural history. It is also an alternative to the “keep it all” approach in our digital society.




Digital Curation Foundations

Digital Curation Foundations. Stephen Abrams. California Digital Library. January 20, 2015. (PDF).
Digital curation is a complex of actors, policies, practices, and technologies that enables meaningful consumer engagement with authentic content of interest across space and time. Curation decisions should be made with respect to an underlying theory or conceptual domain model based on first principles. The ultimate goal of digital curation is to deliver content. The digital curation field has reached a stage of maturity where it can usefully draw upon a rich body of theoretical research and practical experience.

Sunday, January 25, 2015

X-ray technique reads burnt Vesuvius scroll

X-ray technique reads burnt Vesuvius scroll. Jonathan Webb. BBC News. 20 January 2015.
Scientists are using a 3D X-ray imaging technique that can distinguish the ink from the papyrus to read rolled-up scrolls buried by Mount Vesuvius. The technique has identified a handful of Greek letters within a rolled-up scroll. [BYU has used multi-spectral imaging to read the blackened unrolled scroll fragments. More here.] The X-ray phase-contrast tomography technique that yielded the long-hidden letters looks at the bumps on the papyrus rather than at chemicals in the ink. The letters are slightly raised because the ink never penetrated the fibres of the papyrus but sat on top of them. Curved letters that stand out from the papyrus fibres are easier to identify than square ones.


Saturday, January 24, 2015

Video Games and the Curse of Retro

Video Games and the Curse of Retro. Simon Parkin. New Yorker. January 11, 2015.
Almost two and a half thousand MS-DOS computer games have been added to the Internet Archive game collection (which says that "Through the use of the EM-DOSBOX in-browser emulator, these programs are bootable and playable.") The archive has rescued historical games that would otherwise be unplayable without the original hardware.

Video games are more prone to obsolescence than other digital products. When hardware and software change, many games become unplayable. Unlike other digital media, video games rely on audiovisual reproduction and on a computer’s ability to execute the coded rules and instructions. Game publishers may not have an incentive to maintain older games, so they become obsolete.

Britain’s National Media Museum established the National Videogame Archive, which aims to “preserve, analyse and display the products of the global videogame industry by placing games in their historical, social, political and cultural contexts.” The Internet Archive, by contrast, makes games playable online. The games are part of our social, political, and cultural context. “We risk ending up in a ‘digital dark age’ because so much material that defines our current era is immaterial and ephemeral.” This is the motivation for many video-game preservationists: save everything before it’s lost, and let the future decide what matters in the long run.

Friday, January 23, 2015

The Dataverse Network

The Dataverse Network. Harvard Dataverse Network. 2014.
The Dataverse Network is an open source application to publish, share, reference, extract and analyze research data. It makes it easier to share data with others and to replicate the work of other researchers. The network hosts multiple studies or collections of studies, and each study contains cataloging information that describes the data plus the actual data and complementary files.

The Dataverse Network project develops software, protocols, and community connections for creating research data repositories that automate professional archival practices, guarantee long term preservation, and enable researchers to share, retain control of, and receive web visibility and formal academic citations for their data contributions.

Thursday, January 22, 2015

Fighting entropy and ISIL, one image at a time

Fighting entropy and ISIL, one image at a time. Whitney Blair Wyckoff. FedScoop. December 10, 2014.
United States security is generating so much data that traditional disk media is being pushed to its limits, requiring new technologies to safely store all that information. Hitachi Data Systems has a new technology to preserve information on disks in an infinitely expandable array. This platform uses Blu-ray XL M-DISCs that resist environmental conditions and can last for more than 1,000 years. The M-DISC optical solutions have proven survivability and durability. This system represents both "the highest reliability as well as the lowest overall cost of ownership representing superior savings in power, footprint and data reliability."
IT departments can supplement magnetic storage with optical media to create a preservation tier that enables IT managers to migrate data when they want, not when the technology or media forces them. This saves money and allows for more strategic long term planning. Flash media, magnetic tape storage, and regular optical discs are all subject to deterioration and have short life spans. With additional storage servers, the amount of data that can be accessed is unlimited.

The system can preserve data for as long as necessary and make it accessible whenever needed. Benefits include lower operating costs through lower media migration costs, wider environmental storage requirements, migration-free technology upgrades and high media longevity and durability.

"The cost savings is stark while the possibility of data loss is virtually eliminated."






Wednesday, January 21, 2015

How one of the world’s largest archives is managing the move from parchment to pixels

How one of the world’s largest archives is managing the move from parchment to pixels. David Clipsham. Blog. January 16, 2015.
The role of the UK National Archives is to permanently preserve the records of the UK government that have been selected for their historic value. Because there was no authoritative source of information regarding file formats, they developed PRONOM, a registry of file formats and the applications required to open and read them, and DROID, a freely available open source tool to manage that data and information. Their approach to digital preservation, which they call parsimonious preservation, rests essentially on two principles:
  1. Understand what you have got
  2. Keep it safe
In addition, they have built an infrastructure with programs for virus scanning, file fixity checking to ensure that any digital object received has not been altered or corrupted, identifying the file type, and recording of metadata.
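
The fixity check mentioned above can be illustrated with a short sketch. This is not the National Archives' own tooling: the function names, the choice of SHA-256, and the manifest layout (a mapping from file path to the checksum recorded at ingest) are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Compare checksums recorded at ingest with freshly computed ones.

    `manifest` is a hypothetical dict of file path -> recorded checksum.
    Returns a dict of file path -> "ok", "altered", or "missing".
    """
    report = {}
    for path, expected in manifest.items():
        if not Path(path).is_file():
            report[path] = "missing"
        elif sha256_of(path) == expected:
            report[path] = "ok"
        else:
            report[path] = "altered"
    return report
```

A repository would typically record the checksum when an object is received and rerun a check like this on a schedule, so that any file reported as "altered" or "missing" can be restored from another copy.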


Creating and Archiving Born Digital Video

Creating and Archiving Born Digital Video. Library of Congress. December 2, 2014.
Four PDF documents from the Library of Congress / The FADGI Audio-Visual Working Group. They provide practical technical information for both file creators and file archivists to help them make informed decisions when creating or archiving born digital video files and to understand the long term consequences of those decisions.
  •  Part 1. Introduction. Explanatory document.
    These recommended practices are intended to support informed decision-making and guide file creators and archivists as they seek out processes, file characteristics, and other practices that will yield files with the greatest preservation potential.
    The documents and case histories show that there is no one answer to the question “what format should I use to ensure sustainable long term access for my born digital video files?” Instead, there is "a range of solutions based on the fitness for purpose concept where the workflows and deliverables achieve the specific goals set out for the project within the existing constraints and circumstances."
  • Part 2. Eight Federal Case Histories. This report presents eight case histories documenting the current state of practice in six federal agencies working with born digital video, divided into three creating cases, and three archiving cases. The goal of the three Creating case histories is to encourage a thoughtful approach from the very beginning of the video production project, one that takes sustainability and interoperability into account. The three Archiving case histories show the issues involved in moving files into repositories and explore long term retention and access. The report contains recommended practices, requirements, advice, examples of when following recommended practices is not practical, costs, and lessons learned. At the end are helpful File Characteristic Comparison Tables summarizing the specifications of the creating and archiving case histories, for both video and audio data.
  • Part 3. High Level Recommended Practice. This document outlines a set of high level recommended practices for creating and archiving born digital video, with advice for file creators, for archivists, and for both that transcends life cycle points. Some important general points:
    • Born digital video files should be the highest quality that the institution can afford to make and maintain over the long term.
    • Project planning should include capabilities to create high quality digital video files and metadata from the outset.
    • One of the most important functions of archival repositories is to document their holdings.
    • Identify the file characteristics at the most granular level possible, including the wrapper and video stream encoding.
    • It's essential in an archival environment to understand why changes to the technical characteristics of the file are needed and the impacts of these changes on the data.
    • It is equally important to record all changes in order to document provenance.
    • Create metadata to support life cycle management.
    • Plan for access: high quality born digital video files may need additional processing to be made widely available.
  • Part 4. Resource Guide. This document includes links to resources, including those referred to in the case histories and recommended practices. It contains an excellent list of websites, documents, white papers, and tools covering storage; transcoding / editing and other technical tools; inventorying and processing; digitizing, capture, preservation & quality control; authenticity, fixity & integrity; file naming; metadata; formats; standards; video creation; equipment and capture devices.
[The report mentions the program ImgBurn for creating ISO images from files. Be aware the install files include OpenCandy (considered malware/ nuisance ware) which will be flagged by Symantec, AVG, and other anti-virus programs.]


Monday, January 19, 2015

Ensuring long-term access: PDF validation with JHOVE?

Ensuring long-term access: PDF validation with JHOVE? Yvonne Friese. ZBW - Leibniz Information Centre for Economics.  PDF Association. December 17, 2014.
JHOVE is an open source tool for identifying, characterizing and validating twelve common formats such as pdf, tiff, jpeg, aiff and wave. Pages within a PDF file are usually stored as a page tree, allowing the user to reach a given page as quickly as possible. Common advice for long-term archiving is to preferentially use the PDF/A format. However, this no longer matches the day-to-day reality of many workflows which use JHOVE for validation tests. The differences between PDF and PDF/A mean that there can be validation errors. JHOVE’s PDF module is nominally capable of validating PDF/A files, but the feature does not work well. The process does not analyze the content of the data streams, meaning that it cannot validate PDF/A compliance in line with ISO standards. JHOVE is not suited to PDF/A validation, but there are currently no alternatives to JHOVE for validating standard PDFs.

JHOVE can still be useful, provided users understand its error reports and are aware of ways to resolve them. Even with the problems JHOVE remains an excellent option for providing initial guidance.

[In our own institution, we have found JHOVE to be useful in identifying PDF files that have potential problems. Each problem for each source needs to be examined to decide if there is a preservation risk.]

Digital Audio Preservation at MIT: an NDSR Project Update.

Digital Audio Preservation at MIT: an NDSR Project Update. Susan Manus, Tricia Patterson. Library of Congress; The Signal. January 16, 2015.
Report of the residency position, in which Tricia is primarily tasked with: completing a gap analysis of the digital preservation workflows currently in place for audio streaming and preservation, and developing lower-level diagrammatic and narrative workflows. [Workflow images are in the article.] Workflow documentation is receiving increased acknowledgement and appreciation in the preservation environment. The reasons:
  • tested, repeatable road map allows processing of larger projects with efficiency and security
  • detailed workflows show redundancies and deficiencies in processes across departments
  • workflow documents clarify roles and accountability within the chain of custody.
The benefits include a better idea of what documentation a digitization project generates and the recognition that this documentation needs to be preserved as well. It has also helped identify steps that would benefit from automation. The process started with itemizing 50-60 delivery requirements, including relevant TRAC requirements (PDF), covering display and interface, search and discovery, accessibility, ingest and export, metadata, content management, permissions, documentation and other considerations. From there requirements were prioritized on a scale from “might be nice” to “must-have.” The next step is to measure options against our prioritized requirements to determine the needs of the Libraries now. An important part is to provide meaningful access to the audio treasures in the library.
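
Measuring options against prioritized requirements can be sketched as a simple weighted score. This is only an illustration, not the method used at MIT: the numeric weights, the middle "should have" level, the requirement names, and the two options are all hypothetical. The article names only the two end points of the scale.

```python
# Hypothetical weights; the article gives only the ends of the scale.
PRIORITY_WEIGHTS = {"might be nice": 1, "should have": 2, "must-have": 3}


def score_option(requirements, supported):
    """Score one candidate system against prioritized requirements.

    requirements: dict of requirement name -> priority label
    supported: set of requirement names the option satisfies
    Returns (weighted score, sorted list of unmet must-haves).
    """
    score = sum(PRIORITY_WEIGHTS[priority]
                for name, priority in requirements.items()
                if name in supported)
    unmet = sorted(name for name, priority in requirements.items()
                   if priority == "must-have" and name not in supported)
    return score, unmet


# Illustrative data only.
requirements = {
    "metadata export": "must-have",
    "access permissions": "must-have",
    "waveform display": "might be nice",
}
options = {
    "Option A": {"metadata export", "waveform display"},
    "Option B": {"metadata export", "access permissions"},
}
for name, supported in options.items():
    print(name, score_option(requirements, supported))
```

Reporting unmet must-haves separately from the score keeps an option from looking attractive merely because it satisfies many low-priority items.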




Tuesday, January 13, 2015

Preserving Write-Once DVDs

Preserving Write-Once DVDs: Producing Disk Images, Extracting Content, and Addressing Flaws and Errors.  (PDF). An Analytic Report by George Blood Audio Video Film for the Library of Congress. April 2014.
Report on technical issues in reformatting projects for the Library of Congress with an overview of the range and extent of the issues.
  • Most specialists agree that optical disc media, although inexpensive and easy to use, does not support long-term data management.
  • The estimated shelf life of a CD-R or CD-RW is between five and ten years.
  • The lifespan can be lengthened or shortened by environmental and technological factors.
  • Of the 500 discs reformatted, 10% were problematic.
  • The report provides brief reviews of all of the tools used to clone discs.
  • The report shows the structure of a VIDEO_TS folder.
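
As a rough illustration of checking extracted content, the sketch below inspects a VIDEO_TS folder. It is not taken from the report: the function name and the two checks are assumptions, based on the conventional DVD-Video layout in which a VIDEO_TS.IFO file is present and each .IFO information file is accompanied by a same-named .BUP backup copy.

```python
from pathlib import Path


def check_video_ts(folder):
    """Return a list of problems found in a VIDEO_TS folder.

    Checks only that VIDEO_TS.IFO exists, that each .IFO file has a
    same-named .BUP backup, and that no .IFO/.BUP/.VOB file is empty.
    """
    files = {p.name.upper(): p for p in Path(folder).iterdir() if p.is_file()}
    problems = []
    if "VIDEO_TS.IFO" not in files:
        problems.append("VIDEO_TS.IFO is missing")
    for name, path in sorted(files.items()):
        stem, _, ext = name.rpartition(".")
        if ext == "IFO" and stem + ".BUP" not in files:
            problems.append(f"{name} has no .BUP backup")
        if ext in {"IFO", "BUP", "VOB"} and path.stat().st_size == 0:
            problems.append(f"{name} is empty")
    return problems
```

A check like this only flags structural gaps; it does not verify that the video streams themselves are readable or playable.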
 

Thursday, January 08, 2015

GPO Prepares To Become First Federal Agency Named As Trustworthy Digital Repository For Government Information

GPO Prepares To Become First Federal Agency Named As Trustworthy Digital Repository For Government Information. U.S. Government Publishing Office. Press Release. December 18, 2014.
The GPO is preparing to become the first Federal agency to be named as a Trustworthy Digital Repository for Government information through certification under ISO 16363, which defines a recommended practice for assessing the trustworthiness of digital repositories. An accredited outside organization will carry out the assessment using the Audit and Certification checklist.

To begin the audit process, GPO will be one of 5 institutions to receive a resident through the National Digital Stewardship Residency program to work for one year on preparation for the audit and certification of FDsys as an ISO 16363 Trustworthy Digital Repository.

The GPO has also recently changed its name from the Government Printing Office to the Government Publishing Office.


Tuesday, January 06, 2015

SanDisk vows: We'll have a 16TB SSD WHOPPER by 2016.

SanDisk vows: We'll have a 16TB SSD WHOPPER by 2016. Chris Mellor. The Register.
SanDisk plans to produce a 16 TB WORM (Write Once Read Many) flash drive, which could be used for archived content. There have been problems with flash storage, but archived data is not rewritten, so the technology's write endurance limitations won’t matter in that case.


Report Available for the 2014 DPOE Training Needs Assessment Survey

Report Available for the 2014 DPOE Training Needs Assessment Survey. Barrie Howard, Susan Manus. The Signal. Library of Congress. January 6, 2015.
An executive summary (PDF) and full report (PDF) of the survey results are now available. The survey was an effort to get a sense of the state of digital preservation practice and understand more about what capacity exists for organizations and professionals to effectively preserve digital content.
The most significant takeaways are:

  1. an overwhelming concern among respondents to ensure that their digital content remains accessible for 10 or more years (84%),
  2. evidence of a strong commitment to support employee training opportunities (83%),
  3. a substantial increase across all organizations in paid full-time or part-time professional staff with practitioner experience (13%),
  4. an increase in staffing for digital preservation (46% FTE, 51% various staff), and
  5. an increase in organizations providing financial support for training (82%).
The type of digital content held by each institution:
  1. reformatted material digitized from collections already held (83%),
  2. born-digital content created by and for the organization, which trails close behind (76.4%),
  3. deposited digital materials managed for other individuals or institutions (45%).
Training:
  1. Online delivery is trending upward across many sectors to meet the constraints of reduced travel and professional development budgets.
  2. The survey shows that small, in-person workshops are the most preferred training option, followed by webinars and self-paced online courses as the next two choices.
  3. Respondents identified a clear need for technical training to assist staff in understanding and applying specific digital preservation techniques in their daily work, followed by training focused on strategic planning, management and administration, project management, and fundamentals.

TIMBUS Project Web Portal: a Gateway to the TIMBUS Tools

TIMBUS Project Web Portal: a Gateway to the TIMBUS Tools. Timbus Project website. December 19, 2014.

The EU-cofunded TIMBUS project focuses on resilient business processes and on keeping their data accessible over long periods. Continued accessibility is often considered as a set of activities carried out in isolation within a single domain. TIMBUS, however, considers the dependencies on third-party services, information and capabilities that will be necessary to validate digital information in a future usage context. TIMBUS will deliver activities, processes and tools that ensure continued access to services and software to produce the context within which information can be accessed, properly rendered, validated and transformed into knowledge.
This approach extends traditional digital preservation approaches by introducing the need to analyse and sustain accessibility to business processes and the supporting services, and it aligns preservation actions more fully with enterprise risk management (ERM) and business continuity management (BCM).  The complexity and scale of enterprise business processes makes TIMBUS exceptionally relevant.

This website is a gateway to the outputs of the 4-year TIMBUS Project. It focuses on materials that bridge the gap between the complex research carried out by the project and the organizations and industries looking to implement direct, usable approaches to the digital preservation of their business processes. The site contains:
  • Tools to collect relevant information from software and systems to generate a picture of a whole network of processes.
  • Legalities Lifecycle Management tools and training.
  • Risk assessment tools and recommendations about collected process data.
  • Digital Preservation Expert Suite, which includes tools that gather risk assessment information and offer digital preservation as a possible solution.


     

Thursday, December 18, 2014

In an All-Digital Future, It’s the New Movies That Will Be in Trouble.

In an All-Digital Future, It’s the New Movies That Will Be in Trouble. Bilge Ebiri. New York Media. December 16, 2014.
In 2007, researchers forecast that around 50 percent of the world’s movie screens would be digital by 2013. By the end of 2013, the figure was closer to 90 percent. In a short time, film has gone from an industry standard to a novelty.
Digital seemed by far the best option, but for long term preservation it has turned into "something of a catastrophe". “At this time, the longevity of digital files of moving images is anybody’s guess. We do know that it is much, much shorter than the longevity of photochemical film.” If hard drives aren’t occasionally turned on, the speaker notes, they start to become unusable.

Two famous examples of the perils of digital preservation:
  1. When the makers of Toy Story attempted to put their film out on DVD a few years after its release, they discovered that many of the film's original digital files had been corrupted.
  2. A similar fate came close to befalling Toy Story 2 when someone accidentally hit a “delete” button.
The irony of the digital revolution: It’s the newer movies that are in trouble. For content that was born in digital form, "all we can do is migrate the digital files as often as possible.” That requires technology and resources that go beyond what most organizations are able to handle.

The physical deterioration of drives and discs and chips isn’t the only thing digital filmmakers need to worry about. Digital files are also prone to become outdated, with software upgrades and new programs that render previous ones obsolete or unusable. Formats may be changing every 18 months to two years and may not be compatible with each other.

Part of the problem is that preservation isn’t a for-profit endeavor, so many do not want to spend a lot of money and space to preserve resources. But it becomes more important when considering the long term view. “There’s this notion, which is not true, that digital is very inexpensive. Filmmakers and studios are saving a lot of money in production and post-production costs because of digital, and that’s a good thing. But because of that, many people don’t really understand that they’re putting their assets at risk by wholesale transferring to digital and then not keeping the originals.”

“This is not a new problem. In the 1970s and '80s, some film companies took all of their motion-picture film and transferred it to ¾-inch video, which was thought of as a preservation medium. They threw away their originals! And ¾-inch video was not a good format. In fact, it was a terrible format! This is happening with digital now. They’ve already sloughed off their nitrate collections, and there are actually discussions in some of the studios to get rid of their 35mm collections as well.”

However, film may not be as dead as some seem to think. Some archives have discussed manufacturing film themselves, if and when companies like Kodak or Agfa or Fuji go out of business. Sooner or later, there will be other strategies for the long-term preservation of digital material. “I even saw someone discussing the idea of shooting it all up into space and then waiting for it to come back around again,” he says. “That sounded like pure science-fiction, but who knows?”

Large studios are making sure that all the digital files associated with a multi-billion-dollar movie will be duplicated many times and securely placed in multiple locations. Others may not have the resources to preserve the content. Some relevant questions to ask:
  •  What will happen to them over the course of what is sure to be multiple format changes? 
  • Is somebody making sure their hard drives and the files are still usable? 
  • Have they been distributed into multiple locations? 
  • Will their producers and distributors remain solvent enough over the years to care for the content?
The story of cinema is the story of discovery. Movies once considered afterthoughts can, over time, become beloved classics. A print of a film long forgotten might turn up in a foreign archive and get revived. That may not be possible in an all-digital future, where moving-image files will need regular maintenance and upgrades to keep them viable. A forgotten movie, in other words, will be an extinct movie. 

Celluloid is far from a perfect medium, but it can survive even if some frames or reels are damaged or missing. As with books, the simplicity of the physical medium holds the key to its longevity.


Wednesday, December 17, 2014

Aligning Customer Needs: Business Process Management (BPM) and Successful Change Management in the L. Tom Perry Special Collections

Aligning Customer Needs: Business Process Management (BPM) and Successful Change Management in the L. Tom Perry Special Collections. Joseph Gordon Daines III. Library Leadership & Management. November 2014. PDF
A lot of archival processing happens before an archival or manuscript collection can be made available for research use by patrons. It is central to the archival endeavor. This article looks at the role of business process management (BPM) in automating many of the workflows used to manage the manuscript and archival collections.

BPM is a field of management focused on aligning organizations with the needs and wants of their customer bases. The Special Collections department identified its customer bases as its curatorial staff and its patrons. Enabling the curatorial staff to more efficiently prepare manuscript collections for research use would also enable better customer service. Several different BPM techniques were used to gain an understanding of the curatorial needs of the department as it automated the workflows. This enabled the department to successfully simplify and streamline its workflows during the course of automating them. The end result has been more efficient processing of archival collections and better service for our patrons.

A review of the requirements showed a need for two types of functionality:
  1. task management, and 
  2. archival content management.
Business process:
  • Systematic management, measurement and improvement of all company processes through cross-functional teamwork and employee empowerment.
  • Standardize activities and processes in order to improve organizational efficiency.
  • Business process: “a series of interrelated activities, crossing functional boundaries, with specific inputs and outputs.”
  • The tools that will be examined are process mapping, process modeling, statements of work, and use cases. Processes are modeled using at least one of the following charts: general process charts, process flow diagrams, process activity charts, or flowcharts.
  • Flowcharts are useful for identifying decision points and parallel activities in a process and for gaining an understanding of the sequence of activities in the process.
  • Statements of Work: A specific statement regarding the requirements needed in a service contract. The statement of work should include all aspects of job requirements, performance and assessment.
  • Use cases also help identify the actors involved in various activities and what they want from those activities. For the purposes of use cases, actors are defined as “anything that interfaces with your system—for example, people, other software, hardware devices, data stores, or networks. Each actor defines a particular role.” Use cases typically include two components—a diagram featuring the actor(s) and how they interact with the system and a flow of events statement. The flow of events statement is “a series of declarative statements listing the steps of a use case from the actor’s point of view.”
  • ProcessMaker provides a SOW template that aided the project in automating the workflow comprising the department’s implementation of the archival business process.
  • The use of BPM tools and techniques in the Perry Special Collections provided the department with a methodology to examine and improve the workflow used to provide access to archival materials.
  • Business processes enable leaders to make informed decisions that can improve a library’s ability to deliver its services.
  • BPM tools are not difficult to use and provide a wide range of benefits. Library leaders should use BPM tools to lead successful change initiatives.





Tuesday, December 16, 2014

A picture is worth a thousand (coherent) words: building a natural description of images

A picture is worth a thousand (coherent) words: building a natural description of images. Google Research Blog.
Google has developed a machine-learning system that can automatically produce captions to accurately describe images the first time it sees them. It can describe a complex scene which requires a deeper representation of what’s going on in the scene, capturing how the various objects relate to one another and translating it all into natural-sounding language. The full paper "Show and Tell: A Neural Image Caption Generator" is here.

Quantifying and Valuing the Wellbeing Impacts of Culture and Sport.

Quantifying and Valuing the Wellbeing Impacts of Culture and Sport. Daniel Fujiwara, et al. UK Department for Culture, Media & Sport. April 2014. PDF

A study to develop the evidence base on the well-being impacts of cultural engagement that provides new evidence of the link between our policies and the social impacts of engagement in culture. It presents the results of an analysis of the association between culture, sport and measures of subjective well-being.

When allocating scarce public resources, we would ideally like to know the costs and benefits of different allocation decisions.

A significant association was also found between frequent library use and reported well-being. Using libraries frequently was valued at £1,359 per person per year for library users, or £113 per person per month, the third highest value.
  

Monday, December 08, 2014

Agreement Elements for Outsourcing Transfer of Born Digital Content.

Agreement Elements for Outsourcing Transfer of Born Digital Content. Ricky Erway, Ben Goldman and Matthew McKinley. Dublin, Ohio: OCLC Research. 2014. [PDF]
The article Swatting the Long Tail of Digital Media: A Call for Collaboration (2012) held that few institutions would have the hardware, software, and expertise to read all digital media types. A group of archival practitioners started a pilot project to test outsourcing the transfer of content from physical media they couldn’t read in-house. They realized the need for agreements between repositories and service providers to spell out the terms of such collaboration. The group began compiling a list of elements that should be considered when creating these agreements.

This article suggests elements to consider when creating an agreement for outsourcing the transfer of born-digital content from a physical medium, while encouraging adherence to both archival principles and technical requirements. The main areas are:
  1. General Provisions: desired outcome, description of work, responsibilities and liabilities
  2. Information Supplied by Service Provider: handling instructions
  3. Information Supplied by Client: content, inventory
  4. Statement of Work: processing, exceptions, documentation, delivery, acceptance
  5. Cost and Liability: schedule of costs and charges, responsibilities of each party
The parties should agree upon a clear set of requirements regarding the services that the Service Provider is to provide. 




Wednesday, December 03, 2014

Want a 100TB disk drive? You'll have to wait 'til 2025

Want a 100TB disk drive? You'll have to wait 'til 2025. Computerworld. Nov 25, 2014.

An industry consortium released a road map showing that new recording technologies could yield 100TB hard drives in about 10 years.

As disk drive densities increase, the potential for data errors also increases due to a phenomenon known as superparamagnetism, where the magnetic pull between bits on a platter's surface can randomly flip them, thus changing their value from one to zero or zero to one. "Thus higher storage capacities requires the introduction of new digital storage technology."

Tuesday, December 02, 2014

Introducing the New Forever Flash: The Best Business Model in Storage Gets Even Better

Pure Storage has introduced a new approach to storage: a plan called Forever Flash. They view it as perpetual storage [in the business sense, not in the digital preservation sense]. It is intended to help customers get off of the expensive and disruptive 3-year tech refresh and replace cycle. It is maintenance coverage that proactively protects all hardware and software on the array with replacement parts and support as needed, including SSDs, for as long as a customer remains on maintenance and support.

- 11.20.2014

Monday, November 24, 2014

Curation Costs Exchange: Supporting Smarter Investments in Digital Curation

Curation Costs Exchange: Supporting Smarter Investments in Digital Curation. Sarah Middleton. Educause Review Online. November 10, 2014.

Tools to manage and estimate costs have not been integrated into other digital curation processes or tools. To determine why that is so, a consortium of 13 European cost modeling specialists launched the Collaboration to Clarify the Costs of Curation (4C) project.

4C seeks to help organizations better understand the costs and benefits of digital curation and preservation, and to help users draw together existing and useful resources so they can both make their own assessment of existing models and develop their own cost modeling exercises. The Curation Costs Exchange (CCEx), a platform for the exchange and comparison of digital curation costs and cost information, is a key 4C project deliverable developed to support these goals.

The Cost Comparison Tool enables the exchange of sensitive data and gives users the opportunity to identify greater efficiencies, better practices, and valuable information exchanges among peers. There is also the Understand Your Costs toolkit. The Economic Sustainability Reference model highlights key digital curation concepts, relationships, and decision points in a complex problem space, helping users benchmark and compare their own local models.

Are libraries sustainable in a world of free, networked, digital information?


Are libraries sustainable in a world of free, networked, digital information? Lluís Anglada. El profesional de la información. 7 November 2014. [PDF]

Interesting article looking at libraries through the stages of modernization, automation and digitization, and at a formula for evaluating the importance of libraries to society. The article concludes that "if the current generation of librarians does not introduce radical changes in the role of libraries, their future is seriously threatened."

The proposed formula states that sustainability equals value divided by cost, where value is use minus dysfunctions, adjusted by perceptions of the library: "S = (U - D + 2P) / C".
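
As a rough illustration of the formula, the following Python sketch computes S from the four terms. The function name, the 0-10 scoring scale and the sample numbers are assumptions made for this example; the summary above does not say how U, D, P or C are measured.

```python
def sustainability(use, dysfunctions, perception, cost):
    """Return S = (U - D + 2P) / C, the formula quoted above."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (use - dysfunctions + 2 * perception) / cost


# Hypothetical scores on an arbitrary 0-10 scale (not taken from the article).
baseline = sustainability(use=6, dysfunctions=2, perception=3, cost=5)
improved = sustainability(use=6, dysfunctions=2, perception=5, cost=5)
print(baseline, improved)
```

Because perception carries a weight of 2, raising P by one point adds 2/C to S, twice the effect of an equal gain in use. That is in line with the article's emphasis on perception.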

Libraries are changing because of technology and needs, but there is a danger that people will perceive them as unable to provide the information that users demand. If this continues, those funding the libraries will provide less support. The perceptions must change in order for libraries to be sustainable.
Some thoughts from the article:
  • Libraries are changing from being a space to store, locate and use books to places where people interact and socialize. This should transform the perception that citizens have of their libraries, seeing them as places to ‘change lives by giving people the tools they need to succeed’.
  • Libraries depend on public funding, and their future depends on the perception or mental image of libraries held by administrators and policy makers who allocate budgets  
  • Libraries used to show statistical data on resources; they must now show their value to those who support them financially 
  • The emergence of new roles for libraries does not mean that all library services have evolved over time. In the new environment, some traditional strengths of libraries are weakening.
  • Library catalogues and automated systems were innovative in the ’80s, but have been stuck in outmoded practices. Users have adapted quickly to the ‘googlization’ of information and do not understand why they should have to look in different places to get a unique solution to an information need. 
  • Two key elements for future library sustainability: perception and adaptation to a new paradigm
  • The perception of libraries is increasingly attached to the printed book: from 69% of Americans in 2005 to 75% in 2010.
  • Libraries may end up being seen as useful only to preserve the past (i.e. the printed book), and consequently of little use to handle digital information.
  • The library has been steadily declining in importance in university budgets.
  • People sustain libraries because of a positive perception and a feeling that the libraries are important. We believe that society still needs the functions performed by libraries and librarians, but the feeling alone does not make them immediately sustainable. 
  • We must soon establish a new stereotype of ‘library’ in people’s minds, one that is not based on the physicality of the buildings or books, but focuses on the role of support and assistance in the difficult process of using information and transforming it into knowledge. 
  • The creation of perceptions of a library and librarian that are associated with assistance regarding information is a contribution that has not yet been made. 
  • This is the challenge and responsibility for young librarians: to create a new perception of our profession. We must establish a new stereotype of ‘library’ in people’s minds, one that is not based on the physicality of the buildings and books, but on the role of support and assistance in the difficult process of using information and transforming it into knowledge.


BYU professor leads the way in digitizing Victorian era literature.

BYU professor leads the way in digitizing Victorian era literature. Aaron Butler, Jaren Wilkey. BYU News Release. November 20, 2014.
The Victorian Short Fiction Project is a research venture to get students more involved in exploring the Victorian literature in BYU's special collections library. The project wiki has nearly 200 transcribed stories in an online repository, which have been viewed more than 150,000 times.
  • “I wanted [the students] to experience the sense of discovery that comes from archival research and to sample literature beyond their anthology,”
  • “The purpose of the project is the students. We are training the next generation of digital humanists — people who are trained in the humanities but see the potential of digital technology. The students’ electronic texts reach far beyond the classroom and will reside in a public space after the semester ends. One of the most important legacies we can pass on to our students is an understanding and appreciation of the strengths of both material and electronic texts. They will need to be stewards of both.”

Sunday, November 23, 2014

Digital Preservation File Format Policies of ARL Member Libraries: An Analysis

Digital Preservation File Format Policies of ARL Member Libraries: An Analysis. Kyle Rimkus et al. D-Lib Magazine. March/April 2014.

Repository managers often settle on a smaller set of formats to simplify management; the formats vary by institution. Many institutions have a migration strategy to migrate digital objects from the great multiplicity of formats used to create digital materials to a smaller, more manageable number of standard formats that can still encode the complexity of structure and form of the original.

Open file formats are generally preferred to closed, proprietary formats because the way they encode content is transparent. On the other hand, adoption of a proprietary file format by a broad community of content creators, disseminators and users is often considered a reliable indicator of that format's longevity. Additional qualities such as complexity, the presence of digital rights management controls, and external dependencies are also seen as relevant factors to consider when assessing file formats for preservation. There is, however, no failsafe formula for file format policy decisions. Here are some of the formats that are most often mentioned in preservation policies:

The five most commonly occurring file formats in all policies:
  1. Tagged Image File Format (extension TIFF, or TIF) (115),
  2. Waveform Audio File Format (WAV) (80), 
  3. Portable Document Format (PDF) (74), 
  4. JPEG (JPG, JPEG) (70), and 
  5. Plain text document (TXT, ASC) (69). 
The five most frequently occurring file formats given High Confidence in all policies:
  1. Tagged Image File Format (TIFF, TIF) (88), 
  2. Plain text document (TXT, ASC) (52), 
  3. Portable Document Format (PDF) (49), 
  4. Waveform Audio File Format (WAV) (47), and 
  5. Extensible Markup Language (XML) (47). 
 The five most frequently occurring file formats given Medium Confidence in all policies:
  1. Quicktime (MOV, QT) (47), 
  2. Microsoft Excel (XLS) (39), 
  3. Microsoft Word (DOC) (38), 
  4. Microsoft Powerpoint (PPT) (38), and 
  5. RealAudio (RAM, RA, RM) (35).
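
The counts above can be compared with a few lines of Python. This is only a sketch: it assumes that the High Confidence count for a format is a subset of that format's count across all policies, which the summary above does not state explicitly. JPEG and XML are left out because each appears in only one of the two lists.

```python
# Counts copied from the lists above: (all mentions, High Confidence mentions).
counts = {
    "TIFF": (115, 88),
    "WAV": (80, 47),
    "PDF": (74, 49),
    "TXT": (69, 52),
}

# Share of each format's mentions that were rated High Confidence
# (assumes the second number is a subset of the first).
shares = {fmt: high / total for fmt, (total, high) in counts.items()}

# Print formats from the highest share to the lowest.
for fmt, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{fmt}: {share:.0%}")
```

Under that assumption, TIFF and plain text receive High Confidence in roughly three quarters of their mentions, while WAV and PDF sit closer to 60 to 65 percent.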

Saturday, November 22, 2014

Five steps to decide what data to keep

Five steps to decide what data to keep. Angus Whyte. Digital Curation Centre. 31 October 2014.
 This guide aims to help UK Higher Education Institutions aid their researchers in making informed choices about what research data to keep. 

It will be relevant to researchers making decisions on a project-by-project basis, or formulating departmental guidelines. It assumes that decisions on particular datasets will normally be made by researchers with advice from the appropriate staff (e.g. academic liaison librarians) and taking into account any institutional policy on Research Data Management (RDM) and guidance available within their own domain.

Step 1. Identify purposes that the data could fulfill
Step 2. Identify data that must be kept
Step 3. Identify data that should be kept
Step 4. Weigh up the costs
Step 5. Complete the data appraisal 

The final step is to weigh the value of the data and any costs still to be incurred, "considering the long-terms aims, the qualities you identified, the time and money already invested in it and the risks of being unable to prepare any ‘must keep’ data for preservation."



Angus Whyte, Published: 31 October 2014

Friday, November 14, 2014

Guidelines for the creation of an institutional policy on digital preservation.

Guidelines for the creation of an institutional policy on digital preservation. Nestor. November 2014. [PDF].

nestor (Network of Expertise in long-term STORage and accessibility of digital resources in Germany) has just translated its guidelines on institutional preservation policies into English. The guideline provides digital archives with assistance in creating their own institutional policy on digital preservation. It addresses the questions:
  1. What is the purpose of a policy?
  2. What must a policy cover?
  3. How is a policy produced?
It also addresses policies in cooperative long-term preservation and gives a generic example of an institutional policy. Some items of note:
  • The publication of institutional preservation policies has emerged as a good way to increase transparency. A policy document helps an institution to understand the challenges and to commit to a task.
  • It sets out lastingly effective basic strategic and organisational elements of a digital archive and helps to increase confidence overall. In this way policies help to preserve the digital information of yesterday and today in a reliable manner and to safeguard it for tomorrow’s users.
  • Digital preservation is not an end in itself; it is always aimed at a "designated community". 
  • A digital archive needs a systematically developed and generally complex technical infrastructure.
  • The construction of the technical infrastructure is thus dependent on the overall strategic and tactical planning of the institution as a whole, which ought to remain stable and as independent as possible from the rapid technological changes in the digital world. 
  • The establishment of preservation policies can, under both scenarios, make a significant contribution to clarity in relation to the areas for joint action, differences, opportunities and risks that can be created.
This is an excellent resource on writing digital preservation policies.


POWRR Tool Grid

POWRR Tool Grid. COPTR Consortium. November 2014.

The Digital POWRR Project has produced version 2 of the Digital POWRR Tools Grid. The Grid, which helps practitioners find software tools to solve their digital preservation challenges, provides information about almost 400 digital preservation tools. The Tools Grid can also be found on a new domain for community owned digital preservation resources: Digipres Commons.

Digipres Commons highlights useful collaborative preservation resources from around the web as well as hosting these other collaborative services:
  • The COPTR tools registry
  • The Digital Preservation Question and Answer site
  • The File Formats aggregation service
The main topics of tools, subdivided by material or format, are:
  •  Access, Use and Reuse 
  •  Create or Receive (Acquire) 
  •  Cross-Lifecycle Functions 
  •  Dispose 
  •  Ingest 
  •  Preservation Action 
  •  Preservation Planning 
  •  Store 

Seagate preps for 30TB laser-assisted hard drives

Seagate preps for 30TB laser-assisted hard drives. Lucas Mearian. Computerworld.
Seagate Technology is boosting investments in laser-assisted hard disk drive technology, which it projects could theoretically increase disk capacity to 30TB sometime between 2016 and 2020.

Wednesday, November 05, 2014

Maturity levels & Preservation Policies

Maturity levels & Preservation Policies.
Report of a presentation given at the iPRES 2014 conference in Melbourne on the SCAPE Preservation Policies. The presentation explained the SCAPE Preservation Policy Model and summarized and analysed the findings of 40 actual preservation policies. Organisations often overstretch themselves by formulating preservation policies that are not in line with their maturity (based on the Maturity Model).

Sunday, November 02, 2014

ARMA 2014: The Convergence of Records Management and Digital Preservation

ARMA 2014: The Convergence of Records Management and Digital Preservation. Howard Loos, Chris Erickson. October 2014. [PDF]
Presentation on records management and digital preservation given at the ARMA 2014 conference.
Notes:
  • Records Management mission: To assist departments in fulfilling their responsibility to identify and manage records and information in accordance with legal, regulatory, and operational requirements
  • RIM Life Cycle to DP Life Cycle
  • Challenges and successful approaches
  • Storing records permanently with M-Discs
  • Introduction to Digital Preservation: challenges, format sustainability, media obsolescence, metadata, organizational challenges
  • Life of digital media
  • Best practices and processes
  • OAIS model
  • Rosetta Digital Preservation System
  • Library of Congress Digital Preservation Outreach & Education (DPOE) Network

Why Netflix sends 'Orange is the New Black' to the Library of Congress on videotape

Why Netflix sends 'Orange is the New Black' to the Library of Congress on videotape. And why the library hopes that's going to change. Adi Robertson. The Verge. October 29, 2014.
After companies shut down and collectors lose interest, the Library of Congress is supposed to keep our cultural history intact. But digital media has turned our understanding of preservation on its head.
Artists regularly register their work with the US Copyright Office and, as part of the process, send a copy (in some cases a physical copy) to the registrars, which is then stored by the Library of Congress. The physical copies aren’t the final storage method, just a way to get the file to the library, which then uploads them to its database. Delivering digital files on potentially lower-quality tapes and discs instead of transmitting them directly is an awkward stopgap. A pilot program is in process to allow studios to transfer files directly to the Library of Congress and the US Copyright Office.

Thursday, October 30, 2014

2014 DPOE Training Needs Assessment Survey

2014 DPOE Training Needs Assessment Survey. Barrie Howard, Susan Manus. The Signal. Library of Congress. October 29, 2014.
The survey was an effort to get a sense of the state of digital preservation practice and understand more about what capacity exists for organizations and professionals to effectively preserve digital content.
The most significant takeaways are:
  1. an overwhelming expression of concern that respondents ensure their digital content is accessible for 10 or more years (84%),
  2. evidence of a strong commitment to support employee training opportunities (83%), and
  3. a substantial increase across all organizations in paid full-time or part-time professional staff with practitioner experience (13%).
The type of digital content held by each institution:
  1. reformatted material digitized from collections already held (83%),
  2. born-digital content created by and for the organization, close behind (76.4%), and
  3. deposited digital materials managed for other individuals or institutions (45%).

Investing in Curation: A shared path to sustainability.

Investing in Curation: A shared path to sustainability. 4C Project. October 20, 2014.

Digital curation involves managing, preserving and adding value to digital assets over their entire lifecycle. The active management of digital assets maximises their reuse potential, mitigates the risk of obsolescence and reduces the likelihood that their long-term value will diminish. However, this requires effort so there are costs associated with this activity. As the range of organisations responsible for managing and providing access to digital assets over time continues to increase, the cost of digital curation has become a significant concern for a wider range of stakeholders.

Establishing how much investment an organisation should make in its curation activities is a difficult question. If a shared path can be agreed that allows the costs and benefits of digital curation to be collectively assessed, shared and understood, a wider range of stakeholders will be able to make more efficient investments throughout the lifecycle of the digital assets in their care. With a shared vision, it will be easier to assign roles and responsibilities to maximise the return on the investment of digital curation and to clarify questions about the supply and demand of curation services. This will foster a healthier and more effective marketplace for services and solutions and will provide a more robust foundation for tackling future grand challenges.

Situating the Roadmap: The six messages in the roadmap have been carefully considered to effect a step change in attitudes over the next five years. It starts with a focus on the costs of digital curation, but the end point and the goal is to bring about a change in the way that all organisations think about and sustainably manage their digital assets.

D5.1 - Draft Roadmap (PDF - 2.5 MB)