Friday, March 27, 2009

Digital Preservation Matters - 27 March 2009

Farewell to the Printed Monograph. Scott Jaschik. Inside Higher Ed. March 23, 2009.

The University of Michigan Press announced it will shift its scholarly publishing from a traditional print operation to primarily digital. They expect most of their monographs to be released only in digital editions. Readers will still be able to use print-on-demand systems, but the press will consider the digital monograph the norm. They say it's time to stop trying to make the old economics of scholarly publishing work. The press expects to publish more books, and to distribute them electronically to a much broader audience. "We will certainly be able to publish books that would not have survived economic tests and we'll be able to give all of our books much broader distribution." Michigan plans to develop site licenses so that libraries could gain access to all of the press's books over the course of a year for a flat rate.

Other presses are also experimenting with the digital format. Pennsylvania State University Press publishes a few books a year in digital, open access format. All chapters are provided in PDF format, half in a format to download and print, and half in read only. Readers may pay for print-on-demand versions.

PREMIS Data Dictionary for Preservation Metadata. Sarah Higgins. DCC Watch Report. 25 March 2009.

This is a three-page overview of the PREMIS data dictionary, “the current authoritative metadata standard for digital preservation,” and a brief look at its use in an institutional repository.

Thomson Introduces mp3HD File Format. Press Release. March 19, 2009.

The company has introduced the new mp3HD format, which “allows mathematically lossless compression of audio material while preserving backward compatibility to the mp3 standard.” The mp3HD files carry additional information that, when combined with the mp3 portion of the file, can be played on an mp3HD-capable player. Standard mp3 players play only the mp3 portion of the file. A program can create mp3HD files from stereo material in 16-bit, 44.1 kHz WAV files. It is available for Linux and Windows.

Internet Archive to unveil massive Wayback Machine data center. Lucas Mearian. Computerworld. March 19, 2009.

The Internet Archive has a new computer that fits in a 20-foot-long outdoor metal cargo container filled with 63 server clusters with 4.5 petabytes (4.5 million gigabytes) of storage and 1TB of memory. They have 151 billion archived web pages in addition to software, books and a moving image collection with 150,000 items and 200,000 audio clips. The Internet Archive also works with curators in about 100 libraries to help guide the Internet crawls.

Tuesday, March 24, 2009

Challenges facing Church history

R. Scott Lloyd. Church News. March 14, 2009.
Lecture presented at the Church History Symposium at BYU on "Preserving the History of the Latter-day Saints."
Mark L. Grover, a subject librarian at BYU who has spent 30 years gathering the history of the Church in Latin America, lamented that original records and documents are often in jeopardy of being destroyed by those who don't understand or appreciate their significance. There are several approaches being taken. "Some significant historical material surely has vanished, but much of it is still intact in private possession, and there is an increasingly greater probability that digital technology will improve the preservation odds."

Friday, March 20, 2009

Digital Preservation Matters - 20 March 2009

International Data curation Education Action (IDEA) Working Group: A Report from the Second Workshop of the IDEA. Carolyn Hank, Joy Davidson. D-Lib Magazine. March 2009.
This is a report of the workshops held in December, with links to programs and resources. In general the article acknowledges that curation of digital assets is a central challenge and opportunity for libraries and other data organizations. In order to meet this challenge, skilled professionals are needed who are trained “to perform, manage, and respond to a range of procedures, processes and challenges across the life-cycle of digital objects.” The presentations discuss developing a graduate-level curriculum to prepare master's students to work in the field of digital curation. Curricula at the institutions include preparing faculty to research and teach in the field, data collection and management, knowledge representation, digital preservation and archiving, data standards, and policy. Collaboration between schools is important since they all recognize that no school can do it all. One item of particular interest: The skills, role and career structure of data scientists and curators: An assessment of current practice and future needs.

Report on the 2nd Ibero-American Conference on Electronic Publishing in the Context of Scholarly Communication (CIPECC 2008). Ana Alice Baptista. D-Lib Magazine. March 2009.
Some notes from this article:
  • IR (institutional repository) initiatives occur mostly in public universities
  • the main motivation for implementing an IR: answer specific demands and needs to digitally store the institution's scientific memory, rather than support for Open Access principles.
  • 40% of the analyzed IRs are maintained and coordinated by two or more sectors within each university
  • the databases with more than 3,000 documents are, in practice, OPACs with links to the full text versions.
  • the next step forward: provide new metrics on the impact factor (Scientometrics)

Items in this newsletter include:
  • CBS program on “Bye, Tech: Dealing with Data Rot.” Looks at obsolescence of computer hardware, software, and formats. “So the basic lesson is: Look after your own data and make sure that you take steps to keep it moving onto new formats about once every ten years.” There are links where you can both read and watch the program. Their conclusions:
1. You should convert whatever you can afford to digital.
2. Store your tapes and films in a cool, dry place.
3. And above all, remain vigilant. As you now know, every ten years or so, you're going to have to transfer all your important memories to whatever format is current at the time, because there never has been, and there never will be, a recording format that lasts forever.
  • Federal Agencies Collaborate on Digitization Guidelines. A working group is developing best practices for digitizing recorded sound and moving images.

Got Data? A Guide to Data Preservation in the Information Age. (Updated link-August 2015.)  Francine Berman. Communications of the ACM. December 2008.
Digital data is fragile, even though we all assume it will be there when we want it. “The management, organization, access, and preservation of digital data is arguably a "grand challenge" of the information age.” This article looks at the key trends and issues with preservation:
  1. More digital data is being created than there is storage to host it.
  2. Increasingly more policies and regulations require the access, stewardship, and/or preservation of digital data.
  3. Storage costs for digital data are decreasing (but other areas are increasing).
  4. Increasing commercialization of digital data storage and services.
These four trends point to the need to take a comprehensive and coordinated approach to data cyberinfrastructure. The greatest challenge in this is to develop an economically sustainable model. One approach is to use a data pyramid to frame the stewardship options. This shows that multiple solutions for sustainable digital preservation must be devised. There is also a need for ongoing research into and development of solutions that address these technical challenges as well as the economic and social aspects of digital preservation. The article adds 10 guidelines:
Top 10 Guidelines for Data Stewardship
1. Make a plan.
2. Be aware of data costs and include them in your overall IT budget.
3. Associate metadata with your data.
4. Make multiple copies of valuable data. Store some off-site and in different systems.
5. Plan for the transition and cost of digital data to new storage media ahead of time.
6. Plan for transitions in data stewardship.
7. Determine the level of "trust" required when choosing how to archive data.
8. Tailor plans for preservation and access to the expected use.
9. Pay attention to security and the integrity of your data.
10. Know the regulations.
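
Guidelines 4 and 9 (multiple copies, and attention to integrity) can be illustrated with a small sketch. This is only one possible approach, not something taken from the article: it assumes SHA-256 checksums are an acceptable fixity check, and the function names (`sha256_of`, `replicate`, `verify`, `demo`) and the "onsite"/"offsite" directory names are hypothetical, chosen for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(source: Path, targets: list[Path]) -> str:
    """Copy source into each target directory and return the source checksum."""
    checksum = sha256_of(source)
    for target_dir in targets:
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target_dir / source.name)
    return checksum


def verify(copies: list[Path], expected: str) -> dict[Path, bool]:
    """Map each copy to True if it exists and still matches the recorded checksum."""
    return {copy: copy.exists() and sha256_of(copy) == expected for copy in copies}


def demo() -> list[bool]:
    """Replicate a file to two hypothetical stores, corrupt one copy, then verify."""
    with tempfile.TemporaryDirectory() as root:
        base = Path(root)
        original = base / "report.txt"
        original.write_text("valuable data")
        stores = [base / "onsite", base / "offsite"]  # stand-ins for separate systems
        checksum = replicate(original, stores)
        copies = [store / original.name for store in stores]
        copies[1].write_text("valuable dat4")  # simulate silent corruption
        return list(verify(copies, checksum).values())
```

Calling `demo()` reports the first copy as intact and the second as damaged. In practice the checksum would be recorded alongside the data (guideline 3, metadata) and the verification repeated on a schedule, so that a damaged copy can be replaced from a good one.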

The Library of Congress has been moving into the digital world, and one way is by a scanning project with the Internet Archive that has put 25,000 books online to date. "To preserve book knowledge and book culture means preserving every word of every sentence in the right sequence of pages in the right edition, within the appropriate historical, scholarly and bibliographical context. You must respect what you scan and treat it as an organic whole, not just raw bits of slapdash data." A lot of items that have not literally seen the light of day are being downloaded. The cost is just 10 cents a page.

Friday, March 13, 2009

Digital Preservation Matters - 13 March 2009

Update from the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. Sayeed Choudhury. Preservation and Archiving Special Interest Group (PASIG). November 19, 2008. [11p. pdf]

This is a presentation about the task force. The next item is the report from the task force.

There is a focus in the task force on the economic dimensions of digital preservation. Some notes from the update: “Definition of Economic Sustainability: The set of business, social, technological, and policy mechanisms that encourage the gathering of important information assets into digital preservation systems, and support the indefinite persistence of digital preservation systems, enabling access to and use of the information assets into the long-term future.”

Economically sustainable digital preservation requires:

  • Recognition of the benefits of preservation
  • A process for selecting long term digital materials
  • Ongoing, efficient allocation of resources to digital preservation activities
  • Organization and governance of digital preservation activities

The task force website:

Blue Ribbon Task Force Interim Report: Sustaining the Digital Investment: Issues and Challenges of Economically Sustainable Digital Preservation. December 2008. [78p. pdf]

Digital information is fundamental to modern society but there is no agreement on who is responsible or who should pay for access to and preservation of the information. Creating sustainable economic models for digital access and preservation is a focus of this group. They are examining current and best practices in order to find or create useful models. This is an urgent task. Access to data in the future requires actions today. “Institutional, enterprise, and community decision makers must be part of the access and preservation solution.” Preserving data now is an investment in the future. Digital information has value far into the future. Sometimes we make best guesses or a hedge against the future. Decisions not made now often cost far more in the future. Without ongoing maintenance digital assets will fall into disrepair. Maintaining the assets is a problem with many sides: technical, legal, financial, and policy. This crosses all industries. There is also an opportunity cost.

Preservation is not a one-time cost; it is an ongoing commitment to a series of costs, requiring ongoing and sustaining resource allocations. Economic sustainability requires:

  • Recognition of the benefits of preservation on the part of key decision-makers;
  • Incentives for decision-makers to act in the public interest;
  • A process for selecting digital materials for long-term retention;
  • Mechanisms to secure an ongoing, efficient allocation of resources;
  • Appropriate organization and governance of digital preservation activities.

Decision-makers need to be aware of the value-creating opportunities from preservation. Understanding the scope of digital preservation is important. The report examines the various economic models now being used and also shows a graph of Types of Information Retained the Longest. It is difficult to separate digital preservation costs from other costs. There is no substitute for a flexible, committed organization dedicated to preserving a collection of digital material. The Final Report of the task force is to be published at the end of 2009.

“Too often, digital preservation is perceived as an activity that is separable from the interests of today’s stakeholders, aimed instead at the needs of future generations. But in practice, digital preservation is very much part of the day-to-day process of managing digital assets in responsible ways; it is much more about ensuring that valuable digital assets can be handed off in good condition to the next succession of managers or stewards five, ten, or fifteen years down the road than it is about taking actions to benefit generations of users a hundred years hence.”

Fusion-io unveils SSD drives with 1.5GB throughput, 1.2TB capacity. Lucas Mearian. Computerworld. March 13, 2009.

Fusion-io, a Salt Lake City-based company, has announced a server-based solid-state drive with 1.5GB/sec. throughput. “Currently, the cards come in 160GB, 320GB and 640GB capacities. A 1.28TB card is expected out in the second half of this year.”

Preservation as a Process of a Repository. Tarrant, D. and Hitchcock, S. Sun Preservation and Archiving Special Interest Group. 18 - 21 November 2008. [pdf, ppt, pptx]

The presentation begins with different definitions of repository and Institutional Repository. Lynch defines an IR as a set of services and processes, and a commitment to the digital materials created by an organization and its members. The presentation includes diagrams of processes and of the OAIS, DCC and other models, an analysis of the preservation process, and a discussion of EPrints, digital preservation and repositories.

Friday, March 06, 2009

Digital Preservation Matters - 6 March 2009

DPE Digital preservation video training course. Digital Preservation Europe. February 2009.

The Digital Preservation Europe group has posted its Digital Preservation Video Training Course on the internet. These videos, from October 2008, cover topics such as:

  • Introduction to Digital Preservation
  • OAIS Model and Representation Information
  • Preservation Analysis Workflow and Preservation Descriptive Information
  • Digital Preservation Preparation and Requirements
  • File Formats, Significant Properties
  • Metadata
  • Planning, infrastructure
  • Trusted Repositories

The workshop was intended to give the participants an understanding of digital preservation and its issues and challenges, as well as an understanding of the roles, models and file elements.

Can We Outsource the Preservation of Digital Bits? Peter Murray. DLTJ Blog. March 5, 2009.

With the increasing need for large-scale digital preservation storage, the Iron Mountain storage facility may be considered. It is cloud based, and some of the preservation files do not need to be on the expensive SAN storage. A diagram of the architecture is included.

digiGO! — Video Content Support From Front Porch Digital To AFN. Satnews Daily. March 02, 2009.

Front Porch Digital, which recently acquired SAMMA Systems, will install the DIVArchive product for the American Forces Network (AFN) Broadcast Center. The product line includes a semi-automated system for the migration and preservation of videotape to digital files.

iPRES 2008: Proceedings of The Fifth International Conference on Preservation of Digital Objects. British Library. March 2, 2009. [pdf]

This is the compilation of the iPRES 2008 proceedings, all 319 pdf pages, which looks at tools and methods for digital preservation. This is the first full collection of the conference papers, in addition to the conference presentations. Some of these have been reviewed earlier; some will be included later.

Samsung stuffs 1.5TB onto three-platter hard drive. Lucas Mearian. Computerworld. March 5, 2009.

Samsung Electronics announced its first 500GB-per-platter hard drive. The hard disk has 1.5TB on three platters. With fewer platters and fewer moving parts, the drive should be more reliable.

Western Digital announced its 500GB per platter, 2TB capacity drive in January. The drive is 40% lower in power consumption in idle mode and 45% lower in reading/writing mode, and has a retail price of $149.

Preservation and Archiving Special Interest Group (PASIG) Fall Meeting. Paul Walk. Ariadne. January 2009.

This is a report on the Sun-PASIG meeting of November 2008. There were quite a number of presentations. Some items of interest:

Martha Anderson emphasized the need to preserve 'practice' as well as data, so that even if the technology we use changes, our decisions and thinking behind the processes are preserved. Chris Wood talked about storage: there will be an increase in the use of solid-state storage, but tape and disk will remain viable for some time to come. Tape is a viable storage medium and is still relatively cheap. Blu-Ray optical storage is also a good bet for the medium term. In the next few years, the cost of buying equipment will be less than the cost of running the equipment.