Friday, March 30, 2007

Weekly readings - 30 March 2007

Testimony to Congress. James H. Billington. Library of Congress. March 20, 2007.

Statement given by the Librarian of Congress concerning the Library of the 21st Century. It now takes 15 minutes to produce the same amount of information that it took LC over 200 years to acquire. Most of it exists only in digital form. “There is a widely-held but false assumption that digital materials accessible today … will necessarily be available in the future.” Also, “information not actively preserved today could literally be gone tomorrow.” Recent important digital materials, such as those on the internet, have not been preserved and have vanished. These are the primary sources of our time. A key challenge is to “capture, collect, preserve, and provide access to important ‘born-digital’ material and Web-based information.” LC manages about 295 TB of digital information. “The Library's basic mission of acquiring, preserving and making accessible the world's knowledge and the nation's creativity is not changing.” We can’t save everything, so we need to identify and select what is critical to the collection. “We are not just creating endless digital data files; we are giving our collections context and making them increasingly accessible to the world.” As we add to our collections, we need an infrastructure that will make the content available in the future. A new asset is LC’s National Audiovisual Conservation Center, which will preserve and make accessible the audiovisual collections.

Killing risk, unifying data protection. Jim Damoulakis. Computerworld. February 27, 2007.

It is important to review how we currently protect data. Common techniques include nightly backup, snapshots, mirroring, database dumps, host-based replication, and storage array-based replication. One way to create a unified strategy is to start from the risks that exist: physical device failure, data loss through deletion or corruption, and disasters. Data loss can occur undetected, and there needs to be a way to protect against this.
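
One common way to catch loss that would otherwise go undetected is to record a checksum for each file and recheck it periodically. The article does not prescribe this approach; the sketch below is only a minimal illustration, and the function names and the manifest layout (a mapping of relative path to SHA-256 digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Record a checksum for every file under root (hypothetical layout)."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root: Path, manifest: dict[str, str]) -> dict[str, str]:
    """Compare files on disk to the manifest; report 'missing' or 'changed'."""
    problems = {}
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems[name] = "missing"
        elif sha256_of(target) != expected:
            problems[name] = "changed"
    return problems
```

The manifest would be built once when the files are known to be good and stored separately from the data. Running `find_damage` later against a backup or a live copy shows which files have been deleted or silently altered since then.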

Perspectives on Trustworthy Information. H.M. Gladney. Digital Document Quarterly. March 2007.

Digital preservation activities are shifting from solving basic problems to implementing solutions and repository procedures. Selection is a challenge in building a long-term digital collection, but it needs to be balanced against practicalities. Archival objects need to include honest and adequate provenance information that is bound to the object. “Preserving an information collection is a different challenge than managing archives.” The need to preserve digital information, which is the base of most scientific research, is self-evident. Snapshots and logs may be sufficient for preserving databases.

JHU/UVA Medieval Manuscript Digitization Workshop. Timothy Stinson. Blog. March 28th, 2007.

This quote is from the blog report of the digitization workshop: “Staples has a great way of thinking about preservation - he pointed out that preservation isn’t simply a technological solution, an archive, e.g., where we can stick things and have them safe forever. Rather preservation is the result of usage, maintenance, and institutional commitment. Those things that are used the most, he argued, are the same ones that are migrated the most frequently, and are the least likely to become invisible and forgotten or to cease to be a priority to individuals and institutions. We need not only technical solutions, but also wide access and modeling of data in such a way that it is frequently used, migrated, and repurposed.”

Calif. CIO Steers Clear of Ideology on File Formats. Carol Sliwa. Computerworld. March 19, 2007.

The question of open formats is not an ideological struggle between competing visions of the future. It is a straight business decision: look at the costs of one approach over another and decide whether it meets the business needs. They don’t have a preference between ODF and the Office Open XML file format, but they are moving toward interoperability and more open approaches, and away from being locked in to proprietary systems. Open, XML-based formats provide flexibility.

