Friday, February 23, 2007

Weekly readings - 23 February 2007

Google Study Examines Effects on PC Hard Drives. Mark Hachman. ExtremeTech. February 20, 2007.

Disk drives are generally reliable, but a Google study shows that current methods of predicting hard drive failure are almost ineffective, although basic disk checks can show whether a drive is about to fail. The study looked at over 100,000 drives from different manufacturers over a five-year period. There was no clear pattern showing that higher temperatures or higher utilization and activity levels increased the failure rate; drives running at low temperatures and at very high temperatures had more failures. Scan errors and reallocation counts were better indications that a drive would fail. A drive with even one scan error has a significantly higher rate of failure within 60 days. Dealing with drive failure matters, because over 90% of all new information is stored on hard drives and other magnetic media.
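As a rough illustration of the kind of basic disk check described above, here is a minimal Python sketch that flags drives reporting any scan errors or reallocations. The `DriveReport` fields, the sample serial numbers, and the "any non-zero count" rule are assumptions made for this example; they are not taken from the study, which reports only that such signals correlate with later failure.

```python
from dataclasses import dataclass


@dataclass
class DriveReport:
    """Hypothetical per-drive health summary (field names are assumptions)."""
    serial: str
    scan_errors: int
    reallocated_sectors: int


def at_risk(report: DriveReport) -> bool:
    # Assumed rule: treat any non-zero counter as a warning sign.
    return report.scan_errors > 0 or report.reallocated_sectors > 0


# Made-up sample data, for illustration only.
reports = [
    DriveReport("A1", scan_errors=0, reallocated_sectors=0),
    DriveReport("B2", scan_errors=1, reallocated_sectors=0),
    DriveReport("C3", scan_errors=0, reallocated_sectors=12),
]

flagged = [r.serial for r in reports if at_risk(r)]
print(flagged)
```

A flag like this is only a prompt to back up or replace a drive sooner; a clean report does not mean a drive is safe.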

MP3's Loss, Open Source's Gain. Eliot Van Buskirk. Wired. February 23, 2007.

Alcatel-Lucent was awarded $1.52 billion by a federal jury in an MP3 patent infringement suit against Microsoft, even though Microsoft had licensed the technology from Fraunhofer/Thomson, the industry-recognized licensor of MP3. The verdict will be appealed, but if it is upheld, it could start an all-out licensing and lawsuit campaign that might extend to every company involved with MP3 encoding or playback. This uncertainty could move the industry away from MP3, which could benefit open source software or other formats. Some of these are:

  • Ogg Vorbis (open source with better sound quality, but royalty questions)
  • AAC (based on MPEG-4, it has greater fidelity at higher compression rates)
  • Windows Media format

But the patent questions could extend to other software as well, which could influence the use of open source software. [See James Hilton’s speech at OR2007]

Why organizations need to archive email. White paper. GFI Website. February 22, 2007.

Emails have become the electronic substitutes for legal business documentation, and the correspondence constitutes a record which has a retention period. A ‘true’ email archiving system will automatically extract and index the content of messages and their attachments, and store each email in a read-only format so it cannot be changed. Archiving also decreases the amount of online email storage needed. Backups and archives are not the same: backups guard against system failure, while archives protect the data so it can be accessed when needed, and they can restrict access to authorized users. Email archives are important for compliance issues, litigation support, and storage / knowledge management. An email archive should have:

  • Minimal user intervention and automatic processing
  • Ability to index, search, and retrieve records and attachments
  • Data retention selection and control by policies
  • Security and authenticity which must include ability to restrict access
  • End-user and management access to archives
  • Support for multiple messaging platforms
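A few of these requirements (automatic indexing, search, and an authenticity check) can be sketched in a few lines of Python using only the standard library. This is an illustrative toy, not how any particular archiving product works: the `EmailArchive` class, its method names, and the sample message are all hypothetical, and retention policies, access control, and attachment extraction are left out.

```python
import hashlib
from email import message_from_string


def body_text(msg):
    """Collect the text parts of a parsed message (non-text parts are skipped)."""
    parts = []
    for part in msg.walk():
        payload = part.get_payload()
        if part.get_content_maintype() == "text" and isinstance(payload, str):
            parts.append(payload)
    return " ".join(parts)


class EmailArchive:
    """Hypothetical in-memory archive: store once, index on the way in."""

    def __init__(self):
        self._store = {}  # SHA-256 digest -> raw message, never overwritten
        self._index = {}  # lowercase word -> set of digests

    def add(self, raw):
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if digest not in self._store:
            self._store[digest] = raw
            msg = message_from_string(raw)
            text = msg.get("Subject", "") + " " + body_text(msg)
            for word in text.lower().split():
                self._index.setdefault(word, set()).add(digest)
        return digest

    def search(self, word):
        return sorted(self._index.get(word.lower(), set()))

    def verify(self, digest):
        """Authenticity check: the stored text must still match its digest."""
        stored = self._store[digest].encode("utf-8")
        return hashlib.sha256(stored).hexdigest() == digest


archive = EmailArchive()
raw_message = (
    "From: sender@example.com\n"
    "Subject: Quarterly invoice\n"
    "\n"
    "Please find the invoice attached.\n"
)
message_id = archive.add(raw_message)
hits = archive.search("invoice")
print(hits == [message_id], archive.verify(message_id))
```

Keying each message by its content hash means a stored email cannot be altered without the change being detectable, which is the property the read-only requirement is after.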

Medieval Stained Glass in Great Britain. AHDS Website. February 22, 2007.

A major digitization project has added over 18,000 images of stained glass windows in Britain to the Arts and Humanities Data Service (AHDS) website. The site now contains about 80,000 digital images. AHDS has a number of projects that involve digital preservation. One is POSSE (Preserve Our Student Shows for Eternity), which aims to provide long-term access to student degree shows. Other available resources include the Guides to Good Practice, which provide recommendations for creating and managing digital resources. AHDS has also conducted a Digital Images Archiving Study with JISC. Its deposit forms and guidelines are online, along with other preservation resources.

Friday, February 16, 2007

Weekly readings - 16 February 2007

Digital Preservation in a National Context: Questions and Views of an Outsider. H.M. Gladney. D-Lib Magazine. January/February 2007.

“A solution is known in principle for every difficult technical problem of digital preservation.” Non-technical preservation challenges are greater than the technical challenges. Preservation is only a small part of “archiving”. Some information may disappear, including some that was supposed to be permanent. “Curators need to learn to live with not knowing for sure that they have succeeded.” A preservation solution would incorporate methods for:

  • Ensuring that each saved bit-string survives as long as somebody might want it;
  • Ensuring that readers can find and use any preserved object as its producers intended;
  • Providing evidence with which readers can judge information authenticity;
  • Integrating preservation support seamlessly with current information services; and
  • Hiding technical complexity from end users.

There needs to be a "trustworthy digital object" with metadata about the object and its relationship to other objects. Union catalogs of preserved objects could

Texas, Minnesota eye move to ODF. Elizabeth Montalbano. Computerworld. February 07, 2007.

Legislative action is being considered to mandate that government documents use an open, interoperable, XML-based file format. The Minnesota bill also specifies that the format must be “fully published and available royalty-free; implemented by multiple vendors; and controlled by an open industry organization”. This would take effect in 2008. The Open Document Format is the intended standard. The Microsoft Open XML format, if approved by ISO, may also fit the description. Microsoft has also created software called the ODF Translator, which translates between Open XML and ODF. Another take at: Latest OOX-ODF FUD-Spat: States Prepare to Ban Zip and PDF Files.

U.S. House Votes to Rescind NDIIPP Funding; Bill Now Under Consideration by Senate. Peter Murray. The Disruptive Library Technology Jester. February 11, 2007.

House of Representatives Resolution 20 rescinds a number of items for the Library of Congress, especially “the unobligated balances available for the National Digital Information Infrastructure and Preservation Program, $47,000,000.” This will remove the funding for this program for the rest of the year.

Arts and Humanities Get Small Increases. Lauren Smith. The Chronicle of Higher Education. February 16, 2007.

In the proposed 2008 fiscal year budget, the Electronic Records Archives and the Institute of Museum and Library Services would each get a 3% increase, but the National Archives and Records Administration would receive a 7% decrease.

64 DVDs on a disc: holographic storage to ship. Lucas Mearian. Computerworld. February 12, 2007.

InPhase Technologies will begin shipping the industry's first holographic disc drive in July. The holographic disc will hold 300GB of uncompressed data and 300GB of error correction and data redundancy. It is a write-once disc intended for the archival market and has a 50-year expected lifespan. To a server, the disc will look like a drive with drag-and-drop capability. The holographic drive will cost $18,000 and the discs will cost $180 each. The company expects to have a rewritable disc in 2008, and a 1.6TB disc by 2010. It also plans to have a holographic jukebox in 2008 with a capacity of 675TB.
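The "64 DVDs" in the headline can be checked with some quick arithmetic. The short Python sketch below uses the capacity and price figures from the summary above; the 4.7GB figure for a single-layer DVD is an assumption not stated in the article, and the variable names are illustrative.

```python
# Figures from the article summary.
DISC_CAPACITY_GB = 300
DISC_PRICE_USD = 180

# Assumption: a standard single-layer DVD holds 4.7GB.
DVD_CAPACITY_GB = 4.7

dvds_per_disc = DISC_CAPACITY_GB / DVD_CAPACITY_GB
media_cost_per_gb = DISC_PRICE_USD / DISC_CAPACITY_GB

print(f"About {dvds_per_disc:.1f} DVDs per holographic disc")
print(f"Media cost: ${media_cost_per_gb:.2f} per GB (excluding the drive)")
```

The per-gigabyte figure covers the media only; the $18,000 drive would have to be spread over however many discs an archive actually writes.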

Friday, February 09, 2007

Weekly readings - 9 February 2007

Digital Curation for Science, Digital Libraries, and Individuals. Neil Beagrie. The International Journal of Digital Curation. Autumn 2006.

Digital Curation is becoming a more common term. It refers to the actions to maintain digital materials for their entire life-cycle. The term, along with ‘digital preservation’ and ‘digital archiving’, is still evolving. Terminology has different meanings to different people. The terms all mean that we are developing a different approach to creating and managing digital materials. One comment defines them this way: “these are terms of increasing specificity in this context: preservation is an aspect of archiving, and archiving is an activity needed for curation. All three are concerned with managing change over time.”

Another definition is: “Digital curation, broadly interpreted, is about maintaining and adding value to, a trusted body of digital information for current and future use.” While not all digital information has long-term value, a significant part of it will, though the proportion will vary by area. Therefore, “curation and long-term preservation of digital resources could be of increasing importance for a wide range of activities.” Digital curation has implications for many different areas. The data must be continuously updated. “Significant effort needs to be put into developing persistent information infrastructures for digital materials”, and researchers and information professionals need to develop the necessary curation skills. Without this, the digital information will only have short-term benefits.

A Vision for FEDORA’s Future, an Implementation Plan to Get There, and a Project Update. Peter Murray. Disruptive Library Technology Jester website. 24 Jan 2007.

This is a review of an update on Fedora given at the Open Repositories 2007 conference. Many different kinds of projects will be using Fedora for:

  • repository services: managing, accessing, versioning, and storing digital materials
  • preservation services: integrity checking, monitoring, alerting, migrating, and replicating
  • process management services: workflow and messaging applications
  • collaboration services: annotating, discussing, and rating digital objects
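To make the "integrity checking" item concrete, here is a minimal Python sketch of a fixity audit: record a checksum for each object when it is stored, then recompute and compare later. This is a generic illustration and not Fedora's API; the `checksum` and `audit` functions, the object identifiers, and the in-memory dictionaries are all assumptions made for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """SHA-256 digest of an object's bytes."""
    return hashlib.sha256(data).hexdigest()


def audit(objects: dict, manifest: dict) -> list:
    """Return ids of objects that are missing or no longer match the manifest."""
    return sorted(
        oid
        for oid, expected in manifest.items()
        if oid not in objects or checksum(objects[oid]) != expected
    )


# Hypothetical repository contents and the manifest recorded at ingest.
objects = {"obj:1": b"first object", "obj:2": b"second object"}
manifest = {oid: checksum(data) for oid, data in objects.items()}

# Simulate silent corruption of one object, then run the audit.
objects["obj:2"] = b"sec0nd object"
damaged = audit(objects, manifest)
print(damaged)
```

In a real service the audit would run on a schedule, and a non-empty result would feed the alerting and replication steps listed above.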

The Fedora project is evolving into an organization called Fedora Commons. It will be a non-profit organization, to allow users to better collaborate on projects and use the information better. This will be a multi-year effort to build the organization to be more responsive to user needs and create a more robust product.

Open Source for Open Repositories — New Models for Software Development and Sustainability. Peter Murray. Disruptive Library Technology Jester website. 24 Jan 2007.

This is a summary of an excellent presentation at the Open Repositories 2007 conference by James Hilton. Organizations may be more willing to turn to open source software in a systematic way because of:

  • Fear. Business decisions by vendors lessen the comfort of buying a software application.
  • Disillusionment. Software seems to bring an endless upgrade cycle and the institutions still need to build in the support structure.
  • Incredulity. Software is disruptive, expensive, and may not lead where they need.
  • Increasing collaboration. In the ‘new order’ the new competitive advantage will be picking the right collaborative partners.

There are different meanings to ‘open’; it does not always mean ‘free’, and this needs to be reviewed carefully to determine the consequences. The benefits of open source may be that you can control your own destiny, that it builds community support, that it separates ownership from support, and that it leverages the links between the institution and others. The challenges may be that “clean” code is impossible to guarantee, licenses and patents may be difficult to manage, and lawsuits may happen. Open source is more of a commitment to build. Licensing is a contract and must be maintained and understood. Communities don’t just happen; they require shared purpose, governance, discipline, and cooperation. If institutions are going to use open source, they must make commitments to it.

Symantec software captures IM traffic. Lucas Mearian. Computerworld. January 31, 2007.

Symantec Corp. announced Veritas Backup Reporter 6.0, an enterprise backup reporting tool that gives IT administrators a single corporate view of backup and recovery operations and makes them better able to perform capacity planning. Enterprise Vault 7.0 allows IT managers to archive and classify e-mail, instant messaging and other content automatically, by user classification, or through integration with records management systems.

Friday, February 02, 2007

Weekly readings - 2 February 2007

Adobe to Release PDF for Industry Standardization. Press release. Jan. 29, 2007.

Adobe announced that it will release the full PDF 1.7 specification to AIIM to be published by the International Organization for Standardization (ISO). The release is intended to help PDF become an ISO standard. Adobe said that this is “reinforcing our commitment to openness.”

The Saga Of the Lost Space Tapes. Marc Kaufman. Washington Post. January 31, 2007.

Millions of people saw the video of the moon landing in 1969. What most don’t know is that the “camera had actually sent back video far crisper and more dramatic”, which only a few people have seen. The high-quality tapes, which were in a highly specialized format, were stored and forgotten. NASA has now started looking for those tapes, but after an official search through archives, record centers and storage rooms, it has acknowledged that the videos are lost. Everyone assumed that NASA would archive the tapes. "Maybe somebody didn't have the wisdom to realize that the original tapes might be valuable sometime in the future. Certainly, we can look back now and wonder why we didn't have better foresight about this."

Seagate drive has gigabytes of wireless, pocket storage. Ben Ames. Computerworld. January 30, 2007.

Seagate unveiled a wireless 10GB to 20GB storage device intended to fit in users' pockets and allow them to store and share digital files between mobile phones, PCs and other mobile platforms. The device, called Digital Audio Video Experience (DAVE), has a 1-in. hard drive and can use Wi-Fi networking to share files with another device within 30 feet. It can be used to deliver video files without latency or coverage problems, since the files can be downloaded to the hardware at leisure instead of being streamed live through mobile networks.

Opinion: Ultrasimple image backup. Steve Bass. Computerworld. January 31, 2007.

The Polaroid Media Backup Photo Edition is a 40GB external drive that can easily back up over 60 different image file types. It was designed for simplicity; once it is connected to a USB port on a computer, the device is prompted to find any images and start backing them up. It is plug and play; there is no software to install and there is no on-off switch. The 2.5-inch 40GB hard drive can hold up to about 40,000 regular-sized photos and can be used with Internet services for sharing and printing. The cost will be about $129.