Friday, March 30, 2007

Weekly readings - 30 March 2007

Testimony to Congress. James H. Billington. Library of Congress. March 20, 2007.

Statement given by the Librarian of Congress concerning the Library of the 21st Century. It now takes 15 minutes to produce the same amount of information that it took LC over 200 years to acquire. Most of it exists only in digital form. “There is a widely-held but false assumption that digital materials accessible today … will necessarily be available in the future.” Also, “information not actively preserved today could literally be gone tomorrow.” Recent important digital materials, such as those on the internet, have not been preserved and have vanished. These are the primary sources of our time. A key challenge is to “capture, collect, preserve, and provide access to important ‘born-digital’ material and Web-based information.” LC manages about 295 TB of digital information. “The Library's basic mission of acquiring, preserving and making accessible the world's knowledge and the nation's creativity is not changing.” We can’t save everything, so we need to identify and select what is critical to the collection. “We are not just creating endless digital data files; we are giving our collections context and making them increasingly accessible to the world.” As we add to our collections we need an infrastructure that will make the content available in the future. A new asset is LC’s National Audiovisual Conservation Center, which will preserve and make accessible the audiovisual collections.

Killing risk, unifying data protection. Jim Damoulakis. Computerworld. February 27, 2007.

It is important to look at what we are doing with data protection. Some of the techniques include nightly backup, snapshot, mirroring, database dumps, host-based replication, and storage array-based replication. One way to create a unified strategy is to look at the risks that exist. They include physical device failure, data loss through deletion or corruption, and disasters. Data loss can occur undetected, and there needs to be a way to protect against this.
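One common way to catch the undetected loss mentioned above is a periodic fixity check: record a checksum for every file, then recompute and compare later. The sketch below is only an illustration and is not taken from the article; the function names (`sha256_of`, `build_manifest`, `find_changes`) and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changes(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the files under root with a previously saved manifest."""
    current = build_manifest(root)
    return {
        # Files recorded earlier that no longer exist (deletion).
        "missing": sorted(set(manifest) - set(current)),
        # Files whose contents no longer match the recorded checksum (corruption or edits).
        "altered": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Files that appeared since the manifest was built.
        "unexpected": sorted(set(current) - set(manifest)),
    }
```

The manifest would be saved somewhere separate from the data it describes, and `find_changes` run on a schedule. A check like this only detects loss; restoring the file still depends on one of the backup or replication techniques listed above.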

Perspectives on Trustworthy Information. H.M. Gladney. Digital Document Quarterly. March 2007.

Digital preservation activities are shifting from solving basic problems to implementing solutions and repository procedures. Selection is a challenge of building a long-term digital collection, but it needs to be balanced by practicalities. Archival objects need to include honest and adequate provenance information that is bound to the object. “Preserving an information collection is a different challenge than managing archives.” The need to preserve digital information, which is the base of most scientific research, is self-evident. Snapshots and logs may be sufficient for preserving databases.

JHU/UVA Medieval Manuscript Digitization Workshop. Timothy Stinson. Blog. March 28th, 2007.

This quote is from the blog report of the digitization workshop: “Staples has a great way of thinking about preservation - he pointed out that preservation isn’t simply a technological solution, an archive, e.g., where we can stick things and have them safe forever. Rather preservation is the result of usage, maintenance, and institutional commitment. Those things that are used the most, he argued, are the same ones that are migrated the most frequently, and are the least likely to become invisible and forgotten or to cease to be a priority to individuals and institutions. We need not only technical solutions, but also wide access and modeling of data in such a way that it is frequently used, migrated, and repurposed.”

Calif. CIO Steers Clear of Ideology on File Formats. Carol Sliwa. Computerworld. March 19, 2007.

The question of open formats is not an ideological struggle between competing visions of the future. It is a straight business decision: look at the costs of one approach over another and decide whether it meets the business needs. The state has no preference between ODF and the Office Open XML file format, but it is moving toward interoperability and more open approaches, and away from being locked in to proprietary systems. Open, XML-based formats provide flexibility.

Friday, March 23, 2007

Weekly readings - 23 March 2007

Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. Robin Dale, et al. CRL. March 9, 2007.

TRAC is the revised and expanded version of the Audit Checklist originally developed by RLG-NARA. The 94-page report provides a very complete method for checking and certifying long-term repositories. It can also be used for planning and guiding the development of repositories. The document looks at three areas: Organizational Infrastructure; Digital Object Management; and Technologies, Technical Infrastructure, & Security. It provides a checklist of criteria for measuring the trustworthiness of repositories. Another link to the site.

e-Journals: Archiving and Preservation. Briefing Paper. JISC. March 2007.

The traditional model of publishers supplying content and libraries preserving content does not work well with digital materials. Licensing agreements do not guarantee permanent access to materials, yet the use of e-journals is increasing at a rapid rate, and many are searching for a solution. The terms ‘perpetual access’, ‘archiving’, and ‘long-term preservation’ are often used interchangeably. Perpetual access is usually used in e-journal license clauses to assure that access will be continued regardless of events. Archiving describes the management processes of e-journals. Long-term preservation refers to the processes to ensure the content remains accessible in the future, regardless of any technical or organizational changes. There need to be multiple options and strategies for preserving e-journals, including coordinated overlap. There are promising developments evolving, but the solutions must include libraries, publishers, and archiving services.

Iron Mountain launches active archiving for email. Computer Technology Review. March 20, 2007.

Iron Mountain has introduced an Active Archiving Service for email. This is a single solution which includes management, archiving, legal discovery, continuity and disaster recovery. Most legal discovery processes now include email. The new federal rules make an email archive critical. It is integrated with Outlook and allows users full access to emails and the ability to restore individual messages. The cost starts at $6 per user per month.

Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums. Mary W. Elings, Günter Waibel. First Monday. 5 March 2007.

The cultural heritage community has a large pool of digital resources for teaching, research and learning. A big challenge is integrating digital content from libraries, archives and museums, which use different strategies for caring for their materials. Applying data content standards by material type rather than by organization could make the data more usable within the entire community. Two schemas used are Visual Resources Association (VRA) Core and the Categories for the Description of Works of Art (CDWA). The article lists the elements of metadata standards, and the relationship between them and museums, libraries and archives. There is a call among archives to process collections more efficiently so they achieve control over all their holdings. The successful use of digital materials in libraries, museums, and archives revolves around the ability to describe similar materials in different institutions.

Microsoft Announces HD Photo, a New Imaging File Format With Advanced Features for Today’s Digital Photographers. Press release. March 8, 2007.

Microsoft announced a new file format that offers higher image quality, greater preservation of data, and advanced features. HD Photo offers both lossless and lossy image compression. When compressed it has twice the efficiency of JPEG, with fewer artifacts. It preserves the entire original image. They also released a plug-in for Photoshop. [See Photoshop gets HD Photo support.]

Dell to ship PCs with 1TB drives. Chris Mellor. Techworld. March 16, 2007.

Dell will ship computers with Hitachi 1 TB drives, targeting users who wish to store large amounts of data. The computer can handle up to 4 TB. The drives use perpendicular recording. The 1TB drive is priced at $540. Dell is also introducing a 'video time capsule service' where users can upload videos to a site where Dell will store them for a claimed 50 years.

Blu-ray Aims to Oust DVDs Within Three Years. eWeek.

A Digital Life. Gordon Bell, Jim Gemmell. Scientific American. February 18, 2007.

New systems may allow people to record everything they see and hear--and even things they cannot sense--and to store all these data in a personal digital archive. The MyLifeBits project has provided the tools to create a person’s lifelong digital archive. Technological advances may make this easier but there are challenges, particularly with privacy rights and restrictions. They believe digital memories will yield benefits in many areas.

Hammer Storage Pounds Out 'Disruptive' 1TB Appliance. Chris Preimesberger. eWeek. March 22, 2007.

Hammer Storage has introduced Myshare, a new plug and play storage device with 1TB for $499. It can be used on a network, and the content can be made available through a web application, including selective access to folders. The content can also be mirrored and secured, and the device supports multiple user and group permissions.

Wednesday, March 21, 2007

Scholarly Communication

Web 2.0 Presentation to BYU Library. Gideon Burton. Blog. March 13, 2007.

This was a presentation to the library on trends in Scholarly Communication, web 2.0, and other topics. This page includes his PowerPoint presentation and links to a video shown at the meeting, "The Web is Us/ing Us" by Michael Wesch.

Oops! Techie wipes out $38 billion fund

While doing routine maintenance work, the technician accidentally deleted applicant information for an oil-funded account — one of Alaska residents’ biggest perks — and mistakenly reformatted the backup drive, as well. There was still hope, until the department discovered its third line of defense, backup tapes, was unreadable. More than 300 cardboard boxes of paperwork have been scanned again.

Friday, March 16, 2007

Weekly readings - 16 March 2007

History, Digitized (and Abridged). Katie Hafner. New York Times. March 10, 2007

Archives and museums hold many important items that will probably not be digitized in the near future. This increases the possibility that they will be ignored as people increasingly expect that all information is on the internet. A major problem is the cost of digitizing materials. Many items will still exist only on paper, LPs, magnetic tape and film. Libraries tend to digitize the items that are unique to their collection. But once items are put on the internet, the number of people who use them increases dramatically. The LDS Church has initiated large scanning projects and hopes to have hundreds of millions of images online in the next five years. Others are digitizing collections, which allows much broader access to the materials, but copyright is an issue. There is very little room in copyright law for preservation. The amount of material available can be overwhelming.

Director's Message. Anne-Imelda M. Radice. News & Events. March 2007.

The Webwise Conference: Stewardship in the Digital Age: Managing Museum and Library Collections for Preservation and Use highlighted the huge shift underway in museums and libraries. In a short time they have gone from knowing almost nothing about preserving digital objects to now understanding that digitization is an important part of conservation and use. Besides preserving the physical objects, institutions realize they need digital repositories for collections that are:

- physically vulnerable

- on fragile or unstable media

- born digital

Digitization protects historically important collections and addresses future collections. There is a great need to develop a new set of digital preservation skills in order to address digital objects. These collections can increase public awareness and interest in existing collections that may currently be unknown. Digital stewardship is an important part of the overall mission of libraries and museums as they care for their collections.

Quad-layer DVD Technology Becomes the Third HD Format. Marcus Yam. DailyTech. March 11, 2007.

New Medium Enterprises (NME) has developed the Versatile Multilayer Disc (VMD), a new optical-based format capable of storing 20GB of data. VMD is a red-laser technology that achieves its storage capacity by using a greater number of layers. VMD is the same size and thickness as DVD. However, while DVD technology uses two layers of a disc, VMD technology has multi-layering where up to 5GB can be stored on each layer.

Version 3.0 Launched. Mia Garlick. Creative Commons Website. February 23, 2007.

The latest version of the Creative Commons license is now available. A new generic license has been created. The new licenses ensure that there is consistent, express treatment of the issues of moral rights and collecting royalties and that there are no legal barriers to people being able to remix creativity as intended.

Intel Faces Up to E-Mail Retention Problems in AMD Lawsuit. Chris Preimesberger. eWeek. March 7, 2007.

A U.S. federal judge on March 7 gave Intel 30 days to try to recover about 1,000 lost e-mails that it was required to keep for an antitrust lawsuit. The suit was filed by AMD (a competitor) in 2005. Intel could have easily avoided the digital storage problems with careful planning. The new U.S. federal court rules enacted last December require companies to be able to quickly find data required by the federal court.

Principles for Digitized Content. ALA. Website. March 2, 2007.

An ALA task force introduced the draft Principles for Digitized Content. These have been put on the ALA blog and they are interested in comments. The principles in brief are:

1. Digital libraries ARE libraries and ALA policies and values apply fully.

2. Digital content must be given the same consideration as regular content, including preservation.

3. Digital collections must be sustainable and require long-term management capabilities.

4. Digitization requires collaboration, which will require strong organizational support.

5. Digital activity requires ongoing communication for its success.

6. Digital collections increasingly address an international audience.

7. Digital collections are developed and sustained by educated staff, requiring continuous learning.

8. Digital materials require appropriate preservation, including the development of standards, best practices, and models for sustainable funding to guarantee long term commitment.

9. Digital collections and their materials must adhere to standards, serve the broadest community of users, support sustainable access and use over time, and promote the core library values.

Model Plan for an Archival Authority Implementing Digital Recordkeeping and Archiving. Australian Digital Recordkeeping Initiative (ADRI). 2 March 2007.

This 32 page Word document is a list of the components, tasks and resources needed to develop a digital recordkeeping / archiving capability. It addresses creating recordkeeping standards and developing a digital archives repository. It is based on the OAIS model. It outlines the strategies, functions, and tasks to develop, implement, and review the repository. The functions for preservation planning are:

- Monitor / interact with the designated community to understand requirements and changes

- Monitor the emerging technology and standards

- Develop and recommend preservation strategies

- Develop packaging designs and detailed migration plans and prototypes

- Implement administrative policies and directives

Friday, March 09, 2007

Weekly readings - 09 March 2007

Digital Repository Audit Method Based on Risk Assessment. Website. March 1, 2007.

The Digital Curation Centre and DigitalPreservationEurope have released the Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) toolkit. This is to provide repository administrators with a way to audit and assess the capabilities, weaknesses, and strengths of their repository. This model is designed to respond to developments; it follows the DCC audit process for various types of archives. This 221 page document is very comprehensive and expects that the organizations, processes, and documents are already in place. Registration is required to download the document and worksheets.

Review and analysis of the CLIR e-Journal Archiving survey. Maggie Jones. JISC. 7 March 2007.

A review of e-journal archiving is now available. This site has a link to the report “E-Journal Archiving: Review and Analysis of the CLIR Report E-Journal Archiving Metes and Bounds: A Survey of the Landscape, by Maggie Jones”, as well as an executive summary. The trend to e-journals is increasing. LOCKSS and Portico have provided a momentum in this area. The report lists the basic principles for this effort, mostly for the services and nationally. Of special interest are:

  • There must be an explicit commitment to digitally archive scholarly peer-reviewed journals.
  • Form a network to exchange information with others on what you are doing.
  • Participate in at least one long term initiative.
  • Act collectively to address long-term accessibility.
  • Participate in a registry of archived scholarly publications.
  • Have a preservation mandate.
  • Initiate either formal or informal certification and articulate practices and procedures.
  • Create publicly accessible policies and procedural documents.
  • Clearly state access conditions.
  • View the preservation of electronic journals as a necessary investment.
  • Support a range of options and solutions and collaborate.

Manage risks appropriately. Having the publisher responsible for archiving is a high risk strategy. A single definitive approach to e-journal archiving is unlikely ever to emerge. The article provides an in-depth look at LOCKSS, CLOCKSS, Portico, PubMed Central, and others.

Electronic Resources Management and Long Term Preservation: (Is the library a growing organism?) Tommaso Giordano. E-LIS. 02 March 2007.

Digital preservation is a complex issue that involves many areas of expertise. This paper looks at how academic libraries view the preservation of e-journals and at their organizational practices. License terms may control access to the current year, back issues, and an archival copy. Perpetual access and archiving rights are different. Getting an archive copy is only the first step; there is more to implementing long-term access. Digital preservation may not be possible for every library. It is a costly operation which requires considerable and long-term commitments, beyond current budgets. This is a high-level strategic issue. There is a shift from the traditional model to one based on renting resources with no guarantees for the future. Sustainability is a question.

The End of Online Storage: Coming Soon. Brian Bergstein. MCPmag. March 5, 2007.

A new study estimates that the amount of digital information the world is generating has increased dramatically. The study tries to account for photos, videos, e-mails, web pages, instant messages, phone calls and other digital content. They estimate that 40 exabytes of original data was created last year, and that it would be 161 exabytes if you count the times it is duplicated. They wonder if enough is being done to save the digital data for posterity. "Someone has to make a decision about what to store and what not. How do we preserve our heritage? Who's responsible for keeping all of this stuff around so our kids can look at it, so historians can look at it? It's not clear."

Google helps terabyte data swaps. Darren Waters. BBC News. 7 March 2007.

Google is developing a program to physically transfer large data sets around the internet. Some data sets may be 120 terabytes in size. Google is collecting the data sets and sending them to scientists who want them. This was started after researchers working with The Archimedes Palimpsest had problems transferring the enormous data sets from place to place. “Google keeps a copy and the data is always in an open format, or in the public domain or perhaps covered by a creative commons license.” They hope some day the data can be open to the public.

Pre-Pixel Preservation: Concept Device to Archive and Preserve the Past. Naveen Shimla. Gizmo Watch. February 27, 2007.

An interesting look at “Pre-Pixel Preservation”, meaning scanning and printing photographs and videos, or rather the “pre-digital files”.

Friday, March 02, 2007

Weekly readings - 2 March 2007

Library Copyright Alliance Strongly Supports H.R. 1201, the FAIR USE Act. Jonathan Band. District Dispatch. February 28, 2007.

The FAIR USE Act would make a film clip exemption applicable to all classrooms, not just college media studies classes. It would also allow a library to legally circumvent technological protections in order to preserve encoded works in the library's collection. Preservation is one of the most critical library functions. The DMCA creates obstacles to a library’s ability to preserve some objects; the FAIR USE Act would remove those obstacles without harming copyrights.

Hard Disk MTBF: Flap or Farce? David Morgenstern. eWeek. February 28, 2007.

Computer hard drive reliability is rated in hours, as MTBF (mean time between failures). Some drives are rated at 1.5 million hours, which is over 100 years. Now some are questioning this statistic. They have found that replacement rates are sometimes as high as 13%, and there is little difference between SCSI, FC, and SATA drives. In reality, there isn't a reliable way to statistically determine the reliability of disk drives that are in use. Expecting that data won’t get lost is not realistic, even with a RAID array. Budgets must include regular hard drive replacement.
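To see why the rating and the observed replacement rates are hard to reconcile, it helps to convert an MTBF figure into an expected yearly failure rate. The sketch below is not from the article. It assumes a constant failure rate (an exponential lifetime model) and drives running around the clock, and the function names are made up for illustration.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours; assumes the drive runs continuously


def mtbf_to_years(mtbf_hours: float) -> float:
    """Express an MTBF rating in years of continuous operation."""
    return mtbf_hours / HOURS_PER_YEAR


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Probability that one drive fails within a year.

    Assumption: failures arrive at a constant rate of 1 / MTBF,
    so the lifetime follows an exponential distribution.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def expected_failures_per_year(mtbf_hours: float, drive_count: int) -> float:
    """Expected number of drive failures per year across a fleet."""
    return drive_count * annualized_failure_rate(mtbf_hours)


if __name__ == "__main__":
    rating = 1_500_000  # hours, the rating quoted above
    print(f"MTBF in years: {mtbf_to_years(rating):.0f}")
    print(f"Implied yearly failure rate: {annualized_failure_rate(rating):.2%}")
    print(f"Expected failures per 1,000 drives: {expected_failures_per_year(rating, 1000):.1f}")
```

Under these assumptions a 1.5 million hour rating works out to roughly 171 years and implies that fewer than 1% of drives fail in a given year, far below the replacement rates of up to 13% reported in the article. The MTBF figure describes a population of drives, not how long any single drive will last.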

Back stage at the Oscars with Martin Scorsese: The press interview. Movie News. February 27, 2007.

This interview contains a brief discussion with Martin Scorsese about film archiving and preservation. He is a proponent of film preservation and film archiving. “It's very important. We don't know what new technology is coming down the line. Digital also fades. We have to be very careful.” There is so much to be done, but choices have to be made. They are trying to restore older films on celluloid, but it is expensive and only a very few can be done. If the only thing you can do is transfer them to digital to preserve them, then that may have to be done. With small budgets, preservation is a serious problem.

Adobe to take Photoshop online. Martin LaMonica, Mike Ricciuti. CNET News. February 28, 2007

Adobe plans to release an online version of Photoshop within six months. The lower-end version is expected to be free and will be an entry-level version of the full products. Adobe is trying to find how to use web services for their products. Other vendors have already started doing similar things. Google’s free Picasa program allows users to manage images and to read Adobe Photoshop files.

California may join rush of states toward ODF. Eric Lai. Computerworld. February 27, 2007.

A bill in the California legislature would mandate open, XML-based document file formats in the state government starting next January. Several other states are also considering this. This initiative seems to support the Open Document Format used in OpenOffice.