Friday, November 24, 2006

Weekly Readings - 24 November 2006

Beneath the Metadata: Some Philosophical Problems with Folksonomy. Elaine Peterson. D-Lib Magazine. November 2006.

In today's world of digital information, classification matters more than ever, and folksonomy has emerged as an alternative to traditional classification. It is defined as "an Internet-based information retrieval methodology consisting of collaboratively generated, open-ended labels that categorize content such as Web pages, online photographs, and Web links". The labels are called "tags", and they are created by users rather than by authors or catalogers. Unlike traditional controlled vocabularies (or taxonomies), folksonomies are informal, unsystematic and, to some, unsophisticated. Many internet users like them because they are easy to create: there is no hierarchically organized classification scheme to learn, so tagging is less complicated and thus less expensive than formal cataloging. The trade-off is that tags can be messy, contradictory, or inaccurate.

Traditional classification schemes are more consistent and tend to yield more exact search results, but they are also more time-consuming to create and more limiting. Folksonomy allows for disparate opinions and the display of multicultural views. The two systems approach information classification from different philosophical directions.


The Digital Ice Age. Brad Reagan. Popular Mechanics. November 21, 2006.

A popular article about the problems of digital preservation. One issue it emphasizes is that even when files remain available, they may not render exactly the same in later versions of the software; technical drawings, for example, may actually look different when opened in a later version. Other files are at risk of becoming unreadable altogether, and the article lists examples of lost data. Sometimes the problem is noticed immediately, as when something disappears, but in other cases the damage may be invisible for now. “If the software and hardware we use to create and store information are not inherently trustworthy over time, then everything we build using that information is at risk.” Archiving has become a more complex process than it once was; some have found that metadata is not properly transferred when files are copied. NARA has said, “We don't know how to prevent the loss of most digital information that's being created today,” and has identified over 4,500 file types it needs to account for. There are open questions about whether migration or emulation will succeed. Adobe is working on solutions for documents and images. Suggested safeguards include backups, long-term media, data recovery, and printing important items.
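
One concrete defense against the invisible losses the article describes is fixity checking: record a checksum when a file enters the archive and re-verify it on a schedule. A minimal sketch, assuming hypothetical file paths:

    import hashlib

    def sha256_of(path):
        """Compute a SHA-256 digest, reading the file in 1 MB chunks."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def verify(manifest):
        """Re-check each archived file against the digest recorded at ingest."""
        for path, recorded in manifest.items():
            status = "ok" if sha256_of(path) == recorded else "FIXITY FAILURE"
            print(f"{status}  {path}")

    # At ingest, build the manifest once; run verify() on a schedule thereafter.
    # manifest = {p: sha256_of(p) for p in ["archive/report.pdf"]}  # hypothetical paths
    # verify(manifest)

A mismatch on re-check means the file has silently changed since ingest, which is precisely the kind of damage that otherwise stays invisible until someone tries to open the file.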


Digital asset management software helps N.Y.'s Met preserve art collections. Todd Weiss. Computerworld. November 24, 2006.

The Metropolitan Museum of Art is replacing its film-based photo collection inventory with digital images in a centralized catalog, using MediaBin Asset Server digital asset management software. “Being able to take photographs digitally means that prints or negatives will no longer deteriorate over time.” The project will also meet one of the museum’s goals by making items digitally available to the public around the world.

Friday, November 17, 2006

Weekly readings - 17 November 2006


Author Addenda: An Examination of Five Alternatives. Peter B. Hirtle. D-Lib Magazine. November 2006.

When an author publishes a book or a paper, many publishers ask the author to transfer all copyrights in the work to the publisher. But that is not always to the author's advantage. One solution is the Author's Addendum. An addendum is a standardized legal document that modifies the publisher's agreement and allows the author to keep certain rights. The addendum may specify what rights the author does or does not have in areas such as:

· Author's Rights

· Author's Authorization Rights

· Use by the Author's Institution

· Institutional and Repository Rights

While the addendum may not be perfect, it can be an important tool for helping authors retain the rights they want for themselves or their employing institutions. Sponsors should agree on a few standard addenda that all can use, instead of each issuing its own custom version of the document.


Broadcom claims first universal DVD chip. Dylan McGrath. EE Times. November 9, 2006.

Broadcom has introduced a single chip that supports both the Blu-ray Disc and HD-DVD standards, including all profiles of both specifications. The device handles the decoding, processing and memory functions of both formats and also supports a number of other standards, such as MPEG-2, DVD-R, DVD-VR and audio CDs.


Colorado Alliance Digital Repository Project Approved. George Machovec. Press release. October 20, 2006.

The Colorado Alliance of Research Libraries has approved funding for a consortium-wide digital repository project. The project will allow the members to store, preserve and distribute digital objects such as images, text, audio, video, learning objects, and data sets. It will build on open source utilities to speed development, with the Fedora repository software at its core; the alliance considers Fedora, which it judges to have “excellent long-term prospects”, to be the best platform for the project. Each library will be able to have its own view of the system. Two staff will be added for the project, and the alliance is hoping to work with other libraries.
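
For context on what a Fedora-based repository stores: Fedora models each item as a digital object with a persistent identifier (PID) and a set of datastreams holding content and metadata. A rough sketch of that structure, with an invented PID and payloads:

    from dataclasses import dataclass, field

    @dataclass
    class Datastream:
        """One content or metadata stream within a Fedora-style digital object."""
        ds_id: str
        mime_type: str
        content: bytes

    @dataclass
    class DigitalObject:
        """A repository object: a persistent identifier plus its datastreams."""
        pid: str
        label: str
        datastreams: dict = field(default_factory=dict)

        def add(self, ds: Datastream):
            self.datastreams[ds.ds_id] = ds

    # Hypothetical example: an image object with descriptive metadata and a master file.
    obj = DigitalObject(pid="alliance:demo-1", label="Sample photograph")
    obj.add(Datastream("DC", "text/xml", b"<dc><title>Sample photograph</title></dc>"))
    obj.add(Datastream("MASTER", "image/tiff", b"...binary image data..."))

This generality is what lets one platform hold images, text, audio, video, learning objects, and data sets alike: each is just an object with a different mix of datastreams.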


VHS, 30, dies of loneliness. Diane Garrett. Variety. November 14, 2006.

A clever eulogy for VHS tapes. Newer forms of media have taken the place of VHS, which has lasted for about 30 years. Studios have stopped manufacturing the tapes.

Friday, November 10, 2006

Weekly readings - 10 November 2006


Petascale storage may trickle down to you. Gary Anthes. Computerworld. November 06, 2006.

Questions about computer speed and storage will be the focus of the new Petascale Data Storage Institute. “The overall goal is to make storage more efficient, reliable, secure and easier to manage in systems with tens or hundreds of petabytes of data spread across tens of thousands of disk drives, possibly used by tens of thousands of clients.” Few of us run systems at that scale, but the research will eventually benefit ordinary users; high-performance computing is more about quality than cost reduction. Problems the institute is trying to resolve include (a back-of-envelope illustration of the first follows the list):

· Slow disk access times as disk size increases

· Reducing the failure rate of disks

· Better fault tolerant systems

· More efficient file systems
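
The first problem on the list is easy to see with rough arithmetic: disk capacity has grown far faster than disk bandwidth, so the time to read (or rebuild) a whole drive keeps climbing. The capacities and transfer rates below are illustrative assumptions, not figures from the article:

    # Time to read an entire disk end-to-end at a sustained transfer rate.
    def full_scan_hours(capacity_gb, rate_mb_per_s):
        return capacity_gb * 1024 / rate_mb_per_s / 3600

    # Illustrative drive generations: capacity grew ~50x, bandwidth only ~5x.
    for capacity_gb, rate_mb_per_s in [(9, 15), (70, 50), (500, 70)]:
        print(f"{capacity_gb:>4} GB at {rate_mb_per_s} MB/s -> "
              f"{full_scan_hours(capacity_gb, rate_mb_per_s):.1f} h full scan")

    # Output climbs from about 0.2 h to about 2 h per drive, so scan and rebuild
    # times balloon; that is one reason petascale systems worry about disk access
    # times and fault tolerance across tens of thousands of drives.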


Scirus Partners with Indian Institute of Science. Press Release. November 1, 2006.

Elsevier has announced that it will partner with the Indian Institute of Science to index the institute's two institutional repositories. The first is a research repository that uses the GNU EPrints software to archive preprints, post-prints and other scholarly publications. The second, built on DSpace, is a digital repository of theses and dissertations, developed to let researchers self-archive their theses and to capture, disseminate and preserve them.
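
The press release does not say how Scirus gathers the records, but EPrints and DSpace both expose their metadata over OAI-PMH, the standard harvesting protocol for repositories, so an indexer would typically issue a request like the one sketched here (the endpoint URL is a placeholder):

    import urllib.request
    import xml.etree.ElementTree as ET

    # Placeholder endpoint; EPrints and DSpace sites expose a similar /oai interface.
    url = "http://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc"

    with urllib.request.urlopen(url) as resp:
        tree = ET.parse(resp)

    # Print the Dublin Core title of every harvested record.
    for title in tree.iter("{http://purl.org/dc/elements/1.1/}title"):
        print(title.text)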


IBM fattens tape capacity to 700GB. Robert Mullins. Computerworld. November 01, 2006.

IBM has introduced a tape cartridge that holds 700 GB of data in a cartridge the same physical size as its lower-capacity tapes. The 700 GB cartridges are available in both a write-once version for permanent, read-only archiving and a re-writable version. The tapes are best suited for long-term data storage.


Content as a Digital Asset. Puneet Vohra. Hindustan Times. November 8, 2006.

Content on a website is an asset. It may not be considered valuable today, but its value may increase in the future. “Maintaining archives is equivalent to preserving your heritage.” It is important to preserve your digital heritage, and websites are part of that heritage. “An archive needs to recreate the look and feel of the original record.”

Friday, November 03, 2006

Weekly readings - 03 November 2006

Using Digital Images in Teaching and Learning: Perspectives from Liberal Arts Institutions. Wesleyan University and The National Institute for Technology and Liberal Education. October 2006. [Sometimes the blog doesn't handle the link; if so, paste this link in your browser:
http://www.academiccommons.org/imagereport ]


A report that looks at digital image resources in teaching and learning. Faculty get their digital images mostly from personal collections (90%) or from Google Images. “Many faculty need considerable assistance in organizing and managing these collections,” such as through cataloging and management tools. Often faculty need better quality images than those in Google Images. There are free databases of images, but few faculty use them, possibly because they don’t know they exist. The LionShare project, for example, has created a way for peer-to-peer file exchange among faculty worldwide.

The majority of faculty never use licensed or library image databases. The institution should bring together image collections from different departments, museums and special collections into a single institutional collection, though that is a difficult and complex undertaking. Faculty have said that “using digital images had revolutionized their teaching.” Deploying digital images remains complex, difficult and expensive, and the shift is proving to be less a one-time transition than a longer, ongoing process.

They recommend:

· Develop and share tools and services to help faculty organize, catalog, and manage their personal digital collections, in a user-centered content model (a minimal sketch of such a catalog record follows this list).

· Encourage and enhance the relationship between individual personal digital image collections and the evolving institutional collection.

· Publicize and direct users to especially good online image resources in any given subject area.

· Publicize and demonstrate locally-available digital image resources to faculty and, where possible, research faculty’s most pressing digital image needs in order to match them with available resources.

· Create institutional collections serving many departments.

· Develop a plan with faculty to provide digital image services when closing analog slide collections.

· Publicize new tools as they become available.
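
To make the first recommendation concrete: a personal image catalog needs only a handful of Dublin Core-style fields per image to be useful. The record below is a hypothetical sketch, not a schema from the report:

    import csv
    from dataclasses import dataclass, asdict

    @dataclass
    class ImageRecord:
        """A minimal catalog entry for one image in a personal teaching collection."""
        filename: str
        title: str
        creator: str
        subject: str   # free-text or course-related keywords
        source: str    # where the image came from (scan, Google Images, museum site...)
        rights: str    # what the faculty member may do with it

    records = [
        ImageRecord("temple01.jpg", "Temple of Hephaestus", "unknown",
                    "greek architecture", "personal photo", "own work"),
    ]

    # Write the catalog out as CSV so it can be shared or imported into other tools.
    with open("catalog.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(asdict(records[0]).keys()))
        writer.writeheader()
        for r in records:
            writer.writerow(asdict(r))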


Long-term Stewardship of Digital Data Sets in Science and Engineering. ARL report of a workshop held in September 2006.

This 160-page report examines the role of research and academic libraries in the stewardship of scientific and engineering digital data. The stewardship of digital data is fundamental to research. Because the challenge is so large, responsibility for digital data stewardship should be shared through partnerships across institutions and disciplines. Universities have played a leadership role in the long-term preservation of knowledge through their libraries. “Stewardship of digital resources involves both preservation and curation. Preservation entails standards-based, active management practices that guide data throughout the research life cycle, as well as ensure the long-term usability of these digital resources. Curation involves ways of organizing, displaying, and repurposing preserved data.” There needs to be a close link between digital data archives and scholarly publications, and preservation occurs throughout the lifecycle, at every stage of data production.

Preservation consists of:

· standards-based management practices for metadata and data throughout the research life cycle,

· long-term care for these digital products, and

· standards-based output of metadata and data for their long-term care, access, migration and refreshment.

Preservation of digital data has forced a re-examination within the library/archives community of existing assumptions about responsibility, use, oversight, and cost. Long-term preservation and curation are understood not only as preserving and reading bits, but also as a system that requires cooperation across many organizations and stakeholders in a sustainable model. Preservation is both an organizational and a technical challenge. The OAIS model is a useful framework for preservation.
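
The OAIS model organizes an archive around information packages: a Submission Information Package (SIP) arrives from a producer, is turned into an Archival Information Package (AIP) for preservation, and is delivered to consumers as a Dissemination Information Package (DIP). A minimal sketch of that flow; the class and function names are mine, and only the package terminology comes from OAIS:

    from dataclasses import dataclass, field

    @dataclass
    class InformationPackage:
        """Content plus the metadata needed to understand and preserve it (OAIS)."""
        content: bytes
        descriptive_info: dict
        preservation_info: dict = field(default_factory=dict)

    def ingest(sip: InformationPackage) -> InformationPackage:
        """Turn a Submission Information Package into an Archival Information Package."""
        aip = InformationPackage(sip.content, dict(sip.descriptive_info))
        aip.preservation_info["fixity"] = hash(sip.content)  # stand-in for a real digest
        aip.preservation_info["provenance"] = "received from producer"
        return aip

    def disseminate(aip: InformationPackage) -> InformationPackage:
        """Derive a Dissemination Information Package for a consumer."""
        return InformationPackage(aip.content, dict(aip.descriptive_info))

    sip = InformationPackage(b"dataset bytes", {"title": "Sample data set"})
    dip = disseminate(ingest(sip))
    print(dip.descriptive_info["title"])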

Data preservation has distinctive requirements for:

· Resources: storage, systems, maintenance, services

· Continuity: migrate without interruption

· Metrics of success: no serious loss of data

· Funding: address long-term commitment

In building a preservation model, some research topics include: prototyping different types of technical architectures; specifying ingest systems at different scales; deploying data models across organizations; and creating tools for automatic metadata harvesting.
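
On the last of those topics, the simplest form of automatic metadata harvesting pulls technical metadata straight from the files themselves. A sketch using only the standard library (the directory path is a placeholder):

    import hashlib
    import mimetypes
    from datetime import datetime, timezone
    from pathlib import Path

    def harvest(path: Path) -> dict:
        """Extract basic technical metadata for one file, automatically."""
        stat = path.stat()
        return {
            "filename": path.name,
            "size_bytes": stat.st_size,
            "modified": datetime.fromtimestamp(stat.st_mtime, timezone.utc).isoformat(),
            "mime_type": mimetypes.guess_type(path.name)[0] or "application/octet-stream",
            "sha256": hashlib.sha256(path.read_bytes()).hexdigest(),
        }

    # Placeholder directory; walk it and harvest one record per file.
    for p in Path("data/").glob("**/*"):
        if p.is_file():
            print(harvest(p))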

Academic libraries need to expand their work to include storage, preservation, and curation of digital scientific and engineering data. This requires evaluating where in the research process chain preservation activities should occur, who should do them, and how they should be accomplished. In discussing models for the economic sustainability of such activities, one model cited was “The Mormon Church, which combines tithing, user fees, and sales.” Multiple strategies will be required to meet the different circumstances that exist. Repository experiments should address key issues such as transitions between media, formats, and institutions; self-sustainability; and an exit strategy.

Some recommendations include:

· Involve experts in developing economic models for sustainable data preservation.

· Set up multiple repositories and treat them as experiments.

· Develop tools for automated services and standards to manipulate data easily.

“Digital information is fragile and we do not have the luxury of letting time take its course.” There needs to be a sustainable framework for long-term stewardship of digital data. “We don’t get anywhere if we don’t start somewhere.”