Friday, May 26, 2006

Weekly readings - 26 May 2006

Data Dictionary for Preservation Metadata.  OCLC/RLG.  May 2006.

http://www.oclc.org/research/projects/pmwg/premis-final.pdf

This is the final report of the PREMIS Working Group, which examines preservation metadata.  The 237-page document includes the PREMIS Data Model and Data Dictionary, along with examples, methodology, and implementation considerations. The report defines preservation metadata as “the information a repository uses to support the digital preservation process. Specifically, the group looked at metadata supporting the functions of maintaining viability, renderability, understandability, authenticity, and identity in a preservation context.” 

 

Aftermarket Inks Fading Fast? Hard Copy Supplies Journal Announces Wilhelm’s Surprising Test Results.  PRWeb.  Press Release.  May 25, 2006.

http://www.prweb.com/releases/2006/5/prweb390347.htm

Wilhelm Imaging Research (WIR) reports that the image permanence of photos printed with aftermarket ink jet cartridges and photo papers is far inferior to that of photos printed with original equipment manufacturer (OEM) ink jet cartridges and photo papers.  Image permanence is an intrinsic part of product quality.  In some cases there is a difference of 70 years in permanence ratings between OEM and aftermarket products.  Making ink jet dye is simple if permanence is ignored, but it is more difficult to make print products with both high quality and high permanence.  “As a group, the aftermarket inks and premium photo papers in this study had among the lowest WIR display-permanence ratings of any products ever tested by our lab.”  Early print products would degrade in under a year.  Now high quality products can last decades, and some have exceeded the 100 year mark.  Once manufacturers have the image quality right, they can move on to image permanence.  “It is clear that consumers have no idea just how poor the permanence—and thus the overall quality—of these products actually is.” 

 

User survey reveals ILM reality.  James E. Short.  Infostor.  May 24, 2006.

http://www.infostor.com/Articles/Article_Display.cfm?Section=ARTCL&SubSection=Display&PUBLICATION_ID=23&ARTICLE_ID=255966

A recent survey showed that managers in data storage, IT, and records management have widely differing views about information lifecycle management (ILM).  There are multiple definitions of what it is, and disagreement about whether it will create more problems than it will solve.  A majority of those who responded defined ILM as “a policy-based approach to improving records and information management”.  Others saw it as “a technical and systems management issue.”  Part of the concern is that some see technology as the solution, where others see ILM as management agreement and control over data across functions, departments, and people.  Respondents felt that there were more drawbacks than advantages, but also felt that if it is properly defined and implemented, ILM could potentially improve management control over data and reduce storage costs.

 

Canon Considers Halt to Film Camera Development.  Reuters.  May 25, 2006. 

http://www.eweek.com/article2/0,1759,1967639,00.asp?kc=EWRSS03119TX1K0000594

Canon has said that it would consider stopping development of new film cameras as it focuses on digital cameras and because the market is shrinking.  A final decision will be made in the future while they monitor market demand.  Nikon has already stopped producing most of its film cameras;  Konica Minolta has decided to exit the camera and photo film markets because of losses and low demand.

 

Heritage Microfilm Introduces New Archival and Subscription Program.  PRWeb.  Press Release.  May 25, 2006

http://www.prweb.com/releases/2006/5/prweb389560.htm

Starting in June 2006, Heritage Microfilm will release a new program aimed at bridging the gap between traditional and digital archiving.  The program is built around the ideas of “Preserve,” “Protect” and “Prepare.” It is designed to bring libraries and historical societies together with newspaper publishers, and has two separate components. One is designed for newspaper organizations and involves the microfilming and digitization of newspaper pages. The other enables libraries to access this content in both microfilm and web-based digital format.  “While microfilming, a stable and analog technology, will never be superseded by digital technologies for long-term preservation, the digital component allows access on a scale never before breached by microfilm.”  It will also release a product called DigitalMicrofilm, which is billed as an alternative to traditional microfilm subscriptions. Publisher pre-press files, uploaded weekly into the Heritage system, will be delivered through a digital archive website to the subscribing library. This allows librarians and their patrons to access fully-searchable newspaper content previously available on microfilm.

 

Friday, May 19, 2006

Weekly readings - 19 May 2006

Apago Introduces PDF Appraiser for Creating and Validating PDF/A Documents. Press Release. Business Wire. May 16, 2006.

http://home.businesswire.com/portal/site/google/index.jsp?ndmViewId=news_view&newsId=20060516005392&newsLang=en

A software developer has introduced PDF Appraiser, the first program to support both validation and automatic correction of documents in accordance with PDF/A, the new international standard for long-term archiving of digital documents. The much-anticipated standard is expected to become the preferred archival method for governments and industry segments, including corporations, legal, libraries, regulated industries, and others. The vendor is offering PDF/A validation capabilities at no cost, which allows users to easily check documents for PDF/A compliance. The full version is required to correct any problems within the file. http://www.apagoinc.com/pdfappraiser

The End User: From France, video for all. Victoria Shannon. International Herald Tribune. May 17, 2006.

http://www.iht.com/articles/2006/05/17/business/ptend18.php

The French national audiovisual institute (INA), which has been digitizing its film and audio collection, has created the "Archives Pour Tous" - archives for all. About 80% of the archive, which contains thousands of hours of radio and television recordings, is on the Internet for free. It contains historic footage of Charles de Gaulle, Marc Chagall, and many others. The site is receiving about five million visits a day. "To us at INA, preserving archives would be pointless if that was to keep them only for a 'happy few.' It is INA's mission to communicate and make this vast wealth of archive images as widely accessible as possible using the latest digital technology, yet preserving them as the nation's heritage for future generations."

IBM researchers extend magnetic tape density. Sharon Fisher. Computerworld. May 16, 2006.

http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9000532&source=NLT_AM&nlid=1

IBM and FujiFilm announced a new technology that could allow a tape cartridge to hold as much as 8TB of uncompressed data in the future. With the new technology, their tape library would be able to hold 48 petabytes. There was no availability date, but it is projected to be a few years away. The intent is to show users that tape has a long life ahead. It is cheaper than disk storage, and provides other options. A 1TB tape cartridge should be available next year.
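
To put the two figures side by side, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 PB = 1000 TB) and that the 48 petabyte library is filled entirely with cartridges of one size; the function name is made up for illustration and nothing here comes from the article itself.

```python
# Rough sketch: how many cartridges a 48 PB library would imply.
# Assumptions (not from the article): decimal units, uniform cartridge size.
TB_PER_PB = 1000


def cartridges_needed(library_pb: int, cartridge_tb: int) -> int:
    """Return the number of cartridges needed to reach the library capacity."""
    total_tb = library_pb * TB_PER_PB
    # Ceiling division so a partly filled cartridge still counts as one.
    return -(-total_tb // cartridge_tb)


if __name__ == "__main__":
    print(cartridges_needed(48, 8))  # projected 8 TB cartridges
    print(cartridges_needed(48, 1))  # the 1 TB cartridges expected next year
```

Under those assumptions, the same library capacity would take eight times as many of the 1TB cartridges.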

Micron prepares to sample 8-Mpixel image sensor. Peter Clarke. EE Times. May 18, 2006.

http://eet.com/news/latest/showArticle.jhtml%3Bjsessionid%3DGXYJDIMFMJTYEQSNDBESKHA?articleID=188100013

Micron Technology Inc. has made an 8-Mpixel image sensor in a 1/2.5 inch optical format suitable for digital still cameras and mobile phones. The sensor can capture images that can be displayed at 11-inch by 14-inch size and is capable of taking 10 full resolution pictures per second. The sensor can also take 2-Mpixel images at 30 pictures per second for capturing video at 30 frames per second.

Friday, May 12, 2006

Weekly Readings - 12 May 2006

Cultural heritage in danger. heise online. 12 May 2006.
http://www.heise.de/english/newsticker/news/73037

Increasingly, research publications are only available in digital form. But the formats used are becoming outdated. A "memorandum for the preservation of digital information in Germany" calls on the government, producers of information, hardware and software vendors, libraries, and archives to pay more attention to the preservation of such data. The memorandum presented recommendations on what needs to be done in general to create a German "long-term archive policy." There need to be clear selection criteria. Digital archives would have to preserve the content and functionality of this data as completely as possible. Specialized depot systems should be developed to prevent the loss of data through redundancy and mirroring. It also recommends the use of non-proprietary, open, well documented formats to ensure readability over the long-term; wherever possible, such formats should be used to create the data. (http://www.langzeitarchivierung.de/downloads/memo2006.pdf)


Could the future of storage be all wet? Lucas Mearian. Computerworld. May 12, 2006.
http://www.computerworld.com/action/article.do?command=viewArticleBasic&articleId=9000446&source=NLT_AM&nlid=1

Researchers have found a way to use barium titanium oxide nanowires suspended in water to create digital storage. Experiments show they can hold 12.8 million GB per square centimeter. This is early-stage research which will be interesting to watch. One limitation with today's tape and disk magnetic storage is that a magnetic domain is not stable at molecular levels. "The magnetization will spontaneously flip back and forth because of [temperature] fluctuations."



Sony Delivers HD Quality With Blu-Ray Disc Media. Computer Technology Review. May 9, 2006.
http://www.wwpi.com/index.php?option=com_content&task=view&id=1161
Sony Electronics has begun shipping its 25GB single layer Blu-ray Disc recording media. The dual layer 50GB capacity disc is set to debut in June. The disc features include:
- Scratch Guard -- hard coating that resists scratches, dust and static
- Archival Reliability -- special material design that prevents data and image corruption and deterioration to ensure quality playback
- Stable Writing -- a uniform and precise cover layer that reduces fluctuation as the disc spins
- Temperature Durability -- a high-precision disc structure helps prevent warping during severe changes in temperature and humidity.
Sony will soon offer a wide range of Blu-ray Disc devices.


An ad agency warning: ignore digital at your peril. Francesca Newland. Media Bulletin. 12 May 2006.
http://www.brandrepublic.com/bulletins/media/article/558855/an-ad-agency-warning-ignore-digital-yourperil/

“Getting to grips with digital is not about setting up an online division and handing over the briefs, it's about making digital solutions as front of mind….”


Friday, April 28, 2006

Weekly readings - 28 April 2006

The Digital Black Hole. Jonas Palm. National Archives of Sweden. April 27, 2006.

http://www.tape-online.net/docs/Palm_Black_Hole.pdf

Without long-term planning, digitization projects can be like black holes. Information is only retrievable through technology, which has a cost. The more information that is converted, the higher the maintenance costs. If funding fades, the files may soon become obsolete and be lost. For large projects, the life cycle must be planned, which includes a financial commitment. The archive has asked the question: once materials are digitized, is it cheaper to maintain the digital files over time, or to rely for long-term storage on images on microfilm produced from the digital files with the use of COM (Computer Output Microfilm)? It is expensive to preserve digital files, and the cost is higher than generally believed because it involves much more than most people realize. The idea that media storage gets cheaper because capacity doubles each year is true in the short term, but not in the long term, since the costs of management will keep going up. The real cost of storage is management; the labor cost accounts for 39% of the total storage cost. The cost of long term storage also depends on how much it is used and accessed. The cost of digitization is also high. Digitizing audiovisual information is very time consuming and it also creates huge amounts of digital information. It is also the only way to preserve these materials for the future. A third of the cost goes to scanning. A Swedish study states that for AV: “Due to condition and technical circumstances transfer should be made within the next ten years.” It must be digitized in the near future as media deteriorate and equipment becomes obsolete. The archive is looking at the possibility of using COM for preservation. Whichever strategy is chosen, it must include a long-term financial commitment.

Library Holds Strategy Session on "Preserving Creative America". Managing Information. 28 April 2006.

http://www.managinginformation.com/news/content_show_full.php?id=4866

A Library of Congress strategy meeting with leading producers of commercial content has shown that the content creators are very interested in preserving their digital materials for archival and other purposes. “We are faced with the potential disappearance of our cultural heritage if we don’t act soon and act together to preserve digital materials.” Preserving content long-term depends on influencing content providers from the moment of creation. They are focused on potential partnerships between the Library and the private sector. This year the Library plans to issue a request to private industry for cooperative projects to catalyze preservation in the private sector. The Library will support the establishment of preservation activities that span content owners and distributors, as well as technology companies.

Preserving the Past. Anna Bengel. backstage.com. April 28, 2006.

http://www.backstage.com/bso/news_reviews/features/feature_display.jsp?vnu_content_id=1002425082

The Women's Film Preservation Fund was founded to preserve early silent and color pictures, experimental and documentary films, and "orphans," films without a clear copyright holder, that are by or about women. “Preserving film is a relatively simple process, but not necessarily an easy one.” Preserving a film means making new negatives and prints from the existing film, which can then be duplicated without ever having to touch the original film. It is a costly process. There is a push to get preserved performances in videotape or digital format, which can be borrowed and accessed by companies or private institutions. “It's impossible to preserve everything; the volume is staggering and there is not enough time and money.”

Seagate's huge hard drive performs well. Melissa Perenson. Computerworld. April 27, 2006.

http://www.computerworld.com/hardwaretopics/storage/story/0,10801,110934,00.html?source=NLT_DIS&nid=110934

Seagate has launched a 750GB drive, the largest hard drive to date. It excelled in capacity, price and performance. It can write a 3GB file in about 2 minutes. It is available now and is priced under $600.
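
As a rough illustration of what that write figure implies, the sketch below converts it to a sustained rate and estimates how long filling the whole drive would take. It assumes decimal units (1 GB = 1000 MB), treats "about 2 minutes" as exactly 120 seconds, and assumes the rate stays constant across the disk, none of which the review states; the function names are hypothetical.

```python
# Rough sketch: sustained write rate implied by "3GB in about 2 minutes".
# Assumptions (not from the review): decimal units, exactly 120 s, constant rate.
MB_PER_GB = 1000


def throughput_mb_per_s(file_gb: float, seconds: float) -> float:
    """Average write rate in MB/s for a file of file_gb written in seconds."""
    return file_gb * MB_PER_GB / seconds


def hours_to_fill(capacity_gb: float, mb_per_s: float) -> float:
    """Hours needed to write capacity_gb at a constant mb_per_s."""
    return capacity_gb * MB_PER_GB / mb_per_s / 3600


if __name__ == "__main__":
    rate = throughput_mb_per_s(3, 120)
    print(f"{rate:.1f} MB/s")
    print(f"{hours_to_fill(750, rate):.1f} hours to fill 750 GB")
```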

HD-DVD & Blu-Ray: Dead Formats Walking. David Morgenstern. eWeek. April 20, 2006.

http://www.eweek.com/article2/0,1759,1951445,00.asp?kc=EWRSS03129TX1K0000606

One of the problems with the optical formats is that the discs cannot be recycled. Because of the large number of discs that are discarded each year, some are calling for a recycling surcharge and stricter rules on disposal. This may even affect future adoption of similar media. The future trend may be more towards networks than optical discs.

Friday, April 21, 2006

Weekly Readings - 21 April 2006

MIC (Moving Image Collections). Jane D. Johnson. RLG DigiNews. Apr 15, 2006.
http://www.rlg.org/en/page.php?Page_ID=20916#article1

Moving Image Collections was created as a partnership between the Library of Congress and the Association of Moving Image Archivists and began as a preservation initiative, a collaborative effort to promote discovery, preservation, and educational use of moving image materials. It provides a union catalog, archive directory, and informational resources and support for collaborative preservation, access, digitization, exhibition, and metadata initiatives. It raises awareness about preservation issues and risks to our film, television, and video heritage by telling readers how to care for home collections, the role of archives, and the preservation process. It is not just a tool for archivists, but provides access for the public and educators, which is the key to a sustainable preservation strategy. The directory can take users to the organization’s records in the Union Catalog, to the organization’s own catalog, or to its website. Central to the database are the metadata elements that provide descriptions of the items. There are mappings to MARC, Dublin Core, MPEG-7 and others, plus the ability to map other schemas into the collection.


Six Lessons Learned: An (Early) ARTstor Retrospective. Max Marmor. RLG DigiNews. Apr 15, 2006.
http://www.rlg.org/en/page.php?Page_ID=20916#article0
ARTstor is a digital library of images for educational and scholarly use. Here are some of the lessons learned:
- the importance of building a campus-wide resource instead of discipline-specific collections;
- the importance of digital images in teaching and research;
- the ramifications of such a resource for “buy vs. build” decisions;
- the trade-offs of building “user-driven” collections.
Users want to do things with digital images, to change them and assemble in different ways. They need tools in order to integrate the materials into teaching and research.


Library of Congress, British Library to Support Common Archiving Standard for Electronic Journals. Guy Lamolinara. April 19, 2006.
http://www.loc.gov/today/pr/2006/06-097.html

The Library of Congress and the British Library have agreed to support the migration of electronic content to the NLM DTD standard. The libraries hope that their support of this standard will help ensure long-term access to electronic journals. The standards for digital materials are still evolving. By supporting this, they hope it will lead to an internationally recognized standard.

Friday, April 14, 2006

Weekly Reading Notes 14 April 2006

Corporate Alzheimer's: Coping With Forgotten File Formats. John K. Waters. Law.com. April 4, 2006.
http://www.law.com/jsp/ltn/pubArticleLTN.jsp?id=1144067962734
What if the file formats of our text documents, spreadsheets, charts and presentations were not supported by future versions of the programs used to create them today, or by other future products? Could the inability to read file formats cause a kind of corporate Alzheimer’s that threatens our ability to recall contracts, insurance policies, financial records, payroll data and other critical documents? In some ways, this is happening today. Some documents that are only 10 years old are inaccessible now. If the baseline file format continues to evolve as it has done, formats may be unusable in 10 or 11 years. We need a standard for documents that need to be kept indefinitely, and two are being discussed: OpenDocument Format for Office Applications (ODF), which was developed by the OASIS standards consortium, and Microsoft's Office Open XML. Both are based on XML. Microsoft has promised that their version will be compatible with their older document formats, that they would give it away, and that it would be controlled by an international standards body.

Sony reveals more details of Blu-ray Disc PC. Martyn Williams. IDG News Service. 13 April 2006.
http://www.arnnet.com.au/index.php/id%3B571802279%3Bfp%3B256%3Bfpid%3B56735
Sony has released some details about its first desktop PC to include a Blu-ray Disc drive. It will include reader/writer Blu-ray Disc drives that support single-layer 25G-byte discs or double-layer 50G-byte discs. They plan to make them available in "early summer" in the U.S.

Library of Congress preserves Motown recordings. Alison Bethel. The Detroit News. April 12, 2006.
http://www.detnews.com/apps/pbcs.dll/article?AID=/20060412/ENT04/604120411/1033/ENT01
The Library of Congress selected 50 sound recordings for preservation because of their cultural, historical or esthetic significance. The National Recording Registry was established by the Library of Congress as part of the 2000 National Recording Preservation Act to preserve the most significant recordings and to highlight the need to preserve the country's sound recording legacy before it deteriorates.

Microsoft to Make Virtual Server Free. Peter Galli. April 3, 2006.
http://www.eweek.com/article2/0,1759,1945069,00.asp
Microsoft has announced that it will make its Virtual Server product available on the internet for free. They also plan to incorporate this into the server operating system.

Friday, April 07, 2006

Weekly Readings - 7 April 2006

Institutional Repositories: An Opportunity for CIO Campus Impact. Marilu Goodyear and Richard Fyffe. EDUCAUSE Review. March/April 2006.
http://www.educause.edu/apps/er/erm06/erm0626.asp?bhcp=1

“The importance of the research enterprise calls for paying significant attention to the stewardship and preservation of the institution’s digital assets, particularly those that are unique to the campus.” Often, current systems provide neither short-term nor long-term access well. The collections are mostly uncurated collections of important and trivial information, current and superseded work, hosted on platforms with no checks for data integrity, minimal metadata for provenance, little encoding for version or access control, and no support for format migration. In reality, they have none of the structures and functions that provide assurance of ongoing accessibility and usability for digital files. Repositories can be more than tools for sharing information; they can be tools for preservation. Currently there is no repository that can do all the digital preservation functions. They mostly provide a way to select, describe, and deposit materials in a central location. This can encourage understanding and discussion of the conditions that make digital preservation possible. Involving the IT organization can provide a number of benefits:
- It can serve faculty in all disciplines
- It can demonstrate the value of research contributions to the institution
- The CIO can stay ahead of issues of academic research and digital preservation
- Maintaining digital assets should be a matter of concern to the CIO


Surveying the E-Journal Preservation Landscape. Anne R. Kenney. March 23, 2006.
www.clir.org/pubs/archives/ejournal.htm

"Digital preservation represents one of the grand challenges facing higher education." Preserving electronic publications has become a critical matter as e-publication increases and user communities depend more on electronic publications. A number of initiatives acknowledge preservation responsibility for e-journal archiving.


A faster, denser hard drive debuts. Jon L. Jacobi. Computerworld. March 31, 2006.
http://www.computerworld.com/newsletter/0,4902,110111,00.html?nlid=ST

The first hard drives to use perpendicular storage technology show that the drives hold more and are also faster. They are projecting that 2TB disks will be available before long. The cost per gigabyte will be about the same as for existing disks.



Sony's Universal Media Disc facing last rites. Thomas K. Arnold. Computerworld. March 30, 2006.
http://www.computerworld.com/newsletter/0,4902,110083,00.html?nlid=ST

After only one year, Sony's Universal Media Disc is being phased out. Disappointing sales and lack of support have caused retailers to discontinue selling the disc. One executive compared this failure to Blu-ray.



Holographic disk hits 300GB mark. Chris Mellor. Computerworld. March 28, 2006.
http://www.computerworld.com/newsletter/0,4902,109994,00.html
InPhase Technologies Inc. has announced that it has created a holographic disc which can store more data than any other disc. It plans to release the 300GB disc later this year. A holographic disc stores data inside the disc, not just on the surface. It is also faster for retrieving data.

Friday, March 24, 2006

Weekly readings - 24 March 2006

Digital Curation and Preservation: Defining the Research Agenda for the Next Decade.  Philip Pothen.  Ariadne.  February 2006.

http://www.ariadne.ac.uk/issue46/warwick-2005-rpt/

    It is clear that accessing and preserving digital data is increasingly important across a wide range of scientific, artistic and cultural activities.  We need more information about deciding ‘how’ to preserve than ‘if’ we preserve.  Fewer than one software package in ten lasts beyond 10 years.  Overcoming the protectiveness of data is one of the highest priorities in this area.  We need to see spending decisions more as investments with a clear view of the costs and benefits.  It is important to examine the social and organizational benefits of preservation.  While the barriers between librarians, archivists and technical specialists are breaking down slowly, we must address the broader question of training and education.  We need to keep the knowledge to be preserved independent of the underlying systems.  We need to develop certification criteria, checklists to determine complexity and cost, and new research. 


---

Capturing Analog Sound for Digital Preservation: Report of a Roundtable Discussion of Best Practices for Transferring Analog Discs and Tapes.  National Recording Preservation Board.  March 2006.

http://www.clir.org/pubs/reports/pub137/pub137.pdf

      The National Recording Preservation Board was created to sustain sound records for future generations.  “Authoritative manuals on how to create preservation copies of analog audio recordings do not yet exist.”  This report will investigate procedures to reformat analog sound to digital media. It summarizes discussions and recommendations from leading audio preservation engineers concerning the present standards and best practices for migrating analog recordings. It gives an overview of the problems encountered and the needs.  There are some recommendations for actions, competencies that should be developed, and a call to share expertise to help in this area.  Some of these include:

·       For discs: clean the disc when possible; choose the correct stylus size and playback speed
·       For tapes: identify and clean the tape; address splices and damage
·       Know the medium
·       Note all metadata with the original
·       Identify the core competencies needed
·       Develop a web-based clearinghouse for information
·       Identify experts to consult
·       Develop project guidelines and best practices within the organization

      The second half of the document outlines recommended practices, competencies, and commentary from the meeting participants for transferring audio to digital media.  They also list resource documents that we should create, especially suggested equipment to perform digital audio archiving tasks, sources of equipment and supplies. 

---

Fed up with tape, hospital moves to storage jukebox.  Lucas Mearian.  Computerworld.  March 24, 2006.

http://www.computerworld.com/hardwaretopics/storage/story/0,10801,109880,00.html?source=NLT_PM&nid=109880

    This article gives an example of a hospital which installed a new image and records archiving system late last year.  It chose an optical disk jukebox with spinning disk arrays over magnetic tape because it had stopped trusting magnetic tape.  “If you can’t access the data, then whatever you spent on the tape was a waste.” They had tapes go bad after only 50 uses.  They chose a “near-line” storage system, a 13TB optical jukebox  model containing 30GB platters.  It has a two-tier storage infrastructure, where all data is stored in an array for the first two years and then migrated to optical disk, where it’s copied to two platters; one off-site for disaster recovery and the other on-site for near-line storage.
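
    The two-tier arrangement described above can be sketched as a small placement rule. This is only an illustration of the policy as summarized here, not the hospital's or the vendor's actual software: the tier labels and function names are invented, a year is approximated as 365 days, and the platter count assumes decimal units (1 TB = 1000 GB).

```python
from datetime import date

# Figures taken from the summary above.
ARRAY_RETENTION_YEARS = 2
JUKEBOX_TB = 13
PLATTER_GB = 30


def storage_locations(created: date, today: date) -> list[str]:
    """Return where a record should live under the two-tier policy.

    Tier labels are hypothetical; a year is approximated as 365 days.
    """
    age_days = (today - created).days
    if age_days < ARRAY_RETENTION_YEARS * 365:
        return ["disk-array"]
    # After two years: two optical copies, one near-line and one off-site.
    return ["optical-onsite", "optical-offsite"]


def platter_count(jukebox_tb: int = JUKEBOX_TB, platter_gb: int = PLATTER_GB) -> int:
    """Approximate number of full platters in the jukebox (decimal units)."""
    return jukebox_tb * 1000 // platter_gb


if __name__ == "__main__":
    today = date(2006, 3, 24)
    print(storage_locations(date(2006, 1, 1), today))
    print(storage_locations(date(2003, 1, 1), today))
    print(platter_count())
```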

---

Archaic Sounds Reach Modern Ears.  Rachel Metz.  Wired. 20 March 2006.

http://www.wired.com/news/technology/0,70378-0.html?tw=wn_index_5

    Curators at the UC Santa Barbara Library have digitized 6,000 19th- and 20th-century wax and plastic cylinder recordings of music, vaudeville routines and presidential speeches.  Preserving the sounds is vital because the cylinders are deteriorating.  This area has been neglected for many years.  Until recently it was not possible to create quality digital copies of cylinder recordings because cylinders running at different speeds each required different equipment.  Now a system has been created that can play cylinders of various sizes and speeds and transfer the sound to a computer through a patch bay.  It encodes cylinder music as original-sounding WAV files or cleaned-up MP3 versions.  Since November when the site started, over 700,000 recordings have been downloaded.  The recordings on the site are in the public domain and cleaned-up MP3 versions hold a Creative Commons license.


Saturday, March 18, 2006

Weekly readings - 17 March 2006

Excuse Me... Some Digital Preservation Fallacies?  Chris Rusbridge.  Ariadne.  February 2006.

http://www.ariadne.ac.uk/issue46/rusbridge/

The article looks at a number of issues with digital preservation that the author feels are fallacies.  They are:

   1. Digital preservation is very expensive [because]

   2. File formats become obsolete very rapidly [which means that]

   3. Interventions must occur frequently, ensuring that continuing costs remain high.

   4. Digital preservation repositories should have very long timescale aspirations,

   5. 'Internet-age' expectations require that the preserved object be easily and instantly accessible in a useable format, and

   6. the preserved object must be faithful to the original in all respects.

 

All preservation, including paper and book preservation, is expensive.  Digital preservation as a whole may actually cost less than paper and book preservation.  While consumer formats may go out of fashion, formats very rarely become completely obsolete.  Recovery of information from old files can be incomplete.  Mass access to the internet has stabilized the formats.  Part of the key to this is to share the information.  This may be more of a problem with extended time frames.  “Investment in digital preservation is important for cultural, scientific, government and commercial bodies. Investments are justified by balancing cost against risk; they are about taking bets on the future. The priorities in those bets should be: first, to make sure that important digital objects are retained with integrity, second to ensure that there is adequate metadata to know what these objects are, and how they must be accessed, and only third to undertake digital preservation interventions.”
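
The first priority in that quote, retaining objects "with integrity," is commonly supported in practice by recording a checksum when an object is deposited and re-checking it later. The article does not prescribe any particular method; the sketch below is one minimal way to do it with SHA-256 from Python's standard library, and the function names are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the object's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_fixity(data: bytes, recorded_digest: str) -> bool:
    """True if the object still matches the digest recorded at deposit."""
    return sha256_of(data) == recorded_digest


if __name__ == "__main__":
    original = b"example digital object"
    digest_at_deposit = sha256_of(original)

    # Later audit: an unchanged copy passes, an altered copy fails.
    print(verify_fixity(original, digest_at_deposit))
    print(verify_fixity(original + b"!", digest_at_deposit))
```

Storing the digest alongside descriptive and technical metadata would also serve the quote's second priority, knowing what the objects are and how they must be accessed.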

 

It may not be necessary to look at digital preservation in hundreds or thousands of years.  What institutions have this timescale?  It may be more useful to look at digital preservation as a series of events or a relay.  Make your decisions on the timescale that you can see and that you have the funding for.  Preserve your objects to the best of your ability and hand them intact on to your successor.  The right approach may be to keep the original bits and then produce access copies as you can.  The high cost of accessing the original may be best given to the user who asks for it. 

 

A restatement of the original issues would be:

   1. Digital preservation is inexpensive compared to preservation in the print world,

   2. File formats become obsolete rather more slowly than we thought,

   3. Interventions can occur rather infrequently, keeping costs down,

   4. Digital preservation repositories should adjust their timescale to meet their funding and business case, but should be prepared for their succession,

   5. "Internet-age" expectations cannot be met by most digital repositories; and,

   6. Only access versions of the preserved object need be easily and instantly accessible, although the original file and good preservation metadata should be available

 

The lack of money is the biggest obstacle to effective preservation.  Poor decisions will reduce the amount of material that can be preserved. The right choice may be “fewer and better” or “cheaper and more”. 

 

 

---

 

Future-Proofing Web Sites.  Maureen Pennock.  Ariadne.  February 2006.

http://www.ariadne.ac.uk/issue46/dcc-fpw-rpt/

The DCC workshop goal was to provide insight about ensuring ongoing access to web sites over time. This is not just a matter of archiving, but also about how to design and manage a web site so that it is suitable for long-term preservation with minimum intervention.  In one presentation, the key to this was the three R's: Reduce, Replicate and Redirect. Reduce the items to make them easier to preserve, replicate them in multiple formats, and redirect links to the new locations.  This is more ‘future-improving’ than ‘future-proofing’.  There need to be selection criteria and guidelines to collect and preserve web sites as part of an organization’s wider preservation strategy.  Standards should preferably be applied at the point of creation rather than at a later time.  Persistent identifiers are important, but we should be looking at 15 – 20 years, not longer.  Metadata should document the technical dependencies and tools; this is more useful than just descriptive metadata. The method of selecting web sites must also be documented. 

 

Some records management principles require the documents to be saved but not necessarily the web site itself.  An organization can therefore determine what needs to be saved, and it may not have to be the entire site.  There should be a clear delineation of tasks and responsibilities.  The National Library of Australia introduced PANDAS 3, a software tool for managing the process of gathering, archiving, and publishing web site resources.  Authenticity is a key issue for web sites.  Preservation management must include three key aspects: passive preservation; active preservation; and managing multiple manifestations.  Permission should be obtained before archiving web sites.  The main issues were:

 

·         think about the records perspective;

·         reduce, replicate and redirect;

·         protect your domain;

·         be archive-friendly;

·         carry out 'not-bad practice';

·         experiment, and;

·         identify unhelpful practice.

 

---

 

Decision Tree for Selection of Digital Materials for Long-term Retention.  Deborah Woodyard-Robinson.  Digital Preservation Coalition.  March 8, 2006. 

http://www.dpconline.org/graphics/handbook/dec-tree.html

This is an updated version of a decision tree, which is a tool to construct or test a selection policy for an organization.  The questions and choices in the tree will assist with the decision to accept or reject long-term preservation responsibility.  An effective policy must also be:

·         Endorsed by senior management

·         Actively circulated throughout the organization

·         Reviewed regularly

·         Accompanied by an appropriate resource commitment


 

Friday, March 10, 2006

Weekly readings - 10 March 2006

University researchers develop new digital rights technology.  Jaikumar Vijayan.  Computerworld.  March 10, 2006.

http://www.computerworld.com/securitytopics/security/story/0,10801,109449,00.html?source=NLT_AM&nid=109449

Researchers at the University of Maryland have developed a new digital rights management technology to better protect multimedia content from unauthorized copying and distribution.  The technology embeds a unique ID or fingerprint on individual copies of multimedia content. It is designed to allow owners to trace the content, even if it is pieced together from multiple copies.  This can be applied to images, video, audio and other types of documents. 

 

 

 

Editors' Interview with Victoria Reich, Director, LOCKSS Program.  RLG DigiNews.  February 15, 2006. 

http://www.rlg.org/en/page.php?Page_ID=20894#article1

The LOCKSS (Lots of Copies Keep Stuff Safe) Program offers libraries a cost-effective and easy way to build digital collections of Web-based content. Digital information is extremely fragile and preservation must start from the moment it is put into circulation.  Components of LOCKSS include:

·   Replicate the content in independent repositories.

·   Audit the content, because digital content is fragile. If files are continuously compared and damage automatically repaired, off-line backup can be eliminated.

·   A hands-off approach. Minimal processing is needed.

·   Open source software is critical.

·   Allow no single points of failure. Strive for diversity in administration, funding, and technology.

·   Have extremely cost-effective processes.

The LOCKSS system can preserve content in any format available over the Web as long as it has a stable URL structure and changes at a moderate pace. The single greatest threat to materials being preserved over the long term is money.  Preservation must be accomplished at marginal expense to avoid the threat of scarce economic times.   The CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) Initiative is designed to test the feasibility of a large, community-managed dark archive.   Members are working towards a production system. 

 

“The fundamental goal of a digital preservation system is that the content stored in the system remains accessible to future readers for a time much longer than the lifetime of any individual component of the system.”  Digital preservation is concerned with long timeframes.  All information system components are unreliable in the long run. A fundamental design principle of a digital preservation system is that it must tolerate component failures.  LOCKSS has planned for format migration, obsolescence, scalability, and its own possible demise.  While future access can never be proven, LOCKSS can work to increase the odds that the content will be available in the future.
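
The idea of continuously comparing copies and repairing damage can be illustrated with a small Python sketch. This is not the actual LOCKSS polling protocol, which the interview does not describe in detail; it is a simplified majority vote over SHA-256 digests, and the function names and the three "library" replicas are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one replica's content."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(replicas: dict[str, bytes]) -> list[str]:
    """Compare all replicas and overwrite any that disagree with the majority.

    Returns the names of the replicas that were repaired.
    Raises ValueError if no digest is held by a strict majority.
    """
    digests = {name: digest(data) for name, data in replicas.items()}
    winner, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(replicas):
        raise ValueError("no majority; cannot decide which copy is good")
    # Take any copy that matches the majority digest as the repair source.
    good = next(data for name, data in replicas.items() if digests[name] == winner)
    repaired = [name for name, d in digests.items() if d != winner]
    for name in repaired:
        replicas[name] = good
    return repaired


# Hypothetical example: three libraries hold the same article, one copy is damaged.
replicas = {
    "library_a": b"article text",
    "library_b": b"article text",
    "library_c": b"article t\x00xt",
}
repaired = audit_and_repair(replicas)
```

The sketch tolerates the failure of a minority of copies, which is why independent replicas matter: if most copies were damaged in the same way, a vote like this could not tell which version is correct.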

 

 

 

Hitting the ground running: building New Zealand’s first publicly available institutional repository.  Nigel Stanger, Graham McGregor.  University of Otago.  09 March 2006.

http://eprints.otago.ac.nz/274/01/dp2006-07.pdf  

Institutional repositories are becoming more important.  This low cost and fully functional repository was started and went live in 10 days.  It has received a very large number of hits.  The repository was built with ePrints in three phases: technical implementation, content collection and administration.  They decided to restrict the pilot to voluntary contributions in PDF format from their business school.  This effort was publicized to the department heads to get early acceptance, and they quickly received materials, mostly departmental working or discussion papers which already had permission to publish online.  Items with uncertain copyright status were restricted until the status was confirmed. They found the SHERPA website valuable for copyright information.  They decided to follow Dublin Core Metadata to make it compatible with other projects.  They were able to establish the repository quickly because it was a proof of concept and not a large scale project that involved many disciplines or other people. Policy and procedural issues which needed institutional decisions were noted rather than addressed.  They used a minimalist approach to the effort, especially with gathering content. The site, which is on the internet, has had over 18,000 downloads.  The repository, which contains about 220 items, shows what can be done by a dedicated team.

 

 

 

OpenDoc Prescription a Bitter Pill for Microsoft in Massachusetts. Richard Entlich.  RLG DigiNews.  February 15, 2006. 

http://www.rlg.org/en/page.php?Page_ID=20894#article3

An in-depth discussion of Massachusetts requiring the OpenDocument format for new office applications.  The document specified that “executive branch agencies would be required to migrate office document software to applications able ‘to save office documents by default in the OpenDocument format’ by January 1, 2007 and that ‘any acquisition of new office applications must support the OpenDocument format natively.’ The only other acceptable format mentioned in the document was PDF.”  This informative article also provides a chronology of the events.  It is unknown where this effort will lead, but it has already had an impact as it shows others the need for open formats.  Digital preservation is a process of risk management.  To date, many preservation efforts have been reactive, but they need to become more proactive.  These actions may have unforeseen consequences, but without taking a chance we may never know what degree of change is possible.

 

 

 

Friday, March 03, 2006

Weekly readings - 03 March 2006

NEDCC Survey and Colloquium Explore Digitization and Digital Preservation Policies and Practices.  Tom Clareson.  RLG DigiNews.  Feb 15, 2006.

http://www.rlg.org/en/page.php?Page_ID=20894

The Northeast Document Conservation Center conducted an online survey to develop a way to assess institutions’ digital preservation readiness. While digitization efforts are increasing, there is a lack of policies to deal with these materials once they are created.  While a majority of institutions had policies dealing with goals, collection development, and emergency preparedness, few of them addressed the digital holdings specifically.  IT staff are key to the success of digitization projects.  A majority of institutions are involved in digital imaging projects, and over half provide online searching to the public.

·        29% of respondents had a policy on the creation of digital resources

·        63% said that 5% of their budget or less was devoted to any type of preservation activities.

·        9% had no funds whatsoever allocated for preservation activities.

·        31.1% did not have an IT department

·        92% had created digital assets from physical source materials, mostly from flat paper or images, also books, and AV

·        39% said the majority of the items they consider to belong to digital collections are unique, single-copy works

·        66.9% provide access to digital collections through an institutional website

·        83% had created descriptive metadata for the digital assets to help find and use digital collections

·        At least 50% also created technical and administrative metadata

·        25% do not assign any portion of their budget to create digital collections

·        42% do not have budget lines for acquiring digital collections

·        60%  do not have a specific person assigned responsibility/primary activity for digital preservation

·        84% supported staff development and professional education/training for digital preservation, but it does not seem to translate into policy development

·        30% of collections are not adequately protected by a backup strategy.

·        52.8% said they do not insure their digital holdings, while 36.5% did not know

·        29% of the institutions responding to the survey have policy, planning, or procedure documents on the creation of digital resources

 

Reported means of digital preservation included regular data backup, migration, and refreshing the data; maintaining legacy equipment and disks; outsourcing to an externally-managed repository; and emulation. Storage media include network hard drives (78%) and removable magnetic media (65%). Digital collections are most often stored in-house in systems managed by the institution.  A preservation study concluded that “small and medium-sized institutions will need the assistance of experts to assess the preservation status and needs of their expanding digital collections.”

 

 

 

ILM isn’t even clear to it’s OWN user community.  Larry Medina.  Computerworld.  February 17 2006.

http://cwflyris.computerworld.com/t/312105/1109184/9784/0/

Information Lifecycle Management (ILM) is still causing major confusion for many people because there is a question as to what it is. It is being promoted as something new.  Some vendors who push their version of ILM may use scare tactics and misinformation to say that the best way to accomplish this is to save everything.  While that may be easier right now, it is not the cheapest or best way.  The best way is to evaluate exactly what you need to keep and how long you need to retain it.  Use this knowledge to create a policy and train your organization to manage the information effectively.  It also requires a classification system, and the ability to assign record series to the information.  If the information has a long retention period, you will also need to choose appropriate formats and media.  This method will minimize the information being kept, which will mean shorter indexing times, faster searches, smaller repositories, quicker backups, lower risks during e-discovery, and a uniform method of managing the information.  Until the practices and responsibilities are defined, “throwing a ‘canned solution’ at an improperly analyzed problem is foolish.”
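
As a rough illustration of assigning record series and retention periods, the following Python sketch checks whether a record has passed its retention period. The series names, the retention periods, and the function name are all hypothetical; a real schedule would come from the organization's own policy and legal requirements.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention schedule: record series -> years to retain.
RETENTION_YEARS = {"invoice": 7, "meeting_notes": 2, "contract": 10}


@dataclass
class Record:
    name: str
    series: str  # the record series assigned by the classification system
    created: date


def is_due_for_disposal(record: Record, today: date) -> bool:
    """Return True once the record's retention period has fully elapsed."""
    years = RETENTION_YEARS[record.series]
    try:
        expiry = record.created.replace(year=record.created.year + years)
    except ValueError:
        # Created on 29 February and the expiry year is not a leap year.
        expiry = record.created.replace(year=record.created.year + years, day=28)
    return today >= expiry


today = date(2006, 2, 17)
invoice = Record("Q1 invoice", "invoice", date(2000, 1, 15))
notes = Record("Staff meeting", "meeting_notes", date(2003, 5, 1))

keep_invoice = not is_due_for_disposal(invoice, today)  # retained until 2007
drop_notes = is_due_for_disposal(notes, today)          # expired in 2005
```

Keeping the schedule in one table, separate from the records, means a policy change only has to be made in one place.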

 

 

 

Microsoft hit with fresh charges over Office, future products.  Simon Taylor.  Computerworld.  February 2, 2006. 

http://www.computerworld.com/governmenttopics/government/legalissues/story/0,10801,108888,00.html?source=NLT_PM&nid=108888

Some of Microsoft’s rivals have charged that it is shutting out competitors.  One of the issues raised is Microsoft's refusal to disclose interoperability information for its Office suite. It is refusing to provide data such as the file formats for .doc, .xls, and .ppt documents, which prevents rival application suites from achieving full compatibility.  This has crucial implications for Linux systems. 

 

 

 

Open Document Format (ODF)... let the discussion begin!  Larry Medina.  Computerworld.   March 3, 2006. 

http://www.computerworld.com/blogs/node/1914?NLT_DM_B

In commenting on the Open Document Format discussion, the author cites the Wall Street Journal article: “The data belongs to the people, not to the software vendor that created the file format.”  The most important observation is that a standard controlled by a single company is not a standard. 

 

 

 

A Microsoft Document Fight Brews; IBM, Sun Join New Group Promoting Common Format For Government Records.  Wall Street Journal.  March 3, 2006.

http://online.wsj.com/article/SB114135713113288409.html?mod=opinion_main_com

An alliance of software vendors plans to promote use of the OpenDocument Format, a set of software technologies for storing and creating documents.  The Open Document Alliance includes IBM, Sun, Oracle, ALA, and 30 others.  This heightens the debate over whether governments should adopt software that supports the OpenDocument Format.   Microsoft doesn’t support the format.  Backers argue that the format is more trustworthy for storing documents because it isn't owned by a single company.  Microsoft is using a new format called OpenXML in its Office 12 software, which is also supported by Apple and Intel.  This is a result of the actions in Massachusetts last year when the state's information-technology division decided to standardize its programs on the OpenDocument Format.