Monday, November 21, 2005

Microsoft to open Office document format

News Story by Simon Taylor and Elizabeth Montalbano
November 21, 2005

"Microsoft Corp. today said it will offer its Word, Excel and PowerPoint document formats as open standards, a move that could spark a war with technology rivals over standard document formats.

Microsoft said it would submit its Office Open XML document format technology to the International Standards Organization (ISO) to be adopted as an international standard in time for the launch of the next version of its Office software suite, code-named Office 12."

Friday, November 04, 2005

Preservation Readings 4 November 2005

Microsoft Launches Book Digitization Project—MSN Book Search. Barbara Quint. Information Today. October 31, 2005.

Microsoft has begun a book digitization project. Initially it will focus on public domain books and rely on the Internet Archive for the digitization. Microsoft plans to expand the content to include academic materials, periodicals, and other resources. They plan to digitize 150,000 books, which would be available in 2006 as part of the MSN Book Search. Eventually, Microsoft plans to work with copyright owners to legally scan materials. They also plan to create a way for a publisher to add content into its system.

Open formats make history - and maintain it. Gervase Markham. Times Online. October 18, 2005.

“Open data formats will be the key to safeguarding tomorrow's historical documents.” Open formats are those that are “made available without restriction.” Closed formats are those that have patents or licensing restrictions, or are undocumented. One problem with closed formats is use restrictions. Another is obsolescence. Open formats give people full control of their data. The Massachusetts decision to require open formats will be seen as the turning point for open formats.

Commission unveils plans for European digital libraries. European Community Press Release. 30 September 2005.

“The European Commission today unveiled its strategy to make Europe’s written and audiovisual heritage available on the Internet.” This will not be an easy task. There is a large volume of materials, in many languages, and of many different types. The three key areas for action are: digitization, online accessibility and digital preservation. These will include an online survey of digitizing and digital preservation issues, collaboration among the members, and research on access and digital preservation.

Open Content Alliance Rises to the Challenge of Google Print. Barbara Quint. Information Today. October 3, 2005.

Google’s efforts to digitize books have spurred others to create their own initiatives. The Open Content Alliance (OCA) has just been announced. This group intends to create an international network of information partners to bring digitized materials to the web. The founding members include the Internet Archive; Yahoo; Hewlett-Packard; Adobe; the University of California; the University of Toronto; the European Archive; the UK National Archives; O’Reilly Media, Inc.; and Prelinger Archives. The Internet Archive will host the permanent repository. The principle is that all content will be open to all search engines. Content will be in PDF and other accepted formats. They hope to work with libraries and commercial sources and resolve legal or other issues. The alliance will only add information that is either public or for which it has approval. The content will be more “library-like, as opposed to an archive.” They do not claim to have all the answers, but this provides a place where the parties can work together to find solutions. Digitizing is sharing with others. Main website is at:

Open Content Alliance Expands Rapidly; Reveals Operational Details. Barbara Quint. Information Today. October 31, 2005.

The Open Content Alliance has added many new members to the Open Library project, including universities, Microsoft, RLG, and others. These members have committed to donating services, facilities, tools, and/or funding. RLG will be contributing bibliographic metadata. The technology to create this has existed for years, but “the money, the labor, and the legal problems are the touchy part.” The interface models a book with page-turning software. They hope publishers will realize that “proprietary control over content is an impediment to commerce.”

Friday, October 28, 2005

Preservation Readings 28 October 2005

Japanese holographic storage firm to ship 200GB drives in '06. Lucas Mearian. Computerworld. October 24, 2005.

Optware Corporation is planning to ship three versions of its product by the end of next year, with capacities of up to 200 GB. They expect to release a 1 TB disk by the end of 2008. A holographic disk can store more information by storing data inside the disk as well as on the surface. The cost of the disk is less than that of a hard drive. “Both Optware and InPhase are targeting their initial products at the data archival market because their holographic disk technology is removable and can be kept for decades without deterioration of data, which is stored within the disk and not on the surface.” Optware also plans to release a holographic disk product for streaming video, and a consumer disk about the size of a credit card that can hold 30 GB.

Companies hope to extend open-source movement to data storage. Brian Bergstein. Detroit News Technology. October 25, 2005.

IBM is leading a group of companies that hope to extend the open-source movement to data storage. Generally each storage system has its own management software. The group hopes to develop new data-management software that would be open source and allow data to be moved seamlessly within an organization.

Cheap DLT on the way. Martin MC Brown. Computerworld. October 17, 2005.

A 1.6TB SuperDLT tape is still under development but should appear by the end of this year. The gap between successive tape generations will increase.

Cheap DLT pitched to outpace DAT. Bryan Betts. The Register. 17 October 2005.

Hard drives have been increasing in size quite rapidly, but the tapes for backup haven’t, so more tapes are needed to back up a drive, which means slower and more expensive processes. Quantum has announced a 320 GB tape (with compression).

Robots and sensors are in IT's future, Gartner says. Patrick Thibodeau. Computerworld. October 20, 2005.

Hewlett-Packard has discussed "lights-out" or "humanless" data centers at a recent conference. They believe that management technologies will lead to fully automated data centers in the years ahead. The Gartner research group has predicted that as many as half of all hands-on data center jobs may disappear over the next two decades because of automation. But the concept of a fully automated data center is still hard for some IT managers to accept.

Science and technology based institu[t]e in Chennai stress on digital libraries. Digital Opportunity Channel. October 24, 2005.

"Digital libraries are the only solution to the problem of pages disappearing from library books." The paper Preservation of Electronic Theses and Dissertations: A case study of SRM Institute of Science and Technology was presented at 8th International Symposium of Electronic Theses and Dissertations.

The full paper is at

Friday, October 21, 2005

Preservation Readings 21 October 2005

Urgent Action Needed to Preserve Scholarly Electronic Journals. Donald J. Waters. Association of Research Libraries. October 15, 2005

Digital preservation is a major challenge facing higher education. Yet organizations have been slow to invest in the infrastructure to maintain electronic journals and files over the long-term. The industry is shifting to electronic resources and print resources are being scaled back or canceled. Because licensed journals are being used, there is no local copy that is being retained. Four actions are essential:

1. Preservation of electronic journals is a kind of insurance, and not just access.

2. Qualified preservation archives should provide a well-defined minimal set of services.

3. Libraries must invest in a qualified archiving solution.

4. Libraries must demand archival deposit by publishers as a condition of licensing electronic journals.

The publishers' archiving methods must be described publicly.

Gates cheers on computer museum. BBC News. 17 October 2005.

Bill Gates has pledged $15 million to the Computer History Museum in California. The museum displays the history of computing as well as its impact. The museum currently houses a collection of more than 4,000 artifacts, 10,000 images, 4,000 linear feet of catalogued documentation and many gigabytes of software. "It's our responsibility to collect the artefacts and stories today that will explain this incredible change to future generations."

Caring for your collections: Cylinder, Disc and Tape Care in a Nutshell. Library of Congress. 7 October 2005.

This is part of the Library of Congress preservation web site. It contains very good information on topics such as handling, storage, packaging, equipment, and supply sources. This is more for analog materials. It does have some information on tapes and cassettes. The second site gives information on the principles and specifications for preservation digital reformatting. Some of the principles include:

· Retain an analog version of digitally-reformatted items until you are confident that the life-cycle management of digital data will ensure access for as long as, or longer than, the analog version.

· Minimize handling of originals in the digital reformatting work to assure the best digital capture of an undamaged original, as well as the longevity of the original item

· Ensure that the digital master file will allow a broad range of future use

· Capture the highest quality digital image technically possible and economically feasible for large-scale production, while optimizing the potential for longevity

· Archive a digital master file that is free of, or minimizes, artifacts introduced by the reformatting process, whenever possible

· Employ standards and best practices for structural, administrative, and descriptive metadata that will optimize interoperability

· Document digital master file contents with MD5 checksums (or a similar tool) and use them to ensure the data integrity of master files through back-up and migration
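The checksum principle in the last bullet can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the Library of Congress procedure: the function names (`md5_of_file`, `build_manifest`, `verify_manifest`) and the idea of keeping a manifest as a simple dictionary are hypothetical. It records an MD5 digest for every file in a directory of master files, then re-checks those digests after a backup or migration.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its MD5 digest."""
    root = Path(directory)
    return {
        str(p.relative_to(root)): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(directory, manifest):
    """Return the relative paths that are missing or whose digest changed."""
    root = Path(directory)
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or md5_of_file(target) != expected:
            problems.append(rel_path)
    return problems
```

A manifest built before a migration can be passed to `verify_manifest` against the migrated copy; an empty list means every master file still matches. The principle allows "a similar tool", so another `hashlib` algorithm such as `hashlib.sha256` could be swapped in the same way.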

Video Format Identification Guide. Website. 2005.

This is a useful site to help archivists, librarians, curators and conservators identify the videotapes in their collections. The site has formats broken down by time period: 1956-1970; 1970-1985; 1985 to present. Each format type has an image and brief information about it, as well as an obsolescence designation: Extinct; Critically endangered; Endangered; Threatened; Vulnerable; or Lower risk. The site also contains an explanation of video terms.

Digital Preservation Topics in Google Groups. October 2005.

An interesting discussion of many topics dealing with audio preservation. It includes information ranging from software for recording records and cassettes, to equipment, to the format and media challenges of preserving CDs and other digital files.

Friday, October 14, 2005

Preservation Readings 14 October 2005

The Future of E-Mail Archiving. Jennifer LeClaire. TechNewsWorld. October 13, 2005.

Recent high-profile scandals illustrate the importance of email and the consequences of misuse. Recently 20% of employers have had email records subpoenaed, and 13% have fought lawsuits that were triggered by email. Email and other records are the “electronic equivalent of DNA evidence." Email archiving is growing rapidly and there is a great demand to have system administrators address the volume of email sent and stored. Email archiving must consider policies and the decision points that turn into policy. They want to make decisions based on long-term objectives and how the policies will fit into the operational model, “which includes policies for backup, restoring, disaster recovery, business continuity, security, flexibility and scalability.” "Archiving is a new concept, [!] and its growth has been fueled by new technologies that assist IT users in implementing this valuable strategy." They are looking to improve the intelligence of the archiving and retention functions, and to find ways to use the information effectively.

Ground Broken for New Church History Library. Press Release. 7 October 2005.

A new Church History Library is being built in downtown Salt Lake City by the Church of Jesus Christ of Latter-day Saints. The library will incorporate updated technology and will significantly increase archival storage capacity to preserve various types of materials, including print materials, manuscripts, photographs, microfilm, audiovisual items and others. They have consulted with international experts in records preservation and archival design to ensure it has the best lighting, humidity and temperature controls, as well as fire and seismic protection.

Archivists are already addressing the issues of transitioning to handling the digital materials. “Documents that are digitized and made available online are handled less frequently, extending the life of the original document. Creating digital documents isn't without challenges. Every 10 years advancing technology dictates that digitized documents be moved to a more current electronic medium.”

Holograph? Schmolograph... Larry Medina. Computerworld. October 4, 2005.

The author raises concerns about holographic storage and its permanence. So far there has been no information about the permanence of holographic storage. Is there any information about accelerated aging tests? The Norsam technology was long term and stable; it may be time to look at this technology again. The point is that there is no standard for these new technologies.

More Eggs in One Basket: Will Blu-ray and HD-DVD Be Archival? D.W. Leitner. Video Systems. Oct 13, 2005.

There has been a lot of interest in the longevity of CDs and DVDs and their suitability for archiving. How do HD DVD and Blu-ray fit into this? Both have higher density than DVDs. HD DVD uses the same construction as DVDs, with the data layer between polycarbonate layers. With Blu-ray, the data layer is on the disc surface closest to the laser, with only a 0.1 mm protective coating, avoiding reading through thicker layers, which could cause optical distortion of the laser. With CDs, the data layer is near the surface on the side opposite the laser. Disc scratches would be a concern to archivists, but professional versions would have a cartridge for protection (which means two different Blu-ray drives). The higher data density of these discs is a concern if one goes bad. And the holographic discs are even higher density than Blu-ray.

Thursday, October 13, 2005

Ingest Guide for University Electronic Records

"One of the key challenges to preserving electronic records in a meaningful way is preserving the authenticity and integrity of records during their movement from a recordkeeping system to a preservation system. This Ingest Guide describes the actions needed for a trustworthy ingest process. This process enables an Archive and Producer to move records from a recordkeeping system to a preservation system in a manner that allows a presumption of authenticity."

ProQuest Creates Digital Archive of British Periodicals

"ProQuest Information and Learning will digitize nearly 6 million pages of British periodicals from the seventeenth, eighteenth, nineteenth and early twentieth centuries, creating direct access for humanities scholars to the breadth of texts that captured both daily life and landmark thought of the time."

The Future of E-Mail Archiving

"E-mail archiving as an industry is growing rapidly and it is interesting to examine the underlying trends driving the growth," Nick Mehta, senior director of product development for Symantec, told TechNewsWorld.

"With regard to storage, Geis said companies want to make technology decisions based on what is going to last for the long haul. It is not just a simple matter of technology, he said, but how it will fit into the company's operational model, which includes policies for backup, restoring, disaster recovery, business continuity, security, flexibility and scalability."

Researchers to develop China-only version of HD-DVD

SEPTEMBER 20, 2005 (IDG NEWS SERVICE) - BEIJING -- In a bid to cut costs for local electronics makers, Chinese researchers plan to develop a version of the next-generation HD-DVD optical disc format specifically for China that will include support for a locally developed video compression technology, called AVS (Audio Video Coding Standard), according to a researcher involved with the project.

Automated Video Tape Preservation

LIBRARY OF CONGRESS SELECTS AUTOMATED VIDEOTAPE PRESERVATION AND DIGITIZATION SYSTEM FOR AUDIO-VIDEO PROJECT

SAMMA to Migrate Library’s Audio-Visual Collection

Washington DC – October 12, 2005 – The Library of Congress has contracted to purchase the System for the Automated Migration of Media Archives, or SAMMA, to migrate its massive collection of audio-visual material in preparation for its move to the National Audio-Visual Conservation Center in Culpeper, VA. Over the next several years, the Library will use SAMMA to migrate and digitize many of the hundreds of thousands of recordings in its collection.

The Library realized that it would take many decades and be prohibitively expensive to migrate and digitize the audio-visual collections manually. To have the material available at the Culpeper facility when it opens in 2007, a more practical, cost-effective, and efficient method had to be found. In examining the alternatives, the Library concluded that Media Matters’ innovative migration automation system would provide the high quality necessary to preserve the recordings, while meeting the required cost and time restraints.

SAMMA combines robotic tape handling systems with proprietary tape cleaning and signal analysis technologies. SAMMA’s expert system automatically supervises quality control of each media item’s migration. From a thorough examination of the physical tape for damage, to real-time monitoring of video and audio signal parameters during migration, SAMMA ensures that magnetic media is migrated with the highest degree of confidence and the least amount of human intervention. SAMMA uses specially-designed components to gather technical metadata throughout the entire migration process, ensuring that the process is documented in depth while gathering important metrics about the health of an entire collection. The modular, portable system will be installed on-site at the Library and run 24/7. The final product will be a lossless compressed Motion JPEG 2000 digital file copy of each master tape at preservation quality, and the technical metadata describing the condition of the media item and the migration process.

Upon completion, the National Audio-Visual Conservation Center of the Library of Congress will be the first centralized facility in America especially planned and designed for the acquisition, cataloging, storage and preservation of the nation’s heritage collections of moving images and recorded sounds. It is expected to be the largest facility of its kind in the world. The NAVCC, funded by the Packard Humanities Institute and the U.S. Congress, will open fully in 2007.

About Media Matters LLC

Media Matters LLC has extensive expertise with magnetic media migration, and is dedicated to taking traditional migration strategies into the 21st century by researching, developing, and deploying cutting-edge digital media technology. Media Matters recently completed the development of hardware for creating real-time Motion JPEG 2000 files with synchronized uncompressed audio files. As the exclusive American partner in the EU’s PrestoSpace consortium, and through involvement in other international organizations, Media Matters is developing next-generation processes and standards for automated media migration.

For further information:

Contact Steve Kwartek at

Phone: 212-268-5528 X113

Monday, October 10, 2005

Digitizing Old Photos

Short but interesting article. Here is a good quote:

"Once you have your photographs digitized, make extra copies of those files on quality CDs and/or DVDs. Then store the extra copies in a safe deposit box, with family and/or with friends. Compared to the cost of losing those precious memories, the cost of digitizing them and making extra discs for storage is a small price to pay."

Click on the title above or use this link:

Friday, October 07, 2005

Preservation Readings 7 October 2005

New ISO standard will ensure long life for PDF documents. ISO Press Release. 7 October 2005. [Updated link, August 7, 2015.]
The PDF and archival PDF file formats have been approved as ISO standards. The standard “enables organizations to archive documents electronically in a way that will ensure the preservation of content and visual appearance over an extended period of time. It also allows documents to be retrieved and rendered with a consistent and predictable result in the future, independent of the tools and systems used for creating, storing and rendering the files.” This will have a significant impact on the digital preservation community. It will allow documents to be delivered in a standard way for a long time. "PDF/A files will be more self-contained, self-describing, device-independent than generic PDF 1.4 files, and should allow information to be retained longer as PDF." It is estimated that over 9% of the surface web consists of PDF documents. The current standard is ISO 19005, Document management – Electronic document file format for long-term preservation – Part 1, Use of PDF 1.4 (PDF/A-1). Future updates will provide compatibility with additional changes to the PDF specification, but will still standards and applications. An announcement is also available from AIIM and NPES, The Association for Suppliers of Printing, Publishing and Converting Technologies.

Digital History: A Guide to Gathering, Preserving, And Presenting the Past on the Web. Daniel J. Cohen, Roy Rosenzweig. University of Pennsylvania Press. 2005.
This website has a free online version of the book. It looks at the qualities of digital media and networks that potentially allow us to do things better: capacity, accessibility, flexibility, diversity, manipulability, interactivity, and hypertextuality, as well as the hazards of quality, durability, readability, passivity, and inaccessibility. “One vision of the digital future involves the preservation of everything—the dream of the complete historical record. The current reality, however, is closer to the reverse of that—we are rapidly losing the digital present that is being created because no one has worked out a means of preserving it.”
One chapter specifically deals with digital preservation, with the fragility of digital materials, technical considerations, websites, selection of materials, and the future of digital materials. Future preservation should be a part of the planning of any digital project. Readers may now understand that digital preservation may require as much work or more than preserving paper. Any web project requiring a great deal of time to produce also needs a great deal of time to preserve. “It would be a shame to ‘print’ your website on the digital equivalent of the acidic paper.”
The Library of Congress estimates that possibly as much as 10% of its disc collection already contains serious data errors. “No acceptable methods exist today to preserve complex digital objects that contain combinations of text, data, images, audio, and video and that require specific software applications for reuse.”
Archivists who have studied the problem of constant technological change have realized that “the ultimate solution to digital preservation will come less from specific hardware and software than from methods and procedures related to the continual stewardship of these resources.” The book talks about various technologies and software, such as DSpace and Fedora. “Because digital copies are so cheap, it does not hurt to have copies of digital documents and images in a variety of formats; if you are lucky, one or more will be readable in the distant future.” Backing up files is not preservation. Preservation also involves dealing with technological change. Digitization is not preservation, because currently digital copies cannot be perfect copies of analog materials. But digitization may be the best solution in some cases. “Digital preservation is here to stay.” It is not the total answer, but it is another tool to use. “For now, you are the best preserver of your own materials.” Back up your work and create good documentation.

Microsoft says Office beta coming in November. Ina Fried. CNET October 3, 2005.
Microsoft to support PDF in Office 12. Martin LaMonica. CNET October 3, 2005.
Microsoft has been under pressure to provide open formats for Office. It has announced that the next version of Office (version 12, due in the second half of 2006) will provide support for the PDF format: it will let users convert an Office document to PDF, but PDF files are not readable within Office applications. The Microsoft XML-based document format will be the default setting. Office 12 does not support OpenDocument. Windows Vista will have a format, called Metro, that will offer features similar to PDF. Microsoft has said that it has been getting 120,000 requests a month for PDF support. Office currently supports the RTF and HTML formats.

Seagate exec: Hard disks anything but obsolete. Martyn Williams. Computerworld. October 5, 2005.
“Hard disk drive technology is anything but dead and isn't in danger of being replaced by memory chips anytime soon,” said a Seagate executive in response to a Samsung announcement. This may be more of a reflection of the battle for the storage market.

A new way to stop digital decay. The Economist. September 15, 2005.
The digital documents of today face a serious threat: the threat of disappearing. Even simple files may not be readable in the future if the software or hardware needed to read them is obsolete. One strategy is to migrate copies to new hardware and software, but that may be difficult and may introduce problems of its own. The National Library of the Netherlands is exploring the possibility of a Universal Virtual Computer that is being developed by IBM. It will have the ability to run programs that can read different file formats. In the future, libraries will have to write software that emulates the virtual computer on each new generation of computer systems. But when that is done, the programs will be able to read the documents using the decoding programs that can be written and tested today, while the format is still readable. Decoding programs have been written for JPEG and GIF, and the PDF format will be added.

Descriptive metadata for copyright status. Karen Coyle. First Monday. 3 October 2005.
One of the main characteristics of digital materials is that they can be reproduced easily. This has caused a near crisis in terms of intellectual property rights because of the networked world. Two approaches to resolve the problem have been to 1) change the copyright law, and 2) protect the digital format. This paper tries to define the metadata needed to provide the copyright information that determines the use of the item. The metadata must be able to capture copyright status and to assert what copyright information is unknown. It must also be able to provide contact information for those who need more information. Currently there is little copyright information in a MARC record. Besides typical information, the metadata needs elements to show the copyright information taken from the piece only, and any additional research undertaken to determine the copyright if it is unknown. Adding copyright information is a burden for those who create the metadata; the lack of information, though, creates an even larger burden for those who would like to use the material. “Copyright–related metadata, therefore, should be seen as an essential component of the resource description.” This should be kept with the work itself.
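To make the requirements in this summary concrete, here is a minimal Python sketch of what such a record might look like. It is an assumption for illustration only: the class and field names (`CopyrightMetadata`, `statement_on_piece`, `research_notes`, `contact`) are hypothetical and are not taken from the paper or from MARC. The sketch covers the needs listed above: a status that can be explicitly unknown, information taken from the piece itself, notes from further research, and a contact for those who need more.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CopyrightStatus(Enum):
    """Hypothetical status values; 'unknown' is asserted, not left blank."""
    COPYRIGHTED = "copyrighted"
    PUBLIC_DOMAIN = "public domain"
    UNKNOWN = "unknown"


@dataclass
class CopyrightMetadata:
    # Overall status; defaults to an explicit assertion that it is unknown.
    status: CopyrightStatus = CopyrightStatus.UNKNOWN
    # Copyright information transcribed from the piece only.
    statement_on_piece: Optional[str] = None
    # Notes on additional research done to determine the status.
    research_notes: List[str] = field(default_factory=list)
    # Where to ask for more information or permission.
    contact: Optional[str] = None

    def unknown_elements(self):
        """List the elements that are unknown or not yet recorded."""
        unknown = []
        if self.status is CopyrightStatus.UNKNOWN:
            unknown.append("status")
        if self.statement_on_piece is None:
            unknown.append("statement_on_piece")
        if self.contact is None:
            unknown.append("contact")
        return unknown
```

A freshly created `CopyrightMetadata()` reports all three elements as unknown, which is the point of the design: a cataloguer can record that the status has not been determined instead of leaving the user to guess.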

Yahoo Works With 2 Academic Libraries and Other Archives on Project to Digitize Collections. Scott Carlson, Jeffrey Young. The Chronicle of Higher Education. October 3, 2005.
Yahoo will be working with a number of partners to digitize millions of volumes. These include the University of California, the University of Toronto, the Internet Archive, Adobe, the European Archive, the National Archives of England, O'Reilly Media, and Hewlett Packard Labs. The project will not include copyrighted books, unless they have permission. The texts will be available to be searched by other search engines as well as Yahoo. The project is modeled on open source software projects. The Internet Archive has been working on a pilot project with the University of Toronto for about a year. So far, about 2,000 books have been scanned.

Friday, September 30, 2005

Preservation Matters

This blog has been created to post information about preservation of materials, particularly digital materials. Anyone is welcome to post preservation matters to it.

Much of the information that I post here comes from internet resources that I read and summarize. Other comments and resources are welcomed.

The name "Preservation Matters" comes from a newsletter that I publish.

Sept 30, 2005

The Value Proposition in Institutional Repositories. Erv Blythe and Vinod Chachra. EDUCAUSE Review. September 2005.

Institutional repositories must have institutional organization, coordination, and investment. However, they will be successful only when the individuals in the community participate. Institutional repositories are a “managed collection of digital objects, institutional in scope, with consistent data and metadata structures for similar objects, enabling resource discovery.” They need to allow reading, uploading, exporting, and resource sharing. The repository focuses on developing, enhancing, and protecting the value in the creative output of the members of the sponsoring institution. They need to have a broad scope. There needs to be a critical mass to be successful. They are valuable to an institution by housing the items together, allowing interconnections, archiving and preservation. The value to individuals is through sharing resources for research and teaching. But the value is also a function of low cost, as measured by their time and effort.

Glossary: Image Terminology and Acronyms. TASI. September 28, 2005.

TASI has just released an updated glossary of image terminology and acronyms. It also includes technical terms and acronyms related to digital imaging, which provide explanations and additional links if needed. Some examples are:

Archival Image A digital image taken at the highest practicable resolution and stored securely

Digital Preservation The Arts and Humanities Data Service (AHDS) describes digital preservation as "the preservation of digital materials and to the preservation of paper based materials and other artefacts through their digitisation"

Image Archive Collection of images kept in secure storage

Artefacts A term used to denote unwanted blemishes, which may have been introduced to an image by electrical noise during scanning or during compression

Oldies, Music Rights, and the Digital Age. Peter McDonald. EDUCAUSE Review. September 2005.

The recording industry looks at current sales yet looks to national sound archives to preserve the music for the future. “At almost every turn, the industry has stymied the legitimate efforts of recorded sound archives to provide digital preservation of and access to their vast collections of “oldies” (recordings from 1890 to the 1950s).” Archives can let users listen to recordings, and in certain conditions, provide fair use “copies” on an item-by-item basis. But the archives are prohibited from creating digital repositories of commercial audio files. The archives are all about access, whereas the recording industry is all about revenue. The two groups need to find a way to work together.

Toshiba First in World to Develop Notebook PC with HD DVD-ROM Drive. Press Release. 27 September, 2005.

Toshiba has introduced the world’s first notebook PC with a slim-type HD DVD-ROM drive, bringing high-definition imaging to a laptop. It will be commercially available next year. The height of the drive is less than 13 mm. It has a single optical lens that can read HD DVD discs and read and write to standard DVD and CD. It also comes with a high resolution LCD display.

StoneD: A Bridge between Greenstone and DSpace. Ian H. Witten, et al. D-Lib Magazine. September 2005.

This article compares Greenstone with DSpace, covering their similarities and differences. The authors present StoneD, a bridge between the two systems that allows data to migrate between them or be used in combination. The two systems have different goals and strengths, though they can both build digital collections. Greenstone is primarily for building and distributing collections, mostly on the web. DSpace is for self-deposit institutional repositories and preservation of information. StoneD allows data to be imported and exported between the two systems to take advantage of the strengths of each.

USB Flash Drives - You CAN Take It with You. Imation website. 2005.

· Flash memory has a write endurance limit. This limit is the number of times the flash memory cell can be written until it cannot be restored to its initial condition. The industry refers to this as the erase cycles. The endurance is rated between 10,000 and 100,000 erase cycles for different types of flash memory.

Flash SSDs - Inferior Technology or Closet Superstar? Kelly Cash. BiTMICRO Networks. 2005.

· … flash memory chips have a limited lifespan. Further, different flash chips have a different number of write cycles before errors start to occur. Flash chips with 300,000 write cycles are common, and currently the best flash chips are rated at 1,000,000 write cycles per block (with 8,000 blocks per chip). Now, just because a flash chip has a given write cycle rating, it doesn't mean that the chip will self-destruct as soon as that threshold is reached. It means that a flash chip with a 1 million Erase/Write endurance threshold limit will have only 0.02 percent of the sample population turn into a bad block when the write threshold is reached for that block.

· With usage patterns of writing gigabytes per day, each flash-based SSD [solid-state disks] should last hundreds of years, depending on capacity.
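[The "hundreds of years" figure can be checked with simple arithmetic. The sketch below is only a rough illustration: it assumes ideal wear leveling (writes spread evenly over every block) and no write amplification, and the 16 GB capacity and 10 GB per day workload are hypothetical values chosen for the example, not figures from either article.]

```python
def ssd_lifetime_years(capacity_gb, erase_cycles, gb_written_per_day):
    """Estimate years until every block reaches its rated erase-cycle limit.

    Simplifying assumptions (not from the articles): writes are spread
    evenly across all blocks, and each GB written by the host costs
    exactly one GB of flash writes.
    """
    if gb_written_per_day <= 0:
        raise ValueError("gb_written_per_day must be positive")
    total_writable_gb = capacity_gb * erase_cycles  # lifetime write budget
    days = total_writable_gb / gb_written_per_day
    return days / 365.25


if __name__ == "__main__":
    # Hypothetical drive: 16 GB, rated at 100,000 cycles, written at 10 GB/day.
    years = ssd_lifetime_years(capacity_gb=16, erase_cycles=100_000,
                               gb_written_per_day=10)
    print(f"Estimated lifetime: {years:.0f} years")
```

Under these assumptions the estimate scales linearly with capacity and with the cycle rating, which is why the quoted lifespan is said to depend on capacity.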

Friday, September 23, 2005

Sept 23, 2005

The digital Dark Age. The Sydney Morning Herald. September 23, 2005.

Article about digital preservation, the possibility of losing digital information, and the digital dark age. The computer is the most dramatic record keeping system since the invention of printing. There is concern that the obsolescence of hardware and software systems may make digital information unreadable. Emulators may help read the information. The State Records Authority of New South Wales, Australia, has created a strategy for preserving the information, called Future Proof. It includes conservation, conversion, and migration of information, plus retaining the original equipment (which is a major effort with considerable problems). The answer may lie in a combination of solutions, including keeping a hard copy of the information.

More information about Future Proof is available at their website:

The ten strategies are:

1. Take a planned approach

2. Build partnerships

3. Build recordkeeping systems

4. Use recordkeeping metadata

5. Move records through new formats, media & systems

6. Manage the media

7. Use technical standards

8. Practice data management

9. Retain equipment/technology

10. Use viewer/player technology

Copy Your Digital Photos Onto Film. Mark Goldstein. PhotographyBLOG. September 19, 2005.

Press release about a laboratory that copies digital images to film. Copying to the latest standards is difficult, and current media may not be readable in the future because of hardware or software problems. With this system, “the picture is systematically reproduced in colour and resolution to the analogue image.” The capability of the recorder is 11 million pixels; the customer can send their images to the lab for copying.

[The blog responses at the end are interesting to read as they discuss digital preservation.]

Toshiba Develops 30Gb Dual-Layer HD DVD-R Discs. September 21, 2005.

Toshiba announced a 30GB dual-layer HD DVD-R (recordable) disc which extends the capacity of optical discs. The disc is based on the same structure as current DVDs, with bonding of two layers, organic dye, and a spin-coating process for spreading the dye on the discs. The manufacturers will start tests next month to verify the disc compatibility. They hope to finalize the specifications by year end.

Too Much ETL Signals Poor Data Management. Ken Karacsony. Computerworld. September 5, 2005.

When a system uses extensive extract, transform and load (ETL) processes, it is a symptom of poorly managed data and a poorly developed data strategy. IT staff are maintaining more databases that recreate or move data between systems. Much of this is redundant. The best way is to create a single, sharable database for each major area and design the database to meet the needs of its users. Since information is an organizational asset, it doesn’t belong to just one group or department. So databases must be designed for both the producers and consumers of the data. The entire organization must be involved in defining the relationships and attributes. “The database, and not the application, is the center of the universe.”

Friday, September 16, 2005

Sept 16, 2005

Technology Watch Report: Preservation Metadata. Brian Lavoie. Digital Preservation Coalition. September 2005.

Preservation metadata is the information that supports and documents the long-term preservation of digital materials, especially:

· Provenance: Origin and history, and chain of custody

· Authenticity: The document is what it is supposed to be and has not been altered

· Preservation activity: What was done to preserve the item and what the effects were

· Technical environment: The hardware or software to read the document

· Rights management: Any limitations on preserving or accessing the materials

It makes the archive self-documenting. The metadata will accumulate over time. Automated tools are needed for preservation metadata to keep costs from rising to prohibitive levels. We must be able to distinguish preservation metadata from other types. “Preservation metadata is descriptive, structural, and administrative metadata that supports the long-term preservation of digital materials.” Preservation metadata is important because digital items are technology dependent, they are easily altered, and they are bound by intellectual property rights. There is often a brief window of opportunity in which to act. Digital preservation activities often aim to avert damage before it happens, rather than repair it later. It is difficult to anticipate what metadata will be needed over time. Preservation metadata requires that we “get it right” the first time.

A preservation metadata schema must be comprehensive, oriented toward implementation, and interoperable. Metadata plays an important role in preserving content long term and using it. The OAIS model is based on the information packet and establishes preservation metadata. PREMIS helps relate the theory and practice of preservation metadata. METS, a metadata standard, is an XML based structure that can store the metadata, either internally in the METS file, or referenced externally. It is cheaper and more efficient to collect metadata on an item when it is most readily available. We need to explore collaborative methods of gathering and sharing metadata. Resources need to be continually tested and refined.
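[To illustrate the two METS storage options mentioned above, here is a minimal sketch using Python's standard library. The element names (amdSec, techMD, digiprovMD, mdWrap, xmlData, mdRef) and the LOCTYPE, MDTYPE, and xlink:href attributes come from the METS schema; the embedded "environment" element, the IDs, and the example URL are hypothetical. The output is a fragment for illustration, not a schema-valid METS document, since it omits the required structMap among other things.]

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(namespace, tag):
    """Return a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{namespace}}}{tag}"


def build_mets_fragment(embedded_note, external_url):
    """Build a METS fragment showing wrapped and referenced metadata."""
    mets = ET.Element(q(METS_NS, "mets"))
    amd_sec = ET.SubElement(mets, q(METS_NS, "amdSec"))

    # Option 1: metadata stored internally, wrapped inside the METS file.
    tech_md = ET.SubElement(amd_sec, q(METS_NS, "techMD"), ID="TECH1")
    md_wrap = ET.SubElement(tech_md, q(METS_NS, "mdWrap"), MDTYPE="OTHER")
    xml_data = ET.SubElement(md_wrap, q(METS_NS, "xmlData"))
    note = ET.SubElement(xml_data, "environment")  # hypothetical element
    note.text = embedded_note

    # Option 2: metadata kept elsewhere and referenced by URL.
    prov_md = ET.SubElement(amd_sec, q(METS_NS, "digiprovMD"), ID="PROV1")
    ET.SubElement(
        prov_md,
        q(METS_NS, "mdRef"),
        {"LOCTYPE": "URL", "MDTYPE": "OTHER", q(XLINK_NS, "href"): external_url},
    )
    return mets


if __name__ == "__main__":
    fragment = build_mets_fragment(
        "Requires a PDF 1.4 viewer",                # hypothetical content
        "http://example.org/provenance/item1.xml",  # hypothetical URL
    )
    print(ET.tostring(fragment, encoding="unicode"))
```

Wrapping keeps the metadata with the object in one file; referencing keeps the METS file small but depends on the external link staying valid.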

Preserving The Archive - A Race Against Time. WAMU radio interview. August 19, 2005. [Audio]

Interview with Michael Taft, Head of the Archive of Folk Culture at the Library of Congress, and Matthew Barton, an Audio Engineer. They have mountains of material to deal with. As technology progresses, they can do more with some of these recordings. They have about 100,000 audio items in the collection and have possibly transferred only about 5%. Digital transfer is about preserving these items. None of these media were meant to last forever, and most of the media used were not for professional use. It is a race against time to preserve the items before they are lost. A CD is just another medium that will deteriorate over time, and when it goes it really goes, not like a wax cylinder that you can still listen to as it degrades. We don’t throw out the original, because there may be new ways of getting the recordings off the media, such as taking a digital image of a broken phonograph album and being able to recreate the music.

Expanding the Stage for Political Theater. Jerome McDonough. Bija Gutoff. Apple. September 2005.

Description of a project “to preserve and make available on a global basis” these cultural documents. The number of scholars whose work depends on video documentation is increasing. “But videos don’t last very long. Without the digital library, these performances are not only inaccessible for study, but they’re in danger of disintegrating.” “We want to preserve these materials for the long term — and in the library world, long term means 300 to 500 years!”

They try to capture performances as 4:2:2 uncompressed 10-bit files, but the large files must be compressed to put on DVD or the Internet. They hope to create uncompressed masters on hard drives, and move away from DigiBeta. “When you throw out color information during sampling, you’re using lossy compression; then, each time you change formats, you introduce artifacts that can damage the video stream. But if we capture the complete video signal, we’ll be able to migrate to new formats without having to worry about introducing artifacts.” “Having access to these performances is vital to scholars who want to achieve a deeper understanding of the cultural and political life of the Americas.” “By bringing all this material together in one place, making it publicly available and ensuring that it will live on and be available in the future, our library is making a real contribution to scholarship in the world.”
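[A rough calculation shows why uncompressed 4:2:2 10-bit files are so large. In 4:2:2 sampling every pixel keeps its luma sample while the two chroma channels are each sampled at half the horizontal rate, which averages to two samples per pixel. The article does not give a frame size or frame rate; the 720 x 486 frame at 30000/1001 frames per second used below is an assumption for illustration, and the figure covers active picture only, with no audio or container overhead.]

```python
def uncompressed_rate_mbit_s(width, height, fps, bit_depth=10,
                             samples_per_pixel=2.0):
    """Return the video data rate in megabits per second.

    samples_per_pixel=2.0 models 4:2:2 sampling: one luma sample per
    pixel plus two chroma channels at half horizontal resolution
    (1 + 0.5 + 0.5).
    """
    bits_per_frame = width * height * samples_per_pixel * bit_depth
    return bits_per_frame * fps / 1e6


def gigabytes_per_hour(rate_mbit_s):
    """Convert a rate in Mbit/s to decimal gigabytes per hour."""
    bytes_per_second = rate_mbit_s * 1e6 / 8
    return bytes_per_second * 3600 / 1e9


if __name__ == "__main__":
    # Assumed standard-definition frame size and rate (not from the article).
    rate = uncompressed_rate_mbit_s(720, 486, 30000 / 1001)
    print(f"{rate:.1f} Mbit/s, about {gigabytes_per_hour(rate):.0f} GB per hour")
```

Under these assumptions an hour of video needs on the order of a hundred gigabytes, which is consistent with the need to compress for DVD or Internet delivery.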

Web ARchive Access (WERA). Website. Nordic Web Archive. August 31, 2005.

WERA is a tool for searching and displaying archived collections of web documents. The documents can also include different versions of the same document. An overview shows the dates of the various versions. The archived web documents are stored in ARC files. The tool is freely available for download.

Samsung Predicts End Of Hard Drives. Chris Mellor. Computerworld. September 13, 2005.

Samsung has just created a 16GB flash chip, and expects that computer hard drives will be replaced by solid-state flash memory. Flash memory continues to double its density about every 12 months. Laptop memory cards with 32GB of memory should be available in 2006 or 2007. Here is a link with more information on flash memory and how it works.

Friday, September 09, 2005

Sept 9, 2005

National Archives Names Lockheed Martin to Build Archives of the Future. NARA Press Release. September 8, 2005.

NARA has awarded a $308 million, six year contract to Lockheed Martin to build the Electronic Records Archives. The system will “capture and preserve the electronic records of the federal government, regardless of format, ensure hardware and software independence, and provide access to the American public and Federal officials.” This comes after a year-long competition between two firms.

The CPU's next 20 years. Tom Yager. InfoWorld. September 07, 2005.

The Intel Itanium computer processor is not like the current computing processors. It is seen as incompatible with everything else. The road ahead isn’t about hardware at all. It will be about development suites and tools that can optimize an application based on changing environments. We’ll end up with a naturally occurring matrix of CPU types and deployment patterns that provides customers with meaningful choices.

LDS Church to put microfilmed records online. Daily Herald. September 10, 2005.

The LDS Church announced plans to digitize and index the more than 2 million rolls of microfilmed genealogical records which are stored in granite vaults near Salt Lake City. "Currently, you have to look at images on paper or burn them on a CD and distribute those to index the data. We're moving the whole process to the Internet."

DSpace Federation 2nd User Group meeting. Conference held on 6 July 2005. 9 September 2005.

The program from the DSpace user group meeting is now online. The website includes descriptions of the presentations and the PowerPoint slides; some include Word or PDF documents as well. The presentations include:

· The Australian National University: Case Study: Creating Publications from a DSpace Repository

· Using Multiple Metadata Formats in DSpace

· Exploring Strategies for Digital Preservation for DSpace@Cambridge

· Expanding the Focus of the IR: Scholars' Bank at the University of Oregon

· Introduction: Incorporating local developments to DSpace

· Conversion and metadata extraction frameworks

· Use of DSpace as an audiovisual archive

Sony goes 8x for Double Layer DVD+R writing. Kelly Ellis. PC Pro. September 6, 2005.

Sony has introduced a double layer DVD burner that can burn the 8.5 GB discs at 8x speed. It can also burn many other DVD formats and speeds, and also CDs.

Random Musings on Apple’s iPod Nano. Harry McCracken. PC World. September 07, 2005

“If flash memory was as cheap per gigabyte as hard-disk space, and available in disk-like capacities, the hard drive might go away. That’s not going to happen anytime soon, but I suspect that flash memory will start to replace drive storage in some devices over the next few years, resulting in smaller, more reliable products.”

Friday, September 02, 2005

Sept 2, 2005

An Audit Checklist for the Certification of Trusted Digital Repositories. RLG. August 2005.

This 70-page document is for those who are responsible for digital repository certification and for those who will carry out the process. The requirements touch every part of the repository and the institution. The analysis of the functions and requirements of the repository can help assure the repository is operating according to best practices. The document relies on the “Trusted Repository” and OAIS documents. This draft is for public comment. The document outlines the audit and certification process, the criteria to be used, a checklist, and a glossary of terms. This is a very thorough method of certifying that the repository and the organization exist and are following the standard practices as outlined in other documents and international standards.

The audit & certification criteria are organized as follows:

· Organization. Making sure the organization is viable, that it has the appropriate staff and structure; that there is accountability for actions; that it is financially sustainable. The organizational attributes are just as important as the technical. It must follow prevailing standards, policies, and practices. Ongoing training is important. Repository review processes should be annual. The appropriate contracts, agreements, and licenses must be in place to detail the rights, responsibilities and expectations of those involved.

· Repository function. The processes and procedures exist to ingest, manage and provide access to digital materials for the long-term. There are minimal conditions for the preservation of the information packages. Documented and demonstrated strategies must be in place. Metadata must allow the items to be located and managed. The digital objects accepted by a repository for preservation should reflect both its mission statement and the interests of the designated community, and the relationships must be clearly understood. Complete documentation is needed, which may include metadata, codes, sample forms, record layouts, explanations, minimum and maximum values, and related studies and results. The repository must know what will be preserved for each object. It must verify and authenticate each object, and monitor the integrity of the items. Every item must have descriptive information, and the method of getting it needs to be documented. The minimum requirements may be very basic. Access should always deliver what is requested, or else a reason why it is not possible.

· The designated community. The users should be identified, and they must be able to understand the information. The information returned must be usable. The understandability and usability should also be verified.

· Technologies & technical infrastructure. The technical aspects are not prescribed, but good computing practices are required. The certification looks at general system infrastructure requirements; the use of technologies and strategies appropriate to the community; and security. Security can refer to the environment, data, systems, personnel, physical plant, security needs, etc. The disaster plan should be tested regularly.
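[The repository function criteria above ask a repository to verify each object and monitor the integrity of its items. One common way to do this is a periodic fixity check against stored checksums. The sketch below is an illustration only: the checklist does not prescribe a method, and the choice of SHA-256, the manifest layout, and the function names are all assumptions.]

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_fixity(manifest):
    """Compare files against a manifest of expected digests.

    manifest maps a file path to the hex digest recorded at ingest.
    Returns a list of (path, problem) tuples; an empty list means every
    file is present and unchanged.
    """
    problems = []
    for path, expected in manifest.items():
        if not Path(path).is_file():
            problems.append((path, "missing"))
        elif sha256_of(path) != expected:
            problems.append((path, "checksum mismatch"))
    return problems
```

A repository would record the digest when an object is ingested, rerun the audit on a schedule, and log the results as part of its preservation metadata.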

Plasmon Launches Compliant Write Once UDO Media. Computer Technology Review. August 30, 2005.

Plasmon has launched the new UDO (Ultra Density Optical) Compliant Write Once media. It is designed for archive applications that are subject to regulatory compliance and for Information Lifecycle Management environments. The media combines Write Once authenticity with the ability to physically destroy records on the media according to data retention and disposition regulations. This hybrid media is offered in addition to Rewriteable media and Write Once media.

Hitachi Unveils World's First Terabyte DVD Recorder. Reuters. eWeek. August 24, 2005.

Hitachi has unveiled the world's first hard disk drive/DVD recorder that can store one terabyte of data. The primary target market is for digital broadcasting. The recorders will go on sale next month in Japan.

Indian classical treasure-trove goes digital. Fakir Balaji. Hindustan Times. August 30, 2005.

A project between Carnegie Mellon University and the Indian government will digitize a million rare manuscripts, palm leaves, copper plates and age-old classical literature. About 130,000 documents have been scanned in 31 digital centers across the country. The target is to reach about a million by 2008. The documents are brought to the centers, digitized, and returned to the owners. They intend to offer full text searching. Font recognition is a problem for the optical character recognition. The site is available at

Sun Starts Digital Rights Project. Tom Sanders. Forbes. August 23, 2005.

Sun Microsystems intends to create an open and free digital rights management (DRM) technology, which ensures access to digital content for legitimate users but blocks use that violates copyright licenses. Other companies have created similar systems. The large number of DRM systems and the incompatibilities are causing problems. By creating the open technology, they feel they can set an industry standard. "We fundamentally believe that a federated DRM solution must be built by the community, for the community."