- The repositories lack the technical, organizational and financial support needed to preserve materials.
- The deposit agreements do not necessarily convey the preservation rights needed.
This blog contains information related to digital preservation, long term access, digital archiving, digital curation, institutional repositories, and digital or electronic records management. These are my notes on what I have read or been working on. I enjoyed learning about Digital Preservation but have since retired and I am no longer updating the blog.
Friday, April 20, 2007
Weekly readings - 20 April 2007
Friday, April 13, 2007
Weekly readings - 13 April 2007
The conference presented sessions on DSpace, Fedora, and Eprints, including user groups for each software package. Open source software may be free, but that does not mean “no cost”; it brings maintenance costs. Choose the right partners to create a competitive advantage instead of competing with your associates. Fedora allows for complex digital objects. The new Fedora Commons will provide a non-profit organization to support the growing community. The next conference will be held April 1-4, 2008.
Conference addresses archiving and preservation of e-journals. Phillip Pothen. JISC. 28 March 2007.
The uncertainty of long-term access to scholarly journals is a major issue for libraries and others. A recent conference discussed the topic and said that major concerns still remain even though progress has been made. A great deal of content is still at risk. Librarians should press the archiving programs to make sure they meet their archiving needs. Librarians are the custodians of the content. The group of libraries saving the data can do more than individuals alone. LOCKSS and Portico are some methods in use. The e-Depot in The Netherlands is also archiving journals from some publishers. “Old business models are breaking down while long-term archives require highly resilient architectures, long-term funding and a commitment to quality.” Blackwell suggests that 50% of all serials publications will be online by 2016, while 39% of science journals will be online by the end of this year. This means that there are considerable preservation challenges. Preservation, access, and open access are not the same thing. “Digital curation needs to be embedded in institutional strategies.” Responsibilities and requirements must be clear and agreed upon.
History 1980-2000 has disappeared into the ether. Sorry. Ben Macintyre. The Times. March 23, 2007.
This commentary warns of the short life of digital objects, which are “dangerously disposable.” Many do not bother to archive their digital data. Historians may look back at this period as a black hole. The most important real-time histories are written in online forums, which are fleeting. Many items have already been lost. The article ends with a plea for paper, which he feels is the best way to save things.
Tools and Methods for the Digital Historian. AHRC. March 23, 2007.
The Arts & Humanities Research Council (AHRC) has created an online forum, ‘Tools and Methods for the Digital Historian’, in order to encourage the exchange of ideas. The Methods Network is a UK initiative which provides a place for discussing digital history and research, but it is open to all who want to register and discuss the issues. It also refers to a set of Working Papers.
FastStone Image Viewer 3.1. FastStone Website. April 16, 2007.
Update on the FastStone Image Viewer: This downloadable program is an image browser, converter and editor. The features include viewing, managing, comparing and other adjustments to images. It provides access to EXIF information, lossless JPEG transitions, embedded thumbnails, and image annotation. It supports all major graphic formats, BMP, JPEG, JPEG 2000, GIF, PNG, PCX, TIFF, WMF, ICO and TGA, as well as many RAW formats, such as CRW, CR2, NEF, PEF, RAF, MRW, ORF, SRF and DNG. It also supports saving files in pdf format.
Friday, April 06, 2007
Weekly readings - 06 April 2007
Photoshop has contained a plug-in for reading and writing jpeg2000 files. However, Adobe has not seen the widespread adoption of the format. With Photoshop CS2, they decided to stop installing the plug-in by default, though it is still currently available. If features no longer make sense, they will retire them in order to focus on what is most important. Adobe is trying to gauge the value of standalone jpeg2000 reading and writing. [Lots of comments on the blog.]
Questioning the Future of JPEG2000 Support in Photoshop. Peter Murray. TLDJ. April 5, 2007.
There is still some uncertainty about the format and whether it will be used much. The response to the Adobe survey has been disappointing, in that not many respondents use jpeg2000. It would be a shame if support were dropped since the format seems to be gaining ground. Google lists projects where some are working on wider adoption of the format.
Metadata mangling in Windows Vista. Stephen Shankland. CNet News. February 8, 2007.
Windows Vista and the Photo Info tool can cause problems with some images or the metadata. Some cameras use an EXIF Maker Note Tag in the image, and when updated, the digital camera software “may no longer recognize the metadata that is automatically added to the photo.” There have also been reports of some compatibility issues and the files becoming “unreadable in other applications, such as Adobe Photoshop.” Camera manufacturers may provide software for Vista users who want to open or print raw files.
Intel Gets More Time to Explain Lost E-Mails in Antitrust Case. Chris Preimesberger. eWeek. April 6, 2007
Intel has been granted more time by the court to explain how they will locate missing emails. Guidelines enacted in December require enterprises to be able to quickly find data files required by the court. Some of the items may have to be recovered from backup tapes or user backups, neither of which is indexed. The court said they had an “ill-conceived plan of document retention and lackluster oversight”. People at the highest level “failed to receive or to heed instructions essential for the preservation of their records”.
Friday, March 30, 2007
Weekly readings - 30 March 2007
Statement given by the Librarian of Congress concerning the Library of the 21st Century. It now takes 15 minutes to produce the same amount of information that it took LC over 200 years to acquire. Most exists only in digital form. “There is a widely-held but false assumption that digital materials accessible today … will necessarily be available in the future.” Also, “information not actively preserved today could literally be gone tomorrow.” Recent important digital materials, such as those on the internet, have not been preserved and have vanished. These are the primary sources of our time. A key challenge is to “capture, collect, preserve, and provide access to important ‘born-digital’ material and Web-based information.” LC manages about 295 TB of digital information. “The Library's basic mission of acquiring, preserving and making accessible the world's knowledge and the nation's creativity is not changing.” We can’t save everything, so we need to identify and select what is critical to the collection. “We are not just creating endless digital data files; we are giving our collections context and making them increasingly accessible to the world.” As we add to our collections we need an infrastructure that will make the content available in the future. A new asset is LC’s National Audiovisual Conservation Center, which will preserve and make accessible the audiovisual collections.
Killing risk, unifying data protection. Jim Damoulakis. Computerworld. February 27, 2007.
It is important to look at what we are doing with data protection. Some of the techniques include nightly backup, snapshot, mirroring, database dumps, host-based replication, and storage array-based replication. One way to create a unified strategy is to look at the risks that exist. They include physical device failure, data loss through deletion or corruption, and disasters. Data loss can occur undetected, and there needs to be a way to protect against this.
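One common way to catch loss that would otherwise go undetected is to record a checksum for each file and recheck it on a schedule. The sketch below is not taken from the article; the function names and the manifest layout are illustrative assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest: dict[str, str], root: Path) -> list[str]:
    """List files that are missing or whose digest differs from the recorded one."""
    changed = []
    for name, recorded in manifest.items():
        target = root / name
        if not target.exists() or sha256_of(target) != recorded:
            changed.append(name)
    return sorted(changed)


# Small demonstration in a throwaway directory (hypothetical file names).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "report.txt").write_bytes(b"original content")
    (root / "image.dat").write_bytes(b"\x00\x01\x02")

    # Record a digest for every file at "ingest" time.
    manifest = {p.name: sha256_of(p) for p in root.iterdir()}

    clean = changed_files(manifest, root)  # nothing has changed yet
    (root / "image.dat").write_bytes(b"\x00\x01\x03")  # simulate silent corruption
    damaged = changed_files(manifest, root)

print("Before corruption:", clean)
print("After corruption:", damaged)
```

A check like this only detects damage; recovery still depends on having one of the backup or replication copies listed above.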
Perspectives on Trustworthy Information. H.M. Gladney. Digital Document Quarterly. March 2007.
Digital preservation activities are shifting from solving basic problems to implementing solutions and repository procedures. Selection is a challenge of building a long-term digital collection, but it needs to be balanced by practicalities. Archival objects need to include honest and adequate provenance information that is bound to the object. “Preserving an information collection is a different challenge than managing archives.” The need to preserve digital information, which is the base of most scientific research, is self-evident. Snapshots and logs may be sufficient for preserving databases.
JHU/UVA Medieval Manuscript Digitization Workshop. Timothy Stinson. Blog. March 28th, 2007.
This quote is from the blog report of the digitization workshop: “Staples has a great way of thinking about preservation - he pointed out that preservation isn’t simply a technological solution, an archive, e.g., where we can stick things and have them safe forever. Rather preservation is the result of usage, maintenance, and institutional commitment. Those things that are used the most, he argued, are the same ones that are migrated the most frequently, and are the least likely to become invisible and forgotten or to cease to be a priority to individuals and institutions. We need not only technical solutions, but also wide access and modeling of data in such a way that it is frequently used, migrated, and repurposed.”
Calif. CIO Steers Clear of Ideology on File Formats. Carol Sliwa. Computerworld. March 19, 2007.
The question of open formats is not an ideological struggle between competing visions of the future. It is a straight business decision, looking at the costs of one approach over another and deciding if it meets the business needs. They have no preference between ODF and the Office Open XML file format, but they are moving toward interoperability and more open approaches, and away from being locked in to proprietary systems. Open, XML-based formats provide flexibility.
Friday, March 23, 2007
Weekly readings - 23 March 2007
Trustworthy Repositories Audit & Certification (TRAC): Criteria and Checklist. Robin Dale, et al. CRL. March 9, 2007.
TRAC is the revised and expanded version of the Audit Checklist originally developed by RLG-NARA. The 94-page report provides a very complete method for checking and certifying long-term repositories. It can also be used for planning and guiding the development of repositories. The document looks at three areas: Organizational Infrastructure; Digital Object Management; and Technologies, Technical Infrastructure, & Security. It provides a checklist of criteria for measuring the trustworthiness of repositories.
e-Journals: Archiving and Preservation. Briefing Paper. JISC. March 2007.
The traditional model of publishers supplying content and libraries preserving content does not work well with digital materials. Licensing agreements do not guarantee permanent access to materials. But the e-journal trend is increasing at a rapid rate. Many are searching for the solution. The terms ‘perpetual access’, ‘archiving’, and ‘long-term preservation’ are often used interchangeably. Perpetual access is usually used with e-journal license clauses to assure that access will be continued regardless of events. Archiving describes the management processes of e-journals. Long-term preservation refers to the processes to ensure the content remains accessible in the future, regardless of any technical or organizational changes. There need to be multiple options and strategies for preserving e-journals, including coordinated overlap. There are promising developments evolving, but the solutions must include libraries, publishers, and archiving services.
Iron Mountain launches active archiving for email. Computer Technology Review. March 20, 2007.
Iron Mountain has introduced an Active Archiving Service for email. This is a single solution which includes management, archiving, legal discovery, continuity and disaster recovery. Most legal discovery processes now include email. The new federal rules make an email archive critical. It is integrated with Outlook and allows users full access to emails and the ability to restore individual messages. The cost starts at $6 per user per month.
Metadata for All: Descriptive Standards and Metadata Sharing across Libraries, Archives and Museums. Mary W. Elings, Günter Waibel. First Monday. 5 March 2007.
The cultural heritage community has a large pool of digital resources for teaching, research and learning. A big challenge is integrating digital content from libraries, archives and museums which use different strategies for caring for their materials. Applying data content standards by material type rather than by organization could make the data more usable within the entire community. Two schemas used are Visual Resources Association (VRA) Core and the Categories for the Description of Works of Art (CDWA). The article lists the elements of metadata standards, and the relationship between them and museums, libraries and archives. There is a call among archives to process collections more efficiently so they achieve control over all their holdings. The successful use of digital materials in libraries, museums, and archives revolves around the ability to describe similar materials in different institutions.
Microsoft Announces HD Photo, a New Imaging File Format With Advanced Features for Today’s Digital Photographers. Press release. March 8, 2007.
Microsoft announced a new file format that offers higher image quality, greater preservation of data, and advanced features. HD Photo offers both lossless and lossy image compression. When compressed it has twice the efficiency of JPEG, with fewer artifacts. It preserves the entire original image. They also released a plug-in for Photoshop. [See Photoshop gets HD Photo support.]
Dell to ship PCs with 1TB drives. Chris Mellor. Techworld. March 16, 2007.
Dell will ship computers with Hitachi 1 TB drives, targeting users who wish to store large amounts of data. The computer can handle up to 4 TB. The drives use perpendicular recording. The 1TB drive is priced at $540. Dell is also introducing a 'video time capsule service' where users can upload videos to a site where Dell will store them for a claimed 50 years.
Blu-ray Aims to Oust DVDs Within Three Years. eWeek. March 15, 2007.
The Blu-ray Disc Association has said it aims to replace the DVD in three years. The Blu-ray disc holds 5 times as much as a DVD. While there is still uncertainty over which format will gain control of the market, there is movement away from the DVD format.
A Digital Life. Gordon Bell, Jim Gemmell. Scientific American. February 18, 2007.
New systems may allow people to record everything they see and hear--and even things they cannot sense--and to store all these data in a personal digital archive. The MyLifeBits project has provided the tools to create a person’s lifelong digital archive. Technological advances may make this easier but there are challenges, particularly with privacy rights and restrictions. They believe digital memories will yield benefits in many areas.
Hammer Storage Pounds Out 'Disruptive' 1TB Appliance. Chris Preimesberger. eWeek. March 22, 2007.
Hammer Storage has introduced Myshare, a new plug and play storage device with 1TB for $499. It can be used on a network, and the content can be made available through a web application, including selective access to folders. The content can also be mirrored and secured, and the device allows multiple user and group permissions.
Wednesday, March 21, 2007
Scholarly Communication
Web 2.0 Presentation to BYU Library. Gideon Burton. Blog. March 13, 2007.
This was a presentation to the library on trends in Scholarly Communication, web 2.0, and other topics. This page includes his PowerPoint presentation and links to a video shown at the meeting, "The Web is Us/ing Us" by Michael Wesch.
Oops! Techie wipes out $38 billion fund
Friday, March 16, 2007
Weekly readings - 16 March 2007
History, Digitized (and Abridged). Katie Hafner. New York Times. March 10, 2007
Archives and museums hold many important items that will probably not be digitized in the near future. This increases the possibility that they will be ignored as people increasingly expect that all information is on the internet. A major problem is the cost of digitizing materials. Many items will still exist only in paper, LPs, magnetic tape and film. Libraries tend to digitize the items that are unique to their collection. But by putting the items on the internet, the number who use them increases dramatically. The
Director's Message. Anne-Imelda M. Radice. News & Events. March 2007.
The Webwise Conference: Stewardship in the Digital Age: Managing Museum and Library Collections for Preservation and Use highlighted the huge shift underway in museums and libraries. In a short time they have gone from knowing almost nothing about preserving digital objects to now understanding that digitization is an important part of conservation and use. Besides preserving the physical objects, institutions realize they need digital repositories for collections that are:
- physically vulnerable
- on fragile or unstable media
- born digital
Digitization protects historically important collections and addresses future collections. There is a great need to develop a new set of digital preservation skills in order to address digital objects. These collections can increase public awareness and interest in existing collections that may currently be unknown. Digital stewardship is an important part of the overall mission of libraries and museums as they care for their collections.
Quad-layer DVD Technology Becomes the Third HD Format. Marcus Yam. DailyTech. March 11, 2007.
New Medium Enterprises (NME) has developed the Versatile Multilayer Disc (VMD), a new optical-based format capable of storing 20GB of data. VMD is a red-laser technology that achieves its storage capacity by using a greater number of layers. VMD is the same size and thickness as DVD. However, while DVD technology uses two layers of a disc, VMD technology has multi-layering where up to 5GB can be stored on each layer.
Version 3.0 Launched. Mia Garlick. Creative Commons Website. February 23, 2007.
The latest version of the Creative Commons license is now available. A new generic license has been created. The new licenses ensure that there is consistent, express treatment of the issues of moral rights and collecting royalties and that there are no legal barriers to people being able to remix creativity as intended.
Intel Faces Up to E-Mail Retention Problems in AMD Lawsuit. Chris Preimesberger. eWeek. March 7, 2007.
A
Principles for Digitized Content. ALA. Website. March 2, 2007.
An
1. Digital libraries ARE libraries and
2. Digital content must be given the same consideration as regular content, including preservation.
3. Digital collections must be sustainable and require long-term management capabilities.
4. Digitization requires collaboration, which will require strong organizational support.
5. Digital activity requires ongoing communication for its success.
6. Digital collections increasingly address an international audience.
7. Digital collections are developed and sustained by educated staff, requiring continuous learning.
8. Digital materials require appropriate preservation, including the development of standards, best practices, and models for sustainable funding to guarantee long term commitment.
9. Digital collections and their materials must adhere to standards, serve the broadest community of users, support sustainable access and use over time, and promote the core library values.
Model Plan for an Archival Authority Implementing Digital Recordkeeping and Archiving. Australian Digital Recordkeeping Initiative (ADRI). 2 March 2007.
This 32-page Word document is a list of the components, tasks and resources needed to develop a digital recordkeeping / archiving capability. It addresses creating recordkeeping standards and developing a digital archives repository. It is based on the OAIS model. It outlines the strategies, functions, and tasks to develop, implement, and review the repository. The functions for preservation planning are:
- Monitor / interact with the designated community to understand requirements and changes
- Monitor the emerging technology and standards
- Develop and recommend preservation strategies
- Develop packaging designs and detailed migration plans and prototypes
- Implement administrative policies and directives
Friday, March 09, 2007
Weekly readings - 09 March 2007
The Digital Curation Centre and DigitalPreservationEurope have released the Digital Repository Audit Method Based on Risk Assessment (DRAMBORA) toolkit. This is to provide repository administrators with a way to audit and assess the capabilities, weaknesses, and strengths of their repository. This model is designed to respond to developments; it follows the DCC audit process for various types of archives. This 221-page document is very comprehensive and expects that the organizations, processes, and documents are already in place. Registration is required to download the document and worksheets.
Review and analysis of the CLIR e-Journal Archiving survey. Maggie Jones. JISC. 7 March 2007.
A review of e-journal archiving is now available. This site has a link to the report “E-Journal Archiving: Review and Analysis of the CLIR Report E-Journal Archiving Metes and Bounds: A Survey of the Landscape, by Maggie Jones”, as well as an executive summary. The trend to e-journals is increasing. LOCKSS and Portico have provided a momentum in this area. The report lists the basic principles for this effort, mostly for the services and nationally. Of special interest are:
- There must be an explicit commitment to digitally archive scholarly peer-reviewed journals.
- Form a network to exchange information with others on what you are doing.
- Participate in at least one long term initiative.
- Act collectively to address long-term accessibility.
- Participate in a registry of archived scholarly publications.
- Have a preservation mandate.
- Initiate either formal or informal certification and articulate practices and procedures.
- Create publicly accessible policies and procedural documents.
- Clearly state access conditions.
- View the preservation of electronic journals as a necessary investment.
- Support a range of options and solutions and collaborate.
Manage risks appropriately. Having the publisher responsible for archiving is a high risk strategy. A single definitive approach to e-journal archiving is unlikely ever to emerge. The article provides an in-depth look at LOCKSS, CLOCKSS, Portico, PubMed Central, and others.
Electronic Resources Management and Long Term Preservation: (Is the library a growing organism?) Tommaso Giordano. E-LIS. 02 March 2007.
Digital preservation is a complex issue that involves many areas of expertise. This paper looks at how academic libraries see preserving e-journals and the organizational practices. License terms may control access to the current year, back issues, and an archival copy. Perpetual access and archiving rights are different. Getting an archive copy is only the first step; there is more to implementing long-term access. Digital preservation may not be possible for every library. Digital preservation is a costly operation which requires considerable and long-term commitments, beyond current budgets. This is a high-level strategic issue. There is a shift from the traditional model to one based on renting resources with no guarantees for the future. Sustainability is a question.
The End of Online Storage: Coming Soon. Brian Bergstein. MCPmag. March 5, 2007.
A new study estimates that the amount of digital information the world is generating has increased dramatically. The study tries to account for photos, videos, e-mails, web pages, instant messages, phone calls and other digital content. They estimate that 40 exabytes of original data was created last year, and that it would be 161 exabytes if you count the times it is duplicated. They wonder if enough is being done to save the digital data for posterity. "Someone has to make a decision about what to store and what not. How do we preserve our heritage? Who's responsible for keeping all of this stuff around so our kids can look at it, so historians can look at it? It's not clear."
Google helps terabyte data swaps. Darren Waters. BBC News. 7 March 2007.
Google is developing a program to physically transfer large data sets around the internet. Some data sets may be 120 terabytes in size. Google is collecting the data sets and sending them to scientists who want them. This was started after researchers working with The Archimedes Palimpsest had problems transferring the enormous data sets from place to place. “Google keeps a copy and the data is always in an open format, or in the public domain or perhaps covered by a creative commons license.” They hope some day the data can be open to the public.
Pre-Pixel Preservation: Concept Device to Archive and Preserve the Past. Naveen Shimla. Gizmo Watch. February 27, 2007.
An interesting look at “Pre-Pixel Preservation”, meaning scanning and printing photographs and videos, or rather the “pre-digital files”.
Friday, March 02, 2007
Weekly readings - 2 March 2007
Library Copyright Alliance Strongly Supports H.R. 1201, the FAIR USE Act. Jonathan Band. District Dispatch. February 28, 2007.
The FAIR USE Act would make a film clip exemption applicable to all classrooms, not just college media studies classes. It would also allow a library to legally circumvent technological protections in order to preserve encoded works in a library's collection. Preservation is one of the most critical library functions. The DMCA provides obstacles to a library’s ability to preserve some objects; the FAIR USE Act will remove the obstacle without harming copyrights.
Hard Disk MTBF: Flap or Farce? David Morgenstern. eWeek. February 28, 2007.
Computer hard drive reliability is rated in hours, as MTBF (mean time between failures). Some drives are rated at 1.5 million hours, which is over 100 years. Now some are questioning this statistic. They have found that replacement rates are sometimes as high as 13%, and there is little difference between SCSI, FC, and SATA drives. In reality, there isn't a reliable way to statistically determine the reliability of disk drives that are in use. Expecting that data won’t get lost is not realistic, even with a RAID array. Budgets must include regular hard drive replacement.
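As a rough check on those figures, a rated MTBF can be converted into an expected annual failure rate. The sketch below assumes a constant failure rate (an exponential model). That assumption is mine, not the article's, and the function name is illustrative.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annualized_failure_rate(mtbf_hours: float) -> float:
    """Fraction of drives expected to fail in one year of continuous use.

    Assumes a constant failure rate (exponential model), which is a
    simplification and not a claim made by the article.
    """
    return 1.0 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


rated_mtbf = 1_500_000  # hours, the rating quoted in the article
mtbf_years = rated_mtbf / HOURS_PER_YEAR
afr = annualized_failure_rate(rated_mtbf)

print(f"MTBF in years: {mtbf_years:.0f}")
print(f"Implied annual failure rate: {afr:.2%}")
```

Under this assumption the rating implies that well under 1% of drives fail per year, far below the replacement rates of up to 13% reported in the article.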
Back stage at the Oscars with Martin Scorsese: The press interview. Movie News. February 27, 2007.
This interview contains a brief discussion with Martin Scorsese about film archiving and preservation. He is a proponent of film preservation and film archiving. “It's very important. We don't know what new technology is coming down the line. Digital also fades. We have to be very careful.” There is so much to be done, but choices have to be made. They are trying to restore older films on celluloid, but it is expensive and only a very few can be done. If the only thing you can do is transfer them to digital to preserve them, then that may have to be done. With small budgets, preservation is a serious problem.
Adobe to take Photoshop online. Martin LaMonica, Mike Ricciuti. CNET News. February 28, 2007
Adobe plans to release an online version of Photoshop within six months. The lower-end version is expected to be free and will be an entry-level version of the full products. Adobe is trying to find how to use web services for their products. Other vendors have already started doing similar things. Google’s free Picasa program allows users to manage images and to read Adobe Photoshop files.
California may join rush of states toward ODF. Eric Lai. Computerworld. February 27, 2007.
A bill in the
Friday, February 23, 2007
Weekly readings - 23 February 2007
Google Study Examines Effects on PC Hard Drives. Mark Hachman. ExtremeTech. February 20, 2007.
Disk drives are generally reliable, but a study shows that current methods of predicting hard drive failure are almost ineffective, while basic disk checks can show if a drive is about to fail. The study looked at over 100,000 drives from different manufacturers over a 5 year period. There was no clear pattern to show that higher temperatures, higher utilization or activity levels affected the failure rate. Lower temperatures and very high temperatures had more failures. Scan errors and reallocation checks were a better indication that a drive would fail. If there is even one scan error, there is a significantly higher rate of failure within 60 days. Drive failure is important to deal with, as over 90% of all new information is stored on hard drives and other magnetic media.
MP3's Loss, Open Source's Gain. Eliot Van Buskirk. Wired. February 23, 2007.
Alcatel-Lucent was awarded $1.52 billion by a federal jury in an MP3 patent infringement suit against Microsoft, even though Microsoft had licensed the software from Fraunhofer/Thomson, the industry-recognized licensor of MP3. The result will be appealed. But if upheld, it could start an all-out licensing / lawsuit campaign. It could possibly extend to all companies involved with MP3 encoding or playback. This uncertainty could move the industry away from MP3, and that could be beneficial for open source software or other formats. Some of these are:
- Ogg Vorbis (open-source with better sound quality, but royalty questions)
- AAC (based on MPEG-4, it has greater fidelity at higher compression rates)
- Windows Media format
But the patent questions could extend to other software as well, which could influence the use of open source software. [See James Hilton’s speech at OR2007]
Why organizations need to archive email. White paper. GFI Website. February 22, 2007.
Emails have become the electronic substitutes of legal business documentation and the correspondence constitutes a record which has a retention period. A ‘true’ email archiving system will automatically extract and index the content of the message and attachments from emails, and store the email in read-only format so it cannot be changed. This also decreases the online email storage. Backups and archives are not the same; backups are to guard against system failure, while archives protect the data so it can be accessed when needed and can restrict access to authorized users. Email archives are important for compliance issues, litigation support, and storage / knowledge management. An email archive should have:
- Minimal user intervention and automatic processing
- Ability to index, search, and retrieve records and attachments
- Data retention selection and control by policies
- Security and authenticity which must include ability to restrict access
- End-user and management access to archives
- Support for multiple messaging platforms
Medieval Stained Glass in Great Britain. AHDS Website. February 22, 2007.
A major digitization project has added over 18,000 images of stained glass windows in
Friday, February 16, 2007
Weekly readings - 16 February 2007
Digital Preservation in a National Context: Questions and Views of an Outsider. H.M. Gladney. D-Lib Magazine. January/February 2007.
“A solution is known in principle for every difficult technical problem of digital preservation.” Non-technical preservation challenges are greater than the technical challenges. Preservation is only a small part of “archiving”. Some information may disappear, including some that was supposed to be permanent. “Curators need to learn to live with not knowing for sure that they have succeeded.” A preservation solution would incorporate methods for:
- Ensuring that each saved bit-string survives as long as somebody might want it;
- Ensuring that readers can find and use any preserved object as its producers intended;
- Providing evidence with which readers can judge information authenticity;
- Integrating preservation support seamlessly with current information services; and
- Hiding technical complexity from end users.
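The first item, making sure each saved bit-string survives, is often supported in practice by fixity checking: a checksum is recorded when the object is saved and later compared against every copy. The sketch below is a minimal illustration of that idea, not a method taken from Gladney's article; the choice of SHA-256, the function names, and the replica names are assumptions.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of a bit-string as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_damaged(copies: dict, expected: str) -> list:
    """Return the names of replicas whose digest no longer matches the record."""
    return sorted(
        name for name, data in copies.items()
        if fingerprint(data) != expected
    )


original = b"preserved bit-string"
recorded = fingerprint(original)  # stored alongside the object at ingest

# Three hypothetical replicas; the third has one corrupted byte.
copies = {
    "site-a": original,
    "site-b": original,
    "site-c": b"preserved bit-strinG",
}
damaged = find_damaged(copies, recorded)
print(damaged)
```

A damaged replica found this way can be repaired from one whose checksum still matches, which is why keeping several copies matters.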
There needs to be a "trustworthy digital object" with metadata about the object and its relationship to other objects. Union catalogs of preserved objects could
Texas, Minnesota eye move to ODF. Elizabeth Montalbano. Computerworld. February 07, 2007.
Legislative action is being considered to mandate that government documents use an open, interoperable, XML-based file format. The
U.S. House Votes to Rescind NDIIPP Funding; Bill Now Under Consideration by Senate. Peter Murray. The Disruptive Library Technology Jester. February 11, 2007.
House of Representatives Resolution 20 rescinds a number of items for the Library of Congress, especially “the unobligated balances available for the National Digital Information Infrastructure and Preservation Program, $47,000,000.” This will remove the funding for this program for the rest of the year.
Arts and Humanities Get Small Increases. Lauren Smith. The Chronicle of Higher Education. February 16, 2007.
In the proposed 2008 fiscal year budget, the Electronic Records Archives and the
64 DVDs on a disc: holographic storage to ship. Lucas Mearian. Computerworld. February 12, 2007.
InPhase Technologies will begin shipping the industry's first holographic disc drive in July. The holographic disc will hold 300GB of uncompressed data and 300GB of error correction and data redundancy. It is a write-once disc intended for the archival market and has a 50-year expected lifespan. To a server, the disc will look like a drive with drag and drop capability. The holographic drive will cost $18,000 and the discs will cost $180 each. The company expects to have a rewritable disc in 2008, and a 1.6TB disc by 2010. It also plans to have a holographic jukebox in 2008 with a capacity of 675TB.
Friday, February 09, 2007
Weekly readings - 9 February 2007
Digital Curation for Science, Digital Libraries, and Individuals. Neil Beagrie. The International Journal of Digital Curation. Autumn 2006.
Digital Curation is becoming a more common term. It refers to the actions to maintain digital materials for their entire life-cycle. The term, along with ‘digital preservation’ and ‘digital archiving’, is still evolving. Terminology has different meanings to different people. The terms all mean that we are developing a different approach to creating and managing digital materials. One comment defines them this way: “these are terms of increasing specificity in this context: preservation is an aspect of archiving, and archiving is an activity needed for curation. All three are concerned with managing change over time.”
Another definition is: “Digital curation, broadly interpreted, is about maintaining and adding value to, a trusted body of digital information for current and future use.” While not all digital information has long term value, a significant part of it will, though it will vary by area. Therefore, “curation and long-term preservation of digital resources could be of increasing importance for a wide range of activities.” Digital curation has implications for many different areas. The data must be continuously updated. “Significant effort needs to be put into developing persistent information infrastructures for digital materials” and for researchers and information professionals to develop the needed curation skills. Without this, the digital information will only have short term benefits.
A Vision for FEDORA’s Future, an Implementation Plan to Get There, and a Project Update. Peter Murray. Disruptive Library Technology Jester website. 24 Jan 2007.
This is a review of an update on Fedora at the Open Repositories 2007 conference. Many different kinds of projects will be using Fedora for:
- repository services: managing, accessing, versioning, and storing digital materials
- preservation services: integrity checking, monitoring, alerting, migrating, and replicating
- process management services: workflow and messaging applications
- collaboration services: annotating, discussing, and rating digital objects
The Fedora project is evolving into an organization called Fedora Commons. It will be a non-profit organization, to allow users to better collaborate on projects and use the information better. This will be a multi-year effort to build the organization to be more responsive to user needs and create a more robust product.
Open Source for Open Repositories — New Models for Software Development and Sustainability. Peter Murray. Disruptive Library Technology Jester website. 24 Jan 2007.
This is a summary of an excellent presentation at the Open Repositories 2007 conference by James Hilton. Organizations may be more willing to turn to open source software in a systematic way because of:
- Fear. Business decisions by vendors lessen the comfort of buying a software application.
- Disillusionment. Software seems to bring an endless upgrade cycle and the institutions still need to build in the support structure.
- Incredulity. Software is disruptive, expensive, and may not lead where they need.
- Increasing collaboration. In the ‘new order’ the new competitive advantage will be picking the right collaborative partners.
There are different meanings to ‘open’; it does not always mean ‘free’, and this needs to be reviewed carefully to determine the consequences. The benefits of open source may be that you can control your own destiny, it builds community support, it separates ownership from support, and leverages the links between the institution and others. The challenges may be that “clean” code is impossible to guarantee, licenses and patents may be difficult to manage, and lawsuits may happen. Open source is more of a commitment to build. Licensing is a contract and must be maintained and understood. Communities don’t just happen; they require shared purpose, governance, discipline, and cooperation. If institutions are going to use open source, they must make commitments to it.
Symantec software captures IM traffic. Lucas Mearian. Computerworld. January 31, 2007.
Symantec Corp. announced Veritas Backup Reporter 6.0, an enterprise backup reporting tool that gives IT administrators a single corporate view of backup and recovery operations and makes them better able to perform capacity planning. Enterprise Vault 7.0 allows IT managers to archive and classify e-mail, instant messaging and other content either automatically, by user classification, or integrated with records management systems.
Friday, February 02, 2007
Weekly readings - 2 February 2007
Adobe to Release PDF for Industry Standardization. Press release. Jan. 29, 2007.
Adobe announced that it will release the full PDF 1.7 specification to AIIM to be published by the International Organization for Standardization (ISO). This is to help the process to make it an ISO standard. Adobe said that this is “reinforcing our commitment to openness.”
The Saga Of the Lost Space Tapes. Marc Kaufman.
Millions of people saw the video of the moon landing in 1969. What most don’t know is that the “camera had actually sent back video far crisper and more dramatic”, which only a few people have seen. The high-quality tapes, which were in a highly specialized format, were stored and forgotten. Now NASA has started looking for those tapes, but after an official search through archives, record centers and storage rooms, NASA has acknowledged that the videos are lost. Everyone assumed that NASA would archive the tapes. "Maybe somebody didn't have the wisdom to realize that the original tapes might be valuable sometime in the future. Certainly, we can look back now and wonder why we didn't have better foresight about this."
Seagate drive has gigabytes of wireless, pocket storage. Ben Ames. Computerworld. January 30, 2007.
Seagate unveiled a wireless 10GB to 20GB storage device intended to fit in users' pockets and allow them to store and share digital files between mobile phones, PCs and other mobile platforms. This device, called Digital Audio Video Experience (DAVE), has a 1-in. hard drive and can use Wi-Fi networking to share files with another device within 30 feet. This can be used to deliver video files without latency or coverage problems, since the files can be downloaded to the hardware at leisure instead of streamed live through mobile networks.
Opinion: Ultrasimple image backup. Steve Bass. Computerworld. January 31, 2007.
The Polaroid Media Backup Photo Edition is a 40GB external drive that can easily back up over 60 different image file types. It was designed for simplicity; once it is connected to a USB port on a computer, the device is prompted to find any images and start backing them up. It is plug and play; there is no software to install and there is no on-off switch. The 2.5-inch 40GB hard drive can hold up to about 40,000 regular-sized photos and can be used with Internet services for sharing and printing. The cost will be about $129.
Friday, January 19, 2007
Weekly readings - 19 January 2007
Microsoft Unveils Wave of New Products and Services at CES. Press release. Jan. 7, 2007.
Microsoft has unveiled its Home Server, which provides a central place to help store, protect and access all the digital content in the home. This is a new software product intended for homes with multiple PCs to connect their computers, digital devices and printers to help store, protect and share their digital collections. This is to help consumers deal with their rapidly increasing digital content (e.g. 273 billion digital images were captured worldwide in 2006). In the
Architectural Considerations for Archive and Compliance Solutions. White paper. Computerworld. January 8, 2007.
Within the IT industry there is no agreement on a definition of ‘archive’. For some it means moving inactive data; for others it is permanent, managed storage. Long-Term Data Retention is becoming more important. Data retention periods have increased, in some cases to 100 years or more.
Maintaining data in a usable format is critical because we don’t know what applications will be used long term. Saving data in its native format may be a way to enable organizations to access it in the future.
This report summarizes a review of 12 e-journal archiving programs from the perspective of concerns expressed by directors of academic libraries in
SDSC Releases Open-Source iRODS Data Management System. January 10, 2007.
Large data collections are bringing dramatic results. Large collections can exceed 100 TB in size, and are difficult to manage. The San Diego Supercomputer Center (SDSC) has released the latest open-source version of iRODS, the Integrated Rule-Oriented Data System, which represents a new approach to distributed data management. iRODS supports data grids, digital libraries, and persistent archives. Managing data consists of a large number of complex inter-related tasks.
Friday, January 12, 2007
Weekly readings - 12 January 2007
Digital Preservation News: January 2007. Library of Congress. January 2007.
This issue reports what is happening with digital preservation at the Library of Congress. It includes a number of items worth reading:
· The Birth of the Dot-Com Era. This closed archive will serve as a model of a trusted institutional repository.
· The ‘famous’ Cathy comic strip on digital preservation.
· The NDIIPP 2005 Annual Review which gives a good overview of what has been happening in this program.
LOCKSS - Floating electronic librarianship to a higher level. Stuart Weibel. Blog. January 09, 2007.
The LOCKSS model tries to make the Web behave more like library shelves. While this may not seem really exciting, it is “arguably among the most important missing links in making digital libraries solid enough to bear up as reliable stores of cultural assets.” Two questions about electronic library data are:
1. Who has custody?
2. Who gets access?
These questions must be decided, and LOCKSS in part returns management of the collection to the library rather than just renting data. It also addresses the problem of format migration by supporting HTTP content negotiation, which means offering the user a choice among different versions of a document so they get the one that best fits their situation. Another prospect is using LOCKSS as a low-cost means of preserving unpublished or ephemeral materials, which fits with the role of libraries as managers of unique collections. LOCKSS is looking at blogs, institutional repositories, and other possibilities. The technology is carefully thought out, low in cost, high in impact, and able to use the collaboration that defines the library community.
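To make HTTP content negotiation concrete: the client sends an Accept header listing the media types it can handle, each with an optional preference weight (q), and the server returns the stored version that ranks highest. The following is a simplified sketch of that selection step, not LOCKSS code; the function names and file names are hypothetical, and it ignores partial wildcards such as `text/*` and the specificity rules of the full HTTP specification.

```python
from typing import Optional


def parse_accept(header: str) -> list:
    """Parse an Accept header into (media_type, q) pairs, most preferred first."""
    prefs = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0  # the default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal weights keep the client's original order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_version(header: str, available: dict) -> Optional[str]:
    """Pick the stored version whose media type the client prefers most."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type in available:
            return available[media_type]
        if media_type == "*/*" and available:
            return next(iter(available.values()))  # any version will do
    return None


# Hypothetical archive holding the same article in two formats.
versions = {"application/pdf": "article.pdf", "text/html": "article.html"}
print(choose_version("text/html;q=0.9, application/pdf", versions))
print(choose_version("text/html, application/pdf;q=0.5", versions))
print(choose_version("image/tiff", versions))
```

If a format is migrated later, the new version can be added to the set on offer without withdrawing the original, and each client still receives a version it can use.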
Hitachi announces one terabyte hard drive. Press Release. Computer Technology Review. January 9, 2007.
Hitachi announced that a one-terabyte (TB) hard drive, the Deskstar 7K1000, will begin shipping to retail customers in the first quarter of 2007 at a suggested retail price of US$399. It uses perpendicular magnetic recording (PMR) technology.
New tape specification 50% faster with 800GB. Deni Connor. Computerworld. January 10, 2007.
The new LTO tape specification was just released. It can store up to 800GB in uncompressed mode. It includes encryption and allows for an interchange between HP and IBM equipment, and can also read LTO2 and LTO3 tapes. “The Library of Congress collections could fit on six LTO 4 cartridges.”
A companion to digital humanities. Schreibman, Susan (ed). Blackwell Publishing. January 9, 2007.
The electronic version of this resource offers a comprehensive description of the history, development and current status of the digital humanities and humanities computing.
It is divided into four parts: history; principles; applications; and production, dissemination and archiving.
It includes sections on “The Past, Present, and Future of Digital Libraries” by Howard Besser and “Preservation” by Abby Smith. Some quotes include:
· “in the digital realm, the ability to know about, locate and retrieve, and then verify (or reasonably assume) that a digital object is authentic, complete, and undistorted is as crucial to "fitness for use" or preservation as it is for analogue objects”
· The general approach to preserving analogue and digital information is exactly the same – to reduce risk of information loss to an acceptable level – but the strategies used to insure against loss are quite different.
· a digital object's file format and metadata schema greatly affect its persistence and how it will be made available in the future.
· those who create intellectual property in digital form need to be more informed about what is at risk if they ignore longevity issues at the time of creation.
· scholars should be attending to the information resources crucial to their fields by developing and adopting the document standards vital for their research and teaching, with the advice of preservationists and computer scientists where appropriate.
· As long as those cultural and intellectual resources are under the control of enterprises that do not know about and take up their preservation mandate, there is a serious risk of major losses for the future, analogous to the fate of films in the first 50 years of their existence.
· Scholars cannot leave it to later generations to collect materials created today. They must assume a more active role in the stewardship of research collections….