New data supports finding that 30 percent of servers are ‘Comatose’, indicating that nearly a third of capital in enterprise data centers is wasted. Jonathan Koomey and Jon Taylor. Anthesis Group. June 3, 2015.
The report shows that utilization of servers in business and enterprise data centers “rarely exceeds 6%” (i.e., they deliver no more than six percent of their maximum computing output on average over the course of the year). Two groups have found that up to 3.6 million servers (30%) are comatose, meaning that they use electricity but deliver no useful information services (abandoned or never used).
"In the twenty first century, every company is an IT company, but too many enterprises settle for vast inefficiencies in their IT infrastructure. The existence of so many comatose servers is a clear indication that the ways IT resources in enterprises are designed, built, provisioned, and operated need to change. The needed changes are not primarily technical, but revolve instead around management practices, information flows, and incentives." The two groups offer online efficiency calculators.
This blog contains information related to digital preservation, long term access, digital archiving, digital curation, institutional repositories, and digital or electronic records management. These are my notes on what I have read or been working on. I enjoyed learning about Digital Preservation but have since retired and I am no longer updating the blog.
Saturday, June 20, 2015
Friday, June 19, 2015
Digital Preservation Metadata and Improvements to PREMIS in Version 3.0
Digital Preservation Metadata and Improvements to PREMIS in Version 3.0. Angela Dappert. May 27, 2015. [PDF]
These are the notes from a DCMI/ASIS&T joint webinar about PREMIS v. 3. The PDF document has 63 slides that give an overview of why digital preservation metadata is needed, show examples of digital preservation metadata, show how PREMIS can be used to capture this metadata, and show some of the changes in version 3.0.
Digital preservation metadata is the metadata needed to ensure long-term accessibility of digital resources. Digital objects must be self-descriptive, independent of the systems that were used to create them. PREMIS is the de facto standard for metadata to support the preservation of digital objects and ensure their long-term usability. It is a common data model for organizing and thinking about preservation metadata, or for exchanging information packages between repositories. It is not an out-of-the-box solution, nor does it cover all the metadata needed.
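As a rough illustration of what such a data model looks like in practice, the sketch below represents three of the PREMIS entities (Object, Event, Agent) as plain Python data classes and links a fixity-check event to an object. The class names, field names, and identifiers are simplified assumptions made for this example; they are not the PREMIS XML schema or any repository's API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """A person, organization, or program that carries out events (simplified)."""
    identifier: str
    name: str


@dataclass
class Event:
    """An action that involves an object, such as ingestion or a fixity check."""
    identifier: str
    event_type: str
    date_time: str  # ISO 8601 timestamp
    outcome: str
    agent_ids: List[str] = field(default_factory=list)


@dataclass
class PreservationObject:
    """A digital object under preservation, with basic technical metadata."""
    identifier: str
    format_name: str
    checksum_algorithm: str
    checksum: str
    event_ids: List[str] = field(default_factory=list)


def record_event(obj: PreservationObject, event: Event) -> PreservationObject:
    """Link an event to an object by identifier, as the entities reference each other."""
    obj.event_ids.append(event.identifier)
    return obj


# Hypothetical usage: all identifiers and values below are made up.
agent = Agent("agent-1", "fixity-checker")
event = Event("event-1", "fixity check", "2015-06-19T12:00:00Z", "pass", [agent.identifier])
obj = PreservationObject("obj-1", "image/tiff", "SHA-256", "abc123")
record_event(obj, event)
```

Keeping the links as identifiers rather than nested objects loosely mirrors how entities point at one another, which is what makes such records exchangeable between repositories.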
Thursday, June 18, 2015
Funding for preservation software development
Funding for preservation software development. Gary McGath. File Formats Blog. June 9, 2015.
The Open Preservation Foundation is launching a new model for funding the development of preservation-related software. The model allows organisations to support individual digital preservation software products and ensure their ongoing sustainability and maintenance. "US libraries have been rather insular in their approach to software development. They’ll use free software if it’s available, but they aren’t inclined to help fund it. If they could each set aside some money for this purpose, it would help assure the continued creation and maintenance of the open source software which is important to their mission."
File identification tools, part 5: JHOVE
File identification tools, part 5: JHOVE. Gary McGath. File Formats Blog. June 11, 2015.
JHOVE is a tool that identifies and validates AIFF, GIF, HTML, JPEG, JPEG2000, PDF, TIFF, WAV, XML, ASCII, and UTF-8 files. Unrecognized files are called a “Bytestream.”
Key concepts in JHOVE are “well-formed” and “valid.” A file which is “well-formed but not valid” has errors, but not ones that should prevent rendering. JHOVE focuses on the semantics of a file rather than its content. It only reports full conformance to a profile. It won’t tell you why it fell short.
Download JHOVE from GitHub or the Open Preservation Foundation (do not download from SourceForge). Documentation is on the OPF website. A developer's guide is also available: JHOVE Tips for Developers.
It shouldn't be confused with JHOVE2, which does similar things but has a different code base.
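The difference between “well-formed” and “valid” can be sketched as a two-level check. This is not JHOVE code; the function name and the status strings are assumptions for illustration that only mirror the concepts described above.

```python
def describe_status(well_formed: bool, valid: bool) -> str:
    """Combine the two checks into one status, in the spirit of the concepts above."""
    if not well_formed:
        # A file that fails the basic structural rules cannot be valid either.
        return "not well-formed"
    if not valid:
        # Structure is fine, but some further requirement of the format is not met.
        return "well-formed but not valid"
    return "well-formed and valid"


# Hypothetical usage
status = describe_status(well_formed=True, valid=False)
```

Validity is only checked once well-formedness holds, so there are three possible outcomes rather than four.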
File identification tools, part 4: ExifTool
File identification tools, part 4: ExifTool. Gary McGath. File Formats Blog. June 8, 2015.
The ExifTool, which analyzes Exif metadata, is an open source library and command line tool for identifying, editing, and extracting metadata from many formats, such as image, movie, and audio files. The command options can return metadata in a variety of formats or edit files.
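As a small sketch of the command line use, the Python wrapper below asks ExifTool for JSON output and parses it. It assumes the `exiftool` executable is installed and on the PATH; the helper names `build_exiftool_command` and `read_metadata` are made up for this example.

```python
import json
import subprocess
from typing import Dict, List


def build_exiftool_command(path: str) -> List[str]:
    """Build the argument list; -json makes ExifTool print a JSON array, one object per file."""
    return ["exiftool", "-json", path]


def read_metadata(path: str) -> Dict[str, object]:
    """Run ExifTool on one file and return its metadata as a dictionary."""
    result = subprocess.run(
        build_exiftool_command(path),
        capture_output=True,
        text=True,
        check=True,  # raise if ExifTool exits with an error
    )
    return json.loads(result.stdout)[0]
```

Calling `read_metadata("photo.jpg")` on a hypothetical file would return whatever tags ExifTool finds for it; the exact keys depend on the file.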
Wednesday, June 17, 2015
IFI Irish Film Archive publishes new Digital Preservation and Access Strategy
IFI Irish Film Archive publishes new Digital Preservation and Access Strategy. Seán Brosnan. Irish Film and Television Network. 16 Jun 2015. [PDF]
The Irish Film Archive has published a new Digital Preservation & Access Strategy that addresses archiving vast quantities of moving image material in a digital environment. “This crucial document ... outlines a long-term plan and a set of guiding principles, flexibility, scalability and sustainability, which will assist us in preserving and providing access to Ireland’s digital moving image material.”
In order to fully achieve its mission, the organization must be able to preserve and make available the digital collections in its care, by doing the following items:
- Publish a comprehensive Digital Preservation and Access Strategy document.
- Upgrade the Archive’s technology and equipment to facilitate digital activities.
- Develop policies and procedures to support the digital strategy and asset management
- Develop a trusted digital repository
- Increase access to the collections
- Secure funding to ensure digital preservation and access are supported and sustainable.
- Develop specialist digital management and preservation skills
- Develop the Archive’s existing database to allow the recording of necessary metadata
- Follow international examples of best practice
- Create and maintain on-going partnerships
The digital strategies and solutions must ensure efficiency and cost effectiveness by being flexible, scalable, and sustainable. The Digital Preservation and Access Strategy must consider the short-, medium-, and long-term needs of the organization in safeguarding and sharing digital collections. The risks and challenges they face include:
- Technical obsolescence
- Lack of Standards
- Expense: Creating and maintaining the infrastructure, equipment and trained people will be an ongoing, costly and resource-heavy activity, but it is vital to ensure the availability of digital collections to future generations. An efficient digital preservation strategy "cannot be achieved with sporadic investment, but must be supported by regular and predictable funding"
The document also describes benefits, opportunities, stakeholders and strategic priorities. The strategic objectives include:
- To maintain the Archive to the highest international standards
- To improve resources within the Archive
- To improve access to the collections
- To heighten awareness of the collections for current and future generations
The Digital Preservation & Access Strategy document will be an essential tool in helping the organization become a Digital Archive that meets internationally recognized standards of excellence. This will be a living document and will be reviewed regularly to ensure it still meets the strategic objectives. It will take into account any changes in access or preservation technology, policies, and budgets. The document will also be supplemented by additional policies over time.
How Do Users Search and Discover?
How Do Users Search and Discover? Christine Stohn. Ex Libris. May 2015.
This paper describes the findings of a recent Ex Libris user study and discusses how the conclusions apply to library discovery systems. The challenging goal is to enable libraries to bring together users, their research intent and needs, and the wealth of available information. Intelligence about users’ behavior, intent, and expectations in the context of the tasks that the users strive to accomplish is of key importance in achieving this goal. User studies are not the same as usability studies. The core functionality of a discovery system is search and find, but different users have different needs.
Tuesday, June 16, 2015
Digital Preservation links
Digital Preservation links. Julie C. Swierczek. scholar.harvard.edu website. June 2015.
A very thorough and up-to-date list of digital preservation resources on the internet. There are many topics, such as:
- Digitization
- Digital Forensics
- Digital Accessioning Workflows
- Guidelines/best practices
- Reports/whitepapers/publications/bibliographies/journals
- Metadata, vocabularies, XML and harvesting
- Standards
- Projects of interest
- Blogs
- Web archiving
Monday, June 15, 2015
Digital Preservation 101, or, How to Keep Bits for Centuries
Digital Preservation 101, or, How to Keep Bits for Centuries. Julie C. Swierczek. scholar.harvard.edu website. June 4, 2015.
An interesting presentation that looks at what digital archivists need to preserve electronic records permanently. It discusses:
- OAIS model for long-term digital preservation;
- requirements of a trustworthy digital repository;
- preferred file formats for long-term storage;
- digital forensics and FRED machines;
- definitions and differences of “archives” and “backup copies”.
Preserving the Born-Digital Record: Many more questions than answers
Preserving the Born-Digital Record: Many more questions than answers. James G. Neal. American Libraries. May 28, 2015.
The world is producing vast amounts of born-digital material. The volume, complexity, and dynamism of this information challenge us to think creatively about its capture, organization, and long-term preservation and usability. What is the role of the library? Is this a source of failure or opportunity for the global library community?
This is an issue of integrity, of the collective adherence to a code and standard of values, of maintaining human records as complete, unimpaired, and undivided as possible. The ability to consult the evidence and sources used by researchers and authors will be lost if those digital records are not available. The ability to research and investigate the history and current state of our world will be compromised if born-digital materials are gone or changed. The ability to access the sources of record will be difficult if they are deposited and dispersed into multiple and disparate sites. This is the challenge of repository chaos.
At the core of born-digital content preservation and archiving are four principles.
- We cannot preserve what we have not collected.
- We must enable access, which brings persistence.
- We must secure and curate the content.
- We must take care of the content as stewards.
Quality equals content plus functionality. To make sure that the born-digital content is preserved and usable in the long term we must understand and accommodate the important characteristics of digital information. With born-digital resources we must also consider the relationship among form, text, and function, context, renderability, and versioning over time. "We see the inevitability of physical and format obsolescence, the importance of authenticity and provenance, and the role of standards such as globally unique identifiers."
The scope, depth, and cost of the threat mean that individual libraries cannot adequately preserve born-digital content alone. We need to promote cooperation and new public–private partnerships. The Digital Preservation Network (DPN) is an example of this. "We will not have the technologies, tools, workflows, or standards unless we work together in new ways."
Libraries must take on responsibility for the preservation of born-digital content.
Friday, June 12, 2015
TIFF/A Standard Initiative
TIFF/A Standard Initiative. Website. June, 2015.
The TIFF/A standard initiative intends to create an ISO specification of an archival TIFF format. TIFF is a widely used format, but it is complex and has some features not suited for long-term preservation. The TIFF/A specification will be enhanced with mandatory and forbidden tags for archival purposes, similar to PDF/A. "This standard will be created in parallel with DPF Manager, an open source TIFF format validator that, in addition to the current TIFF ISO Standards, will be the first conformance checker for the TIFF/A new standard." The group looks to create a community of experts interested in discussing the initiative in order to prepare a proposal to submit to the ISO.
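The idea of mandatory and forbidden tags can be sketched as a simple set check. The tag lists below are purely hypothetical, since the post does not give the TIFF/A rules; the tag numbers themselves are ordinary TIFF tag IDs, and this is not how DPF Manager is implemented.

```python
from typing import Dict, List, Set

# Hypothetical rule sets for illustration only; the real TIFF/A rules are not given here.
REQUIRED_TAGS: Dict[int, str] = {
    256: "ImageWidth",
    257: "ImageLength",
    262: "PhotometricInterpretation",
}
FORBIDDEN_TAGS: Dict[int, str] = {
    330: "SubIFDs",  # arbitrary example of a tag a profile might disallow
}


def check_tags(present: Set[int]) -> List[str]:
    """Return a list of problems for the set of tag IDs found in a file."""
    problems = []
    for tag in sorted(REQUIRED_TAGS):
        if tag not in present:
            problems.append(f"missing required tag {tag} ({REQUIRED_TAGS[tag]})")
    for tag in sorted(FORBIDDEN_TAGS):
        if tag in present:
            problems.append(f"forbidden tag {tag} ({FORBIDDEN_TAGS[tag]}) present")
    return problems


# Hypothetical usage: a file that lacks tag 262 and carries tag 330.
problems = check_tags({256, 257, 330})
```

An empty result means the file meets this toy profile; a real conformance checker would also inspect tag values, not just their presence.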
File identification tools, part 3: DROID and PRONOM
File identification tools, part 3: DROID and PRONOM. Gary McGath. File Formats Blog. June 1, 2015.
DROID (Digital Record Object IDentification) is an open source, Java-based tool from the UK National Archives that is designed to identify and verify files for digital repositories. It relies on file format information from the National Archives’ registry, PRONOM. "DROID depends on files that describe distinctive data values for each format". It can verify single files or large batches of files, or it can be integrated into other applications. DROID generates reports about each file and its identification and verification, or reports that it cannot match the file type. Sometimes it may report that a file has more than one matching signature, for example when there is more than one version of a format.
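Signature-based identification can be sketched as matching a file's leading bytes against known "distinctive data values". The table below is a toy list of well-known magic numbers, not PRONOM data, and the catch-all GIF entry is an assumption added to show how one file can match more than one signature. DROID's real signatures are far richer than a prefix test.

```python
from typing import List, Tuple

# Toy signature table: (format name, bytes expected at the start of the file).
SIGNATURES: List[Tuple[str, bytes]] = [
    ("PDF", b"%PDF-"),
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("GIF (version unspecified)", b"GIF8"),  # hypothetical catch-all entry
    ("GIF 87a", b"GIF87a"),
    ("GIF 89a", b"GIF89a"),
    ("TIFF (little-endian)", b"II*\x00"),
    ("TIFF (big-endian)", b"MM\x00*"),
]


def identify(data: bytes) -> List[str]:
    """Return every format whose signature matches the start of the data."""
    return [name for name, magic in SIGNATURES if data.startswith(magic)]


# Hypothetical usage with in-memory byte strings instead of real files.
gif_matches = identify(b"GIF89a\x01\x00\x01\x00")
unknown_matches = identify(b"hello world")
```

Returning a list rather than a single name keeps the ambiguous and the unmatched cases visible, which is the behavior the report describes.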
Thursday, June 11, 2015
What We've Saved (2004-2014)
What We've Saved (2004-2014). Andy Jackson. UK Web Archive. June 11, 2015. [PPT slides]
After 10 years of the UK web archive, what has been saved? Three collections, over 8 billion resources, and 160 TB of compressed data. "Looking inward is not enough: To understand the value of our collection, we need to look beyond our walls and put it in context." A review shows how much has been lost from the web. Almost 100% of the crawled URLs in the UK web archive are gone or missing on the internet. About 40% of those from 2013 are gone or missing. Link rot and content drift dominate:
- 50% of resources unrecognisable or gone after 1 year
- 60% after 2 years, 65% after 3 years (islands of stability)
- Noticeably higher rot rate than results for legal/academic web
Simple similarity measures provide some insights, but more work is needed to look for old content in new locations.
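The post does not say which similarity measure was used. As one possible stand-in, the sketch below compares an archived copy of a page with its live version using the ratio from Python's standard `difflib` module. The function names and the 0.5 threshold are assumptions for illustration only.

```python
from difflib import SequenceMatcher


def similarity(archived_text: str, live_text: str) -> float:
    """Return a score between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, archived_text, live_text).ratio()


def has_drifted(archived_text: str, live_text: str, threshold: float = 0.5) -> bool:
    """Flag a page as drifted when the live copy is too unlike the archived one."""
    return similarity(archived_text, live_text) < threshold


# Hypothetical usage with made-up page text.
unchanged = has_drifted("UK Web Archive home page", "UK Web Archive home page")
replaced = has_drifted("UK Web Archive home page", "404")
```

A measure like this only compares one URL with itself over time; it cannot find old content that has moved to a new location.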

Libraries and Research Data Services
Libraries and Research Data Services. Megan Bresnahan, Andrew Johnson. University of Colorado Boulder. 2014. [PDF]
A presentation that looks at the importance of training librarians to become experts in research data services (RDS). “Reassigning existing library staff is the most common tactic for offering RDS. This approach also needs to be supported with professional development for staff so they can gain the required expertise to provide the full range of RDS”
Some feedback from Subject Librarians shows that they know it is part of their duties and that it is becoming more important, but that it is difficult to accomplish in the current environment:
- “Research data is intimidating!”
- “How can I take on research data support with so much else already on my plate?!”
- “I need practical tools to use to help researchers with their data”
- “Helping faculty and students with their data is an increasingly important part of my liaison duties”
- Understand the stages
- Define the role
- Apply skills
- Plan for outreach
- Feel confident
- Engage with researchers
Wednesday, June 10, 2015
National Library of the Netherlands-Portico Partnership at the Forefront of Digital Preservation and International Collaboration
National Library of the Netherlands-Portico Partnership at the Forefront of Digital Preservation and International Collaboration. Portico Press release. 28 May 2015.
The National Library of the Netherlands (KB) and Portico created a new partnership that will preserve e-journals through the KB’s e-Depot program, which preserves locally published content in the Netherlands. The program focuses on the preservation of e-journals from international scientific publishers. “At the KB, our preservation strategy has been to use a variety of solutions and tools, and not rely on any single method.” Portico has been a key collaborator in their preservation work over the years.
The e-Depot already preserves more than 15 million e-journal articles. Portico will provide the KB with preservation-formatted e-journal content from scientific publishers who are new to the program, including BioOne, Walter de Gruyter, Wolters Kluwer, Karger, Brill, and Thieme. An e-journal trigger event would allow the KB to provide journal access to Dutch researchers.
Tuesday, June 09, 2015
FFmpeg's FFV1 lossless video codec: A Free Software success story
FFmpeg's FFV1 lossless video codec: A Free Software success story. Peter Bubestinger. 9 May, 2015. Published 2 June 2015.
The Österreichische Mediathek wanted to digitize video for long-term preservation and looked at the options. For audio formats they used uncompressed/lossless PCM (WAV). They expected to apply the same requirements to video, but uncompressed video was not an option because the files were too large. The majority of institutions preserving video use lossy codecs:
- IMX, ProRes, MPEG-2, MPEG-4
- Industry proposed format: JPEG2000-lossless/PCM in MXF
- Double checked FFV1's long-term sustainability
- Tested its implementation
- Documented its usage
- Organized and funded improvement
The FFV1 Video Codec Specification by Michael Niedermayer.
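For readers who want to try the codec, the sketch below builds an FFmpeg command line that encodes video to FFV1. It assumes `ffmpeg` is installed; the helper name and file names are hypothetical, and these options are a minimal example, not the Mediathek's documented settings.

```python
from typing import List


def build_ffv1_command(source: str, target: str) -> List[str]:
    """Build an ffmpeg argument list that re-encodes the video stream as FFV1."""
    return [
        "ffmpeg",
        "-i", source,      # input file
        "-c:v", "ffv1",    # select the FFV1 video encoder
        "-level", "3",     # use FFV1 version 3
        "-c:a", "copy",    # pass the audio stream through unchanged
        target,
    ]


# Hypothetical usage; pass the list to subprocess.run to execute it.
command = build_ffv1_command("tape_capture.avi", "tape_capture.mkv")
```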
Lessons learned:
- User perception is of great importance
- Provide graphical tools
- No one wants to be the first to try something out
- The look of a website is important. A "cheap looking website" gives an impression of a low quality product
- Provide documentation
- Make access easy
- Provide option of paid setup and support
Monday, June 08, 2015
New Sources and Storage Options For Rosetta
Rosetta Users Group 2015: New Sources and Storage Options For Rosetta. Chris Erickson. June 3, 2015. [PDF slides]
This is my presentation at the Rosetta User Group / Advisory Group meeting held this past week.
We installed Rosetta in March 2012 and have ingested a number of collections in the preservation repository. In addition to those sources and collections we have already set up to work with Rosetta, we have been working with some new areas. These include:
- University Academic electronic records in SharePoint. Rosetta harvest tool for SharePoint
- New Library repositories, such as Digital Commons
- Harvest Canon camera raw images and ingest into Rosetta
- Ingest University videos and digitized files from the Audio Digitization project
- Program to gather information from unstructured folders of archival objects and ingest into Rosetta
One of the things that everyone is struggling with is the lack of sufficient storage. Conventional storage is expensive and limited. So we are investigating alternative long term storage possibilities:
- Develop a Proof of Concept Project to ingest Rosetta content in DPN
- Amazon S3 Cloud Storage from Rosetta. Easy to connect to Rosetta.
- Hitachi - LG Data Storage (HLDS) Optical Archive System
- Long Term storage system
- Lowest storage cost
- Single rack unit holds 1 PB of permanent storage; unlimited expansion
- Currently testing with our Rosetta system
- Reduce the need for refreshing or migrating content
- Plan to use with Millenniata M-Discs in the archive
What we are looking for in preservation storage:
- Sufficient capacity for our ever increasing content
- Reasonable long term cost
- Lower total costs of ownership
- Reduced cost of refreshing or migration
- Reliable and recoverable
- Archival media
- Industry Storage Partner
- Multiple copies, locations
- Secure storage
- Accessible
- Rosetta
- Network
Thursday, June 04, 2015
Writing in the Sand
Writing in the Sand. Chris Erickson. June 4, 2015.
Recently I have been sorting through some family photographs from long ago. One image, digitally born, was taken on a family holiday at a beach. In the sand we had written a birthday greeting to our son. The greeting, which only lasted a short time, was erased by the wind and waves. The snapshot is all that is left of the original message.
It reminded me of the problems we now face regarding digital content. Like writing on the beach, our digital data is based on electric or magnetic charges on wafers of sand. The messages are likewise temporary, erased before long by the elements.
There are lots of problems encountered in preserving digital content, but it seems the first problem is that we entrust our permanent messages to temporary media. We then rely on complex procedures to lengthen the life of the temporary media. Or wring our hands when the media fail. What do we expect when we use magnetic-based media while sitting on the world's largest magnet?
What will help? I suggest that we change our mindset and stop writing our precious, permanent content to temporary media. (And I am not suggesting that we go towards unique one-off solutions that must be created in a lab at enormous cost.) There are few options that can be used, but if we decide to make this our goal there will be more innovation in the area. One option that does exist right now is the M-Disc technology. It was designed at our university with permanent storage in mind: cost-effective digital storage that anyone can create, that does not degrade over time, and that is unaffected by the elements. There will be other permanent solutions for storage, metadata, and access if we look for them.
Let's decide to move towards permanence and find solutions that last more than a few years and do not need constant care.
Wednesday, June 03, 2015
PREMIS Data Dictionary for Preservation Metadata: Approved Changes for version 3.0
PREMIS Data Dictionary for Preservation Metadata: Approved Changes for version 3.0. Library of Congress. November 24, 2014.
The following changes were approved for PREMIS version 3.0:
- Make Intellectual Entity another category of Object.
- Define preservationLevelType to indicate the type of preservation functions expected to be applied to the object for the given Preservation level.
- AgentVersion was added to record the version of software Agents.
- The data model was changed so that Environments can be described and preserved reusing the Object entity.
- Physical Objects can be described as representations and be related to digital objects.
- A value of unknown will be added to compositionLevel and format if the information is not available.
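As a rough illustration of how these changes might look in a data model, here is a minimal Python sketch. It is not the PREMIS schema or any official library: the class names, field names, and the example values (such as "logical preservation" and "JHOVE") are assumptions made for illustration only. It shows Intellectual Entity as one more category of Object, an optional preservation level type, an optional agent version, and "unknown" as the fallback for composition level and format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

# Fallback value used when the information is not available.
UNKNOWN = "unknown"


class ObjectCategory(Enum):
    """Object categories; Intellectual Entity is the one added in version 3.0."""
    INTELLECTUAL_ENTITY = "intellectual entity"
    REPRESENTATION = "representation"
    FILE = "file"
    BITSTREAM = "bitstream"


@dataclass
class PreservationObject:
    """Hypothetical stand-in for an Object; field names are illustrative."""
    identifier: str
    category: ObjectCategory
    composition_level: str = UNKNOWN   # "unknown" when not available
    format_name: str = UNKNOWN         # "unknown" when not available
    preservation_level_type: Optional[str] = None


@dataclass
class PreservationAgent:
    """Hypothetical stand-in for an Agent, with the new version field."""
    name: str
    agent_type: str
    agent_version: Optional[str] = None  # version of a software agent


# An intellectual entity and a file whose format is not yet known.
collection = PreservationObject("coll-001", ObjectCategory.INTELLECTUAL_ENTITY)
image = PreservationObject(
    "file-042",
    ObjectCategory.FILE,
    preservation_level_type="logical preservation",  # illustrative value
)
tool = PreservationAgent("JHOVE", "software", agent_version="1.11")
```

An environment could be modeled the same way, as another PreservationObject, which is the point of reusing the Object entity instead of keeping a separate structure.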
Tuesday, June 02, 2015
Data Archives and Digital Preservation
Data Archives and Digital Preservation. Council of European Social Science Data Archives. June 1, 2015.
Data archives play a central role in research. Data is considered “the new gold”. There is increasing pressure on researchers to manage, archive, and share their data. It is important to securely store research data, and to allow researchers to reuse data in their own analyses or teaching.
Archives are much more than just a storage facility; they actively curate and preserve research data. They must have suitable strategies, policies, and procedures to maintain the usability, understandability, and authenticity of the data. There are also numerous requirements from users, data producers, and funders. In preserving and sharing social science research data, archives have the added responsibility of protecting the human subjects of the research.
The CESSDA site has many resources. Some of these are:
- What is digital preservation
- OAIS
- Data appraisal and ingest
- Documentation and metadata
- Access and reuse
- Trusted digital repositories: audit and certification.