Floppy disks and modern gadgets: Keeping a safe distance. Isaiah Beard. Page2Pixel. March 25th, 2016.
In preserving older, born-digital documents and data, a common situation is that people seek help migrating data from old floppy disks but are not always careful about what they put next to the disks. People who used floppy disks and other magnetic media understood the need to keep them away from anything that generated magnetic fields. Data on a floppy disk is stored magnetically, which makes the disks very sensitive to magnetic and electromagnetic fields. Today's storage media (USB drives, memory cards, optical discs and such) are not susceptible to magnetic fields, but cell phones, tablets, and other equipment often have strong magnets in them. It is "important to remind people that, should they come across an old floppy disk, and they would like to save the data, they must be careful where they place it, and what newer devices come into contact with it. Old floppies should be kept as far away from strong magnets as possible. And smartphones, tablets and even modern laptops shouldn’t come within 6 inches of floppies or any other magnetic media that could be easily erased." In addition, 3.5" floppy drives are becoming harder to find.
This blog contains information related to digital preservation, long term access, digital archiving, digital curation, institutional repositories, and digital or electronic records management. These are my notes on what I have read or been working on. I enjoyed learning about Digital Preservation but have since retired and I am no longer updating the blog.
Thursday, March 31, 2016
Tuesday, March 29, 2016
Some Assembly Required – Micro-services and Digital Preservation
Some Assembly Required – Micro-services and Digital Preservation. Danielle Spalenka. POWRR Blog. March 22, 2016.
Very informative article about how micro-services and tools can benefit libraries of all sizes and financial abilities. Many struggle with implementing a digital preservation infrastructure. The University of California created a set of free-standing but inter-operable applications, each performing a single or limited number of tasks in the larger curation and preservation process, which they described as a micro-services approach.
This approach, which did not require the installation of a single, long-lived application, can help medium-sized and smaller institutions to identify and achieve digital preservation goals. The set of twelve independent but compatible micro-services performed preservation functions such as identity, storage, fixity, replication, inventory, ingest, index, search, transformation, notification and annotation. These simple utilities would pose fewer challenges in their development, deployment, maintenance and enhancement than a large, integrated system. The strategic combination of individual services could produce “the complex global function needed for effective curation” at large institutions. The Digital POWRR (Preserving Digital Resources with Restricted Resources) Project team began a study of the problems and possible solutions for preserving digital objects.
Some had understood digital curation and preservation to be an either/or issue: either an institution had implemented a digital preservation system or it had not. In reality, the preservation activities are "an ongoing, iterative set of actions, reactions, workflows, and policies." This means institutions can begin taking small steps rather than waiting to devise an ideal solution. The NDSA Levels of Digital Preservation provide a yardstick to measure progress toward a digital curation and preservation capacity. Two fundamental understandings at the heart of a micro-services approach are:
- digital curation and preservation is an uncertain process in which continuous, rapid technological change often renders monolithic, integrated applications cumbersome and outdated;
- simple tools focused on a specific aspect or aspects of the process can prove more helpful.
The use of individual tools performing discrete functions can help those starting preservation activities. The Digital POWRR Project has described the stages of a digital curation and preservation workflow and associated activities. Tools performing these functions are available at the COPTR web site.
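To make the idea of a single-purpose tool concrete, here is a minimal sketch of what a stand-alone fixity utility might look like. It is not one of the University of California micro-services or a tool listed at COPTR; the function names (`sha256_of`, `build_manifest`, `verify`) and the choice of SHA-256 are assumptions made only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root, POSIX style) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the recorded paths that are now missing or have a changed checksum."""
    current = build_manifest(root)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A manifest built at ingest can be stored alongside the files and passed to `verify` later; an empty result means every recorded file is still present and unchanged. The tool does one job only, which is the point of the approach described above.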
Exploring appraisal, quality assurance and risk assessment in the data continuum
Exploring appraisal, quality assurance and risk assessment in the data continuum. Linda Ligios. Pericles blog. 8 March 2016.
PERICLES presented a workshop on "appraisal, quality assurance and risk assessment in relation to the lives of complex digital objects." It introduced the key concepts of:
- model-driven preservation in a continually evolving environment
- appraisal processes that lend themselves to being automated,
- development plans for tools on appraisal, risk assessment and quality assurance
- Risk – probability of an entity being non-usable
- Proximity – time frame in which we consider risk/impact
- Impact – potential loss of functionality and cost of mitigating actions
Related topics:
- PERICLES: towards model-driven preservation. Poster by Jean-Yves Vion-Dury.
- PERICLES Environment Extraction Project
Monday, March 28, 2016
Translating theory to practice : defining digital preservation planning in museums
Translating theory to practice : defining digital preservation planning in museums. Emma Palakika James. Thesis, San Francisco State University. January 2016. [PDF, 292 pp.]
A very interesting thesis looking at digital preservation as an emerging activity in museums today. Some of the chapters discuss Threats to Digital Objects; What is Digital Preservation? (including the basics, OAIS, Trusted Digital Repository, and methods); case studies; and Digital Preservation Policy. Some notes of interest:
- The development of technology as a tool for work, research, information capture, and artistic expression, as well as the increasing percentage of important cultural materials created only in digital form, argues that museums must begin to focus on digital preservation.
- Four key themes are discussed, including
  - defining digital preservation,
  - integration of digital preservation technology,
  - collaboration, and
  - policy development
- currently, digital preservation remains a new, and not-broadly practiced activity in museums.
- The practice of digital preservation will become increasingly important to the museum field, and should be considered with the same responsibility and effort as traditional museum collection management.
- If museums are going to continue their role as well-equipped stewards for the cultural heritage of today and of our future, then digital preservation will need to be adopted within the broader scope of museum work.
- digital preservation in a museum context must be viewed and implemented from a collections management perspective.
- In some form or another, eventually all museums will adopt digital technology into their institutional assets, museum archives, and museum collections, all of which will continually be expected to be cared for and preserved just as long as any analog collections.
- While digitization was the very beginning of increased public access to collections, digital preservation is simply the flip side of ensuring ongoing access — providing consistent entry to information that is already manifested in digital form.
- If access to collections is becoming a mainstreamed part of the Museum’s responsibility, the ongoing access to born-digital institutional assets is also certainly worthy of consideration.
- Storage has been the default long term preservation strategy used by museums for traditional collections, but it is the shortest-term solution for new media
- American Institute for Conservation of Historic and Artistic Works: “every institution has a responsibility to safeguard the collections that are entrusted to it. That responsibility includes incorporating preservation and conservation awareness into all facets of the institution’s activities so as to ensure the long-term preservation of its collections”
- active digital preservation tactics should be accessible, manageable, and realistic solutions.
- Collaboration is necessary to ensure preservation, especially if the museum field wants to improve its stewardship of digital materials
- collaboration between libraries, archives, and museums will be a critical factor for whether the greater museum field can achieve digital preservation to the level of a ‘Trusted Digital Repository,’ which arguably, is the ideal level of preservation for medium to long-term stewardship.
- Together, LAMs can ensure that ‘born-digital’ documents and artifacts become integrated into the cultural record through various levels of digital preservation activity that will help to keep them accessible, and to become a permanent part of the cultural memory of future generations.
- most museum collection management policies do not address the issues of digital preservation or digital stewardship.
- the need to draft and implement a digital preservation policy is of equal importance to that of collection management policy for a museum.
- a full-formed policy is a way for the staff, and hopefully eventually upper management, to organize the overall mission, goals, scope, staff roles, and basic procedures. This may help better define how the staff can tackle digital preservation, making it a less intimidating process and also documenting its official initiation
- the technology in the three case studies falls into three categories: digital asset management systems, OAIS compliant software, and storage media.
- a digital asset management system (DAMS) is not a digital preservation system in and of itself, because a DAMS does not usually follow the specific recommendations for metadata, fixity checks, and formats that are put forth by the Trusted Digital Repository model, OAIS and other standards
- Digital preservation systems ultimately are a set of processes, protocols, and policies that are most often mediated with some technological aspect to aid in creating information packages suitable for long-term storage
- Digital preservation is not just a technology problem, but it is a management issue.
- Policy is a tangible method for institutions to outline the management support of their preservation activities
The thesis includes five conclusions concerning the state of digital preservation in museums:
- preservation is possible;
- standards, guidelines, and best practices are already available, but use them wisely;
- embrace new practices in policy;
- collaboration will be key for success; and
- embrace change and act now.
Saturday, March 26, 2016
Caring for file formats
Caring for file formats. Ange Albertini. Presentation at Troopers 2016. March 17, 2016. [PDF]
The risk to preserving digital objects is very high. The "attack surface with file formats is too big". The specifications of formats are a nice guide, but they don't represent reality; they are useless for managing the formats. "We can’t deprecate formats because we can’t preserve and we can’t define how they really work."
The formats need good documentation to show the landscape and "to express the reality of file format". Once they are better understood, then "we can preserve and deprecate older format, which reduces attack surface". Then people can focus on making the present formats more secure.
What is a file format? A computer dialect to communicate between communities; file formats are community connectors. People don't care about the format itself; they care about its characteristics and how easy it is to use. We don't need new formats, since reality will diverge from the specs anyway. The need is for up-to-date, traceable specs. Formats are constantly being updated with new features, but that doesn't solve the problem. Specs should reflect reality and be "updated, enforced, realistic, freely available". Deprecation is a natural cycle, but people are afraid to deprecate because "no file format is fully preserved". Formats should be open and the specs kept up to date. But it won’t happen until "we experience a great disaster".
Friday, March 25, 2016
National Archives permits us to learn from mistakes
National Archives permits us to learn from mistakes. Peter Charleton, Supreme Court judge. The Irish Times. Feb 8, 2016.
For the National Archives in Ireland, 1922 was a disaster. A direct hit from artillery destroyed centuries of records. The census records from 1821 through to 1891 were almost completely destroyed. Since then, the National Archives has tried to supplement its damaged holdings, but what has been lost is gone forever. With many places moving from paper to digital records, history is on the point of repeating itself. The traditional policy of printing files to preserve a digital record no longer works. Files may be on several computers in several iterations; they may have "elements in office systems, email, even text messages or a tweet." With digital, there is a lot of data, which brings a challenge of what to preserve. But not every record needs preservation.
To ensure the preservation of long-term records, they should be transferred to the National Archives. Permanent records need to be identified early and treated appropriately. Creation of a digital archive will greatly reduce the volume of records that government departments store. "Millions are spent by departments on off-site storage and back-ups of network drives. By investing in a digital archive, departments will be able to transfer emails, business files, digital images and other electronic records to the National Archives. An efficient approach to records management based on legal obligation can target policy effectively." Money should be directed to the National Archives for developing an efficient system so there are sufficient resources to capture, manage and preserve our digital heritage. This institution, the precious repository of this nation, deserves to be supported in ensuring Ireland continues to have a history.
Thursday, March 24, 2016
The FAIR Guiding Principles for scientific data management and stewardship
The FAIR Guiding Principles for scientific data management and stewardship. Mark D. Wilkinson, et al. Nature. 15 March 2016. [PDF]
"There is an urgent need to improve the infrastructure supporting the reuse of scholarly data." Good data management is not a goal in itself, but a conduit leading to knowledge discovery, innovation and the reuse of the data. The current digital ecosystem prevents this, which is why the funding and publishing community is beginning to require data management and stewardship plans. "Beyond proper collection, annotation, and archival, data stewardship includes the notion of ‘long-term care’ of valuable digital assets" so they can be discovered and re-used for new investigations.
This article describes four foundational principles (FAIR) to guide data producers and publishers:
- Findability
  - data are assigned a globally unique and persistent identifier
  - data are described with rich metadata
  - metadata clearly include the identifier of the data it describes
  - data are registered or indexed in a searchable resource
- Accessibility
  - data are retrievable by their identifier using a standardized communications protocol
  - the protocol is open, free, and universally implementable
  - the protocol allows for an authentication and authorization procedure
  - metadata are accessible, even when the data are no longer available
- Interoperability
  - data use a formal, accessible, shared, and broadly applicable language for knowledge representation
  - data use vocabularies that follow FAIR principles
  - data include qualified references to other (meta)data
- Reusability
  - (meta)data are richly described
  - (meta)data have a clear data usage license
  - (meta)data have a detailed provenance
  - (meta)data meet community standards
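As a rough illustration of how a few of these principles could be checked in practice, the sketch below tests a metadata record for some of the elements listed above. The field names (`identifier`, `description`, `access_protocol`, `license`, `provenance`) and the sample identifier are hypothetical; the FAIR article does not prescribe a schema, and passing this check would not by itself make a dataset FAIR.

```python
# Hypothetical field names; the FAIR article does not prescribe a schema.
FAIR_CHECKS = {
    "identifier": "Findable: a globally unique, persistent identifier",
    "description": "Findable: rich descriptive metadata",
    "access_protocol": "Accessible: retrievable via a standardized, open protocol",
    "license": "Reusable: a clear data usage license",
    "provenance": "Reusable: detailed provenance",
}


def missing_fair_fields(record):
    """Return the checklist fields that are absent or empty in a metadata record."""
    return [field for field in FAIR_CHECKS if not record.get(field)]


if __name__ == "__main__":
    # Placeholder record for illustration only; the identifier is made up.
    example = {"identifier": "doi:10.1234/example", "license": "CC-BY-4.0"}
    for field in missing_fair_fields(example):
        print(f"missing {field} ({FAIR_CHECKS[field]})")
```

A check like this only covers what can be read from the record itself; principles such as meeting community standards still need human judgment.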
Wednesday, March 23, 2016
New Report on Web Archiving Available
New Report on Web Archiving Available. Andrea Goethals. IIPC. 21 March 2016.
Harvard Library recently released a report to:
- explore and document current web archiving programs
- identify common practices, needs, and expectations in the collection of web archives
- identify the provision and maintenance of web archiving infrastructure and services;
- identify the use of web archives by researchers.
- Dedicate full-time staff to work in web archiving to keep up on latest developments, best practices and be part of the web archiving community.
- Conduct outreach, training and professional development for existing staff who are being asked to collect web archives.
- Institutional web archiving programs should be transparent about holdings, terms of use, preservation commitment, and curatorial decisions made for each capture.
- Develop a collection development tool to show holdings information to researchers and other collecting institutions.
- Train researchers to be able to analyze big data found in web archives.
- Establish a standard for describing the curatorial decisions behind collecting web archives.
- Establish a feedback loop between researchers and the librarians/archivists.
Tuesday, March 22, 2016
Do your Digital Records have an Expiration Date?
Do your Digital Records have an Expiration Date? Jon Tilbury. Information Management. March 22, 2016.
A general article about the importance of digital preservation. Some quotes of interest:
- As more “born digital” content is produced every day, requirements get more complex, and the need for organization-wide digital preservation strategies becomes greater.
- the consensus is that the “tipping-point” for accessing digital data is not 100 years after it is stored, but more realistically, around 10 years.
- Ten years is a more realistic time-frame to consider when planning protection for critical and unique digital information assets.
- Building on reliable storage, digital preservation adds tools to accurately identify which formats are being used, pinpoint those at risk, and reliably recycle these into newer formats that can be read.
- For many organizations, a proactive approach to safeguarding critical long-term digital records using digital preservation technology is fast becoming a critical part of the overall information governance lifecycle.
Monday, March 21, 2016
From digital dark age to digital enlightenment
From digital dark age to digital enlightenment. Caroline Pegden. National Archives. 17 February 2016.
Recent media reports have talked about the ‘digital Dark Age’. This is a major challenge, now and for the years to come, for institutions in the archives sector, who are concerned with managing, preserving and providing access to born-digital records. This is important for the UK National Archives because some government departments will soon transfer born-digital records to The National Archives under the Public Records Act. As the National Archives has been working on how to do this, their philosophy has been ‘learning by doing’. They have reviewed what other archival institutions around the world are doing to manage digital records and have been testing the process of transfers "to design and test the new process to appraise, select, sensitivity review, transfer, preserve and give access to born-digital records." Two major challenges are:
- extracting meaning from unstructured digital record collections in order to make appraisal and selection decisions.
- sensitivity reviewing born-digital records at scale without having to read all the individual documents
- The digital landscape in government 2014-15: business intelligence review
- The application of technology-assisted review to born-digital records transfer, Inquiries and beyond: research report
How many of the EOT2008 PDF files were harvested in EOT2012
How many of the EOT2008 PDF files were harvested in EOT2012. Mark Phillips. mark e. phillips journal. February 23, 2016.
A post in which the author looks at some of the data from the End of Term 2012 Web Archive snapshot at the UNT Libraries. From the EOT2008 Web archive, 4,489,675 unique (by hash) PDF files were extracted and then compared to see how many of those nearly 4.5 million PDFs were still around in 2012, when the federal Web was crawled again as part of the EOT2012 project. The findings:
After the numbers finished running, it looks like the following.
| | PDFs | Percentage |
|---|---|---|
| Found | 774,375 | 17% |
| Missing | 3,715,300 | 83% |
| Total | 4,489,675 | 100% |
So 83% of the PDF files that were present in 2008 are not present in the EOT2012 Archive. It is possible that the items are still available at a different URL entirely in 2012 when it was harvested again. So the URL might not be available but the content could be available at another location.
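The comparison itself is essentially a set intersection on content hashes. The sketch below is not the author's actual code; `compare_crawls` and its inputs are hypothetical, and it assumes each crawl has already been reduced to a collection of per-file hashes.

```python
def compare_crawls(earlier_hashes, later_hashes):
    """Count how many unique hashes from the earlier crawl reappear in the later one."""
    earlier = set(earlier_hashes)
    later = set(later_hashes)
    total = len(earlier)
    found = len(earlier & later)      # same content seen again, at any URL
    missing = total - found
    found_pct = round(100 * found / total) if total else 0
    missing_pct = round(100 * missing / total) if total else 0
    return {
        "found": found,
        "missing": missing,
        "total": total,
        "found_pct": found_pct,
        "missing_pct": missing_pct,
    }
```

Because the match is on the hash rather than the URL, a file that moved to a new address but was captured again would count as found. A file that was changed even slightly, or simply not crawled in 2012, would count as missing.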
Saturday, March 19, 2016
Preservation Watch
Preservation Watch. Barbara Sierman. DPC wiki. 12 February 2016.
Preservation Watch is a well-accepted concept that was first created during the European PLANETS project by Barbara Sierman and Paul Wheatley. It goes beyond the OAIS monitoring activities and helps provide a better description of the Preservation Planning Functional Entity. The Planets Functional Model identifies 3 key preservation functions:
- Preservation Watch,
- Preservation Planning and
- Preservation Action.
Preservation Watch monitors internal and external entities, including the repository content. It deals with both the OAIS Administration and Preservation Planning areas. The Preservation Watch has 4 sub-functions:
- Monitor: collates preservation information from a variety of internal and external entities.
- Risk Analysis: assessment of this information, relaying critical risks to Preservation Planning.
- Representation Information Update: provides updates, including recording Risks and Executed Preservation Plans.
- Testbed: a controlled environment for studying the operation of tools and services which will inform the Preservation Planning activities.
Friday, March 18, 2016
'A' is for AtoM
'A' is for AtoM. Jenny Mitcham. Digital Archiving at the University of York. 18 March 2016.
Jenny has been working on an implementation of Access to Memory (AtoM) for a couple of years and provides an interesting list of information about it, the A to Z* of implementing AtoM. "It turns out that deciding to adopt a system is relatively simple, working out exactly how you are going to use it is far more complex!" A few that I found that apply in most software situations:
- B is for Business as Usual: Any organisation adopting a new and complex system like AtoM needs to think beyond initial implementation and consider how the solution can be embedded into their workflows for the longer term.
- E is for Experimenting: We discovered that data may not always import in the way you expect.
- J is for Just Start!: Reading the documentation is essential but testing and experimenting with AtoM are really the best ways of working it out.
- N is for Not Perfect: AtoM (like all complex systems) has its limitations.
- Q is for Quality: In an ideal world, all our data within AtoM would be of a high quality... but we do not live in an ideal world. Accepting that legacy data will not always meet current standards or be as accurate as we would like is key to moving forward with a system such as this.
- T is for Training: Training is not just a one off exercise.
Applying DP Standards For Assessment & Planning
Applying DP Standards For Assessment & Planning. Bertram Lyons. PASIG 2016. March, 2016.
ISO 16363:2012, Audit & Certification of Trustworthy Digital Repositories, defines recommended practices for assessing the trustworthiness of digital repositories. The document will help those who audit repositories, but also those who design or redesign their digital repository processes. Some highlights from the standard:
3.1 Governance and Organizational viability: The repository shall have a collection policy or other document that specifies the type of information it will preserve, retain, manage, and provide access to. Without the policy the collection scope is unclear and it becomes difficult to say no to out of scope content. The standard expects a policy to exist and be documented.
4.2 Ingest: Creation of AIPs: Organizations should have a description of how AIPs are constructed from SIPs. It should document all changes to the processes, as well as defining what happens to the content (such as normalization of files, etc.)
5.2 Security Risk Management: The repository should have a written disaster preparedness and recovery plan, including at least one off-site backup of all preserved information together with an off-site copy of the recovery plan. This means the organization should be prepared administratively.
The elements are scored as follows:
- 0 - non-compliant or not started
- 1 - slightly compliant: needs a lot of work to address the requirement.
- 2 - half compliant: partially addressed but still significant work to do
- 3 - mostly compliant: mostly addressed and working on full compliance.
- 4 - fully compliant: can demonstrate the requirement is comprehensively addressed.
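As a rough illustration of how such scores might be tallied during a self-assessment, here is a minimal Python sketch. The section numbers come from the highlights above; the example scores, the threshold of 3, and the `summarize` helper are hypothetical and are not part of the standard or the presentation.

```python
# Hypothetical self-assessment tally using the 0-4 scale listed above.
SCALE = {
    0: "non-compliant or not started",
    1: "slightly compliant",
    2: "half compliant",
    3: "mostly compliant",
    4: "fully compliant",
}

def summarize(scores):
    """Return the average score and the requirements scoring below 3."""
    for section, score in scores.items():
        if score not in SCALE:
            raise ValueError(f"{section}: score must be 0-4, got {score}")
    average = sum(scores.values()) / len(scores)
    # Assumption: anything below "mostly compliant" is flagged for follow-up.
    needs_work = sorted(s for s, v in scores.items() if v < 3)
    return average, needs_work

# Made-up scores for the three requirements highlighted above.
example = {"3.1": 4, "4.2": 2, "5.2": 1}
average, needs_work = summarize(example)
print(f"average {average:.2f}; needs work: {', '.join(needs_work)}")
```

A tally like this only helps to prioritise; the evidence behind each score still has to be documented separately.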
- Documentation: records of policy, procedure, and outcomes of activities
- Policy: the definition of approaches and protocol for repository functions and procedures
- Procedures: specification of preservation and infrastructure management activities
- Software: development or configuration of preservation systems
- Infrastructure: procurement, monitoring, and management of hardware infrastructure
- Organization: organizational infrastructure including funding, staffing, and strategy
- Action Plan
Thursday, March 17, 2016
Guidelines for the selection of digital heritage for long-term preservation
Guidelines for the selection of digital heritage for long-term preservation. UNESCO/PERSIST Content Task Force. March 2016.
Libraries, archives, and museums traditionally have the responsibility of preserving the intellectual and cultural resources produced by society, but this is in jeopardy because of the amount of information created every day in digital form. Digital content is doubling in size every two years. The digital content is also in danger because much of it is ephemeral; it lacks the longevity of physical objects. The challenge of keeping digital content "requires a rethinking of how heritage institutions identify significance and assess value". Institutions must proactively identify and preserve digital heritage and information before it is lost. The roles of libraries, archives, and museums are blurring in the digital age, but they still have a major interest in preserving heritage.
Libraries face the challenge of selecting digital content for long-term preservation. Many focus on short-term use of content already in their collections, rather than assessing new publications for acquisition. Archives have traditionally "relied on the passage of time between their creation and their acquisition by an archive to lend historical perspective in making selection decisions". However, the time frame for selecting content is shorter now, since the rapid obsolescence of digital formats, storage media, and system hardware and software narrows the window of opportunity for selection. Some strategies for selecting digital content:
Acting locally 1: Strategies for collecting digital heritage.
- Comprehensive collecting: acquiring all of the material produced on a given subject area, time period, or geographic region.
- Representative sampling: capturing a representative picture, which makes selection and preservation more manageable and less resource-intensive.
- Selecting material for addition to the collection based on specific criteria, such as subject/topic, creator/provenance, and type/format.
- Deferring selection: capturing all the digital heritage material now and applying selection criteria later.
Acting locally 2: Developing selection criteria for a single institution
How should institutions select, identify, and prioritize digital heritage before it is lost? Evaluating and assessing digital content should be based on the principles that underlie traditional selection, but include a long-term perspective on use and access as defined by the institution's mandate and users.
Decision tree for selection in an individual institution:
- Identification. Identify the material to be acquired or evaluated.
- Legal framework. Does the institution have a legal obligation to preserve the material?
- Application of three selection criteria to determine if content should be preserved: significance, sustainability, and availability.
- Decision. Make a decision based on the three items and then document the rationale and justification for the evaluation or decision.
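The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` class, the `decide` function, and the rule that a legal obligation settles the matter while otherwise all three criteria must be met are hypothetical simplifications, not the wording of the guidelines.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str                   # step 1: identification
    legal_obligation: bool       # step 2: legal framework
    meets_significance: bool     # step 3: the three selection criteria
    meets_sustainability: bool
    meets_availability: bool

def decide(item):
    """Return (preserve, rationale) so the decision is documented (step 4)."""
    if item.legal_obligation:
        return True, f"{item.title}: legal obligation to preserve"
    criteria = {
        "significance": item.meets_significance,
        "sustainability": item.meets_sustainability,
        "availability": item.meets_availability,
    }
    # Assumption: every criterion must be met when there is no legal obligation.
    failed = [name for name, ok in criteria.items() if not ok]
    if failed:
        return False, f"{item.title}: not selected, fails " + ", ".join(failed)
    return True, f"{item.title}: selected, meets all three criteria"

# Made-up example item.
preserve, rationale = decide(
    Candidate("Town council website", False, True, True, False)
)
print(preserve, rationale)
```

Returning the rationale together with the decision mirrors the last step, where the justification is recorded along with the outcome.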
Appendix 1: Management of long-term digital preservation and metadata. If the digital heritage is the “content”, then the metadata provides the “context”.
"Selection of digital heritage is closely connected with issues related to long-term preservation and access. Some losses of important digital heritage may be unavoidable, but the risk can be mitigated by following best practices in digital preservation, including redundancy, active management, and metadata management."
Three key types of metadata crucial to long-term preservation:
- Structural (required for the technical capacity to read digital content)
- Descriptive (containing bibliographic, archival, or museum contextual information, which can be system-generated or created by heritage professionals, content creators, and/or users)
- Administrative (documenting the management of a digital object while in its collection).
Five basic functional requirements for digital metadata:
- Identification: The metadata must identify each digital object uniquely and unambiguously.
- Location: The metadata must allow each digital object to be located and retrieved.
- Description: A description of the digital object as well as data about the content and the context.
- Readability: Metadata about the structure, format, and encoding of digital objects.
- Rights management: Rights and conditions of use and restrictions must be recorded.
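To make the five requirements concrete, the following Python sketch checks whether a metadata record has a value for each of them. The field names (`identifier`, `location`, `description`, `format`, `rights`) and the sample record are assumptions chosen for illustration; the guidelines do not prescribe a schema.

```python
# Hypothetical mapping from each functional requirement to one record field.
REQUIRED = {
    "identification": "identifier",
    "location": "location",
    "description": "description",
    "readability": "format",
    "rights management": "rights",
}

def missing_requirements(record):
    """List the functional requirements a metadata record does not yet satisfy."""
    return [req for req, field in REQUIRED.items() if not record.get(field)]

# Made-up record with an empty rights statement.
record = {
    "identifier": "obj-0001",
    "location": "store/2016/obj-0001.tif",
    "description": "Scanned map, 1890",
    "format": "image/tiff",
    "rights": "",
}
print(missing_requirements(record))
```

A real check would also need to confirm that identifiers are unique across the collection, which a per-record test like this cannot do.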
Wednesday, March 16, 2016
File identification ...let's talk about the workflows
File identification ...let's talk about the workflows. Jenny Mitcham. Digital Archiving at the University of York. 27 November 2015.
When adding files to a digital archive, an important question is "What file formats have we got here?" Knowing this can help to:
- determine the right software to open the file and view the contents
- start the conversation with the data provider about what formats are best to use for archiving
- discuss the risks of the format and define a migration pathway for preservation and/or access
Some workflow questions to consider:
- What should happen if ingested data can't be identified?
- Should the curator/digital archivist be able to override file identifications?
- What should happen if there is more than one possible identification for a file?
- Is there a sustainable manual identification process if tools cannot identify a file?
- How can information be contributed to file format registries such as PRONOM?
- Is the digital preservation system configurable enough to resolve these questions?
[Our Rosetta system has a format library that handles these questions, as well as a user driven Format Working Group that helps resolve questions and interacts with PRONOM if there are questions, changes or new additions. - Chris]
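The workflow questions above (no match, several matches, curator override) can be illustrated with a toy Python sketch. It is not how PRONOM-based tools or Rosetta work: the tiny signature table, the `identify` and `triage` functions, and the status labels are all hypothetical. The byte signatures themselves are the well-known leading bytes of PDF, PNG, and ZIP files, and ZIP-based office documents share the ZIP signature, which is why they produce more than one match here.

```python
# Toy signature table: (leading bytes, format name). Illustration only.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"PK\x03\x04", "ZIP"),
    (b"PK\x03\x04", "OOXML (ZIP-based)"),
]

def identify(data):
    """Return every format whose signature matches the start of the data."""
    return [name for magic, name in SIGNATURES if data.startswith(magic)]

def triage(data, override=None):
    """Decide what the workflow should do with a file's identification."""
    if override:
        # A curator has supplied an identification by hand.
        return "overridden", [override]
    matches = identify(data)
    if not matches:
        return "manual-review", []    # tools could not identify the file
    if len(matches) > 1:
        return "ambiguous", matches   # more than one possible identification
    return "identified", matches

print(triage(b"%PDF-1.4 example"))
print(triage(b"PK\x03\x04 example"))
print(triage(b"\x00\x01\x02"))
```

Each status corresponds to one of the questions: what to do with unidentified files, how to settle multiple identifications, and whether a person may override the tool.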
Tuesday, March 15, 2016
The Digital Preservation Network (DPN) Has Launched and Is Accepting Content
The Digital Preservation Network (DPN) Has Launched and Is Accepting Content. Mary Molinaro. D-Lib Magazine. March/April 2016.
Several years ago a group of academic leaders examined the risk to future scholars if the digital output from academia is not properly preserved, and felt that the risk of loss was very high if nothing was done to protect against natural disasters, technological failure, or institutional failure. They pledged to create a large-scale digital preservation service that is built to last beyond the life spans of individuals, technological systems, and organizations. After three years of work, the resulting Digital Preservation Network is open and is accepting content from members. Five preservation repositories make up the DPN network. They have varying technical architectures; they replicate content and perform services to safeguard it. Content from member institutions can be added to DPN through two sites: DuraCloud Vault and the Academic Preservation Trust. The deposited content is replicated to the other nodes (HathiTrust, the Texas Preservation Node, and the Stanford Digital Repository).
DPN operates as an independent organization under the umbrella of Internet2 and is currently examining ways to open up DPN to other kinds of members. More information is available at the DPN website.
Monday, March 14, 2016
Archives and SharePoint
Archives and SharePoint. Heather Emily Roberts. HerArchivist. March 8, 2016.
This post asks, "Is SharePoint (or other flexible cloud-based ERMS software) suitable for digital repositories of archives?" Some pros and cons of using SharePoint as a digital repository:
Pros:
- Can lock documents against editing
- Tells you when documents were last accessed and by whom
Cons:
- Will not serve long-term needs of accessibility or use of records
- Will not support migration requirements of archival records
- Will not guarantee integrity of archival records during software updates
- Does not conform to OAIS model
- Archive preservation practices are not standard
[We use our harvest tool to import permanent SharePoint records into our Rosetta system - Chris.]
Digital Preservation - Knowing where to start
Digital Preservation - Knowing where to start. Nik Stanbridge. Cloud Computing Intelligence. 23 February 2016.
Memory institutions face increasing demands on their collections, such as the need to manage costs better, provide access, or deal with the degradation of objects, all of which call for digital preservation. Some organizations already have a strategy and are digitally preserving their assets. Many, though, are only just starting to think about digital preservation and need to know where to start and how to implement it.
Digital preservation is the process of managing and storing digital files and associated metadata in a way that they will be accessible and usable in the future. The processes apply both to objects that were originally created in digital form and to those that have been digitized. If you have a need to maintain digital objects, then the first step is to define a strategy: understand what needs to be preserved and how. This includes information about the digital object. "It’s important to remember that digital preservation is as much about preserving the meaning and context of the asset as it is about preserving the asset itself."
- File format preservation is the process of maximising the accessibility of the file through its repeated migration to any number of more stable or current file formats.
- Data archiving is the process of storing all of the resulting digital assets for the long term, using active archiving principles and processes.
- A preservation strategy will also need to cover the people and processes you are going to use, the quality of the digital assets to preserve and the IT infrastructure and associated support.
Saturday, March 12, 2016
Demystifying Digital Preservation for the Audiovisual Archiving Community
Demystifying Digital Preservation for the Audiovisual Archiving Community. Kathryn Gronsbell, Abbey Potter. The Signal. February 22, 2016.
"The intersection of digital preservation and audiovisual archiving has reached a tipping point." Over the past decade, changes in media production, use, and preservation strategies, including improvements in digital capture technology, adoption of file-based production workflows, digital distribution technology, and storage solutions, have brought a series of transformations that fundamentally alter dominant theories and practices of moving image preservation and access. The acceptance of digital preservation has been slower in the moving image archiving and preservation community than in other fields. The challenges of preserving audiovisual materials are rarely discussed. Recent proposals for audiovisual preservation include:
- Transition to a stream-based preservation model
- Digital preservation in practice (strategies)
- Discussions on how to preserve (innovation and practical engagement)
- The necessity of multi-disciplinary input for preservation
- Transitioning from a short-term digital preservation project to a long-term program (sustainability)
The AMIA organization hopes to bring together those who have limited resources or haven’t started thinking strategically about digital preservation, in a place where the A/V community can learn without feeling lost in a wave of information. Hopefully this will increase the visibility of the intersection between audiovisual preservation and digital preservation and continue the conversation between these two fields.