Thursday, April 30, 2015

Why Media Preservation Can’t Wait: the Gathering Storm

Why Media Preservation Can’t Wait: the Gathering Storm. Mike Casey. International Association of Sound and Audiovisual Archives, Journal. January 2015. [Slide presentation]

Media preservation has reached a crisis point for content on physical audio and video formats. Archival media collections could soon be considered highly endangered. The US National Recording Preservation Board states: “It is alarming to realize that nearly all recorded sound is in peril of disappearing or becoming inaccessible within a few generations.” There is a major risk that obsolescence will defeat the efforts of archivists. What is the problem?
  • Large numbers of analog and physical digital recordings
  • Recordings are degrading, some catastrophically
    • For some formats degradation issues are critical
    • Degradation of physical recordings must be addressed before digitization
  • Obsolete audio and video formats
    • All analog and physical digital recordings are now obsolete
    • Playback systems are failing; parts are lacking, and repairs are becoming more difficult.
    • Without functioning systems, digitizing existing recordings is not possible
    • Evolution of obsolescence:
      • End of manufacturing
      • End of availability in the commercial marketplace
      • End of bench technician expertise
      • End of bench technician tools
      • End of calibration and alignment tapes
      • End of parts and supplies
      • End of availability in the used marketplace
      • End of playback expertise
  • There is a relatively short time window to save these recordings
  • The recordings contain content with high research value
The combination of degradation and obsolescence severely undermines preservation efforts. It may still be possible in a few years to digitize audio and video, but digitizing large holdings then may not be affordable. While not every recording is an appropriate candidate for long-term preservation, many recordings and collections do carry significant value. If these items are to survive they must be digitally preserved “within the next 15 to 20 years - before sound carrier degradation and the challenges of acquiring and maintaining playback equipment make the success of these efforts too expensive or unattainable.” Some institutions are digitizing their recordings now, realizing that they cannot afford to wait until planning is completed or everything is perfectly in place to begin work.

Wednesday, April 29, 2015

Legal Aspects for Digital Preservation Domain

Legal Aspects for Digital Preservation Domain. Barbara Kolany-Raiser, Marzieh Bakhshandeh, José Borbinha, Silviya Yankova. iPres Proceedings. 2014. 
This paper proposes a legal model for the digital preservation domain. This is intended to include different perspectives and facilitate the translation and mapping of legal information in the digital preservation area. A legal perspective is important for technology developments, and when copyright protected data has to be preserved digitally, care must be taken so that the digital preservation system processes do not violate this right. The rights holder for the data must explicitly grant use and preservation rights. "Every digital preservation activity must ensure the authenticity and legitimacy of the performed actions and processes." The paper recommends integrating legal perspectives into the digital preservation process, and it includes a conceptual map of this legal perspective describing the concepts and the relationships.

Tuesday, April 28, 2015

Database Preservation Toolkit

Database Preservation Toolkit. Website. April 2015.
The Database Preservation Toolkit uses input and output modules and allows conversion between database formats, including connection to live systems. It allows conversion of live or backed-up databases into preservation formats such as DBML, SIARD, or XML-based formats created for the purpose of database preservation.

This toolkit was part of the RODA project and now has been released as a separate project. The site includes download links and related publications and presentations.

Preserving digital records and databases

Preserving digital records and databases. Luis Faria. PASIG Presentation. March 13, 2015.
Presentation on tools and models for database preservation. It includes a diagram of the import and export flow using db-preservation-toolkit, as well as their model and the OAIS model. Throughput of row-intensive databases is 10,000 rows/s. Use the SIARD format for preservation. The SIARD-E version is under development.
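To put the quoted throughput in perspective, here is a rough back-of-the-envelope sketch of how long a conversion might take. It assumes the 10,000 rows/s figure holds constant for the whole run, which the presentation does not claim; the function names and the 250-million-row example are hypothetical.

```python
def estimated_seconds(row_count: int, rows_per_second: float = 10_000.0) -> float:
    """Estimate conversion time, assuming a constant throughput (an assumption)."""
    if rows_per_second <= 0:
        raise ValueError("rows_per_second must be positive")
    return row_count / rows_per_second


def format_duration(seconds: float) -> str:
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(int(round(seconds)), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    # Hypothetical database with 250 million rows.
    print(format_duration(estimated_seconds(250_000_000)))  # 6h 56m 40s
```

Real throughput will vary with column widths, binary objects, and network latency to a live system, so this is only an order-of-magnitude planning aid.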

Monday, April 27, 2015

Finnish Digital Preservation Service for Cultural Heritage

Finnish Digital Preservation Service for Cultural Heritage. Mikko Tiainen. PASIG Presentation. March 12, 2015. [PDF]
Preservation aspects and focus of the Digital Preservation Service:
  • Semantic preservation:
    • Content knowledge and semantics
    • Descriptive metadata
  • Logical preservation
    • Preservation planning
    • Administrative & technical metadata
    • File formats
    • Preservation actions
  • Bit-level preservation
    • Materials & replication management
    • Storage device
    • Storage media
They estimate the life cycle of the various components as:
  • Hardware
    • Hard disk storage: 5 years
    • Tape drives & media types: 5 years
    • Tape libraries: 10 years
  • Software
    • Commercial support for at least 5 years
    • Open source maintained and developed until replaced 
Development of a high quality digital preservation system is a continuous process. 
  • Bit-level preservation
  • Preservation planning and development of preservation actions
  • Preserving the intelligibility
  • Distributed locations
This requires support services, maintenance of specifications, and ongoing management.

Chronopolis and DuraCloud: Doing integration right

Chronopolis and DuraCloud: Doing integration right. Bill Branan, David Minor. PASIG Presentation. March 12, 2015. [PDF]
DuraCloud is a hosted digital preservation service. Chronopolis is a digital preservation storage network spanning multiple institutions and geographic regions, based on active preservation (constant checking of items). The reasons for integrating the services and becoming a DPN node:
  • Digital content preservation is important to the future of society
  • All preserved digital content should be handled equally
  • Need an economically viable option to support the preservation needs of all institutions, regardless of size or technical capability
  • Need to simplify the preservation process as much as possible
These are two very independent existing systems with different workflows and processes. DuraCloud works with real-time data, and Chronopolis works with well defined data collections. Sometimes the best way to integrate two systems is to not require either system to know anything about the other.

Saturday, April 25, 2015

MoMA’s Digital Art Vault

MoMA’s Digital Art Vault. Ben Fino-Radin. The Museum of Modern Art. April 14, 2015.
The museum is working to digitize and preserve its collection of 4,000 videotapes of analog video art. It has created a digital vault which consists of:
  1. the packager: analyzes all digital collections materials as they arrive; records the results in an obsolescence-proof text AIP stored with the materials themselves, and generates a checksum.
  2. the warehouse: a digital storage RAID system maintained by their IT department. This type of disk-based storage becomes "an untenable expense" with very large amounts of data. They project 1.2PB. It would be irresponsibly expensive to continue using hard drive storage, as it was not quite intended for this scale of data.
  3. the indexer, which is not discussed at present.
There is the issue of how to ensure that our successors will understand what a given stream of bits is supposed to represent, and there is also the problem of authenticity.  These archival packages contain the digital content as well as the information needed in the future to understand what the materials are and to confirm their authenticity.
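The packager's checksum step described above could be sketched as follows. This is only an illustration: the article does not say which hash algorithm or record layout MoMA uses, so SHA-256, the one-line text record, and the function names `sha256_of`, `write_fixity_record`, and `verify_fixity` are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_fixity_record(material: Path, record: Path) -> str:
    """Write a plain-text line '<digest>  <filename>' for the material."""
    checksum = sha256_of(material)
    record.write_text(f"{checksum}  {material.name}\n", encoding="utf-8")
    return checksum


def verify_fixity(material: Path, record: Path) -> bool:
    """Recompute the digest and compare it with the stored one."""
    expected = record.read_text(encoding="utf-8").split()[0]
    return sha256_of(material) == expected
```

Keeping the record as plain text alongside the material means it can still be read without any special software, which is the point of an "obsolescence-proof" description.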

Friday, April 24, 2015

PREFORMA Starts Prototyping Phase

PREFORMA Starts Prototyping Phase. OPF Blog. 22 April 2015.
The PREFORMA prototyping phase has started with three groups that will work on conformance checkers for:
  1. the PDF/A standard for documents;
  2. the TIFF standard for digital still images; and
  3. a set of open source standards for moving images.
This phase will last until December 2016. It is important that libraries and archives understand what is in the digital objects they are preserving.  These tools will increase the knowledge about these formats.

Format Migrations at Harvard Library: An NDSR Project Update

Format Migrations at Harvard Library: An NDSR Project Update. Joey Heinen. The Signal. April 17, 2015.
Digital material is just as susceptible to obsolescence as analog formats. There are different strategies that can be implemented, and they are developing a format migration framework for migration projects at Harvard. The viability of this framework will be tested by migrating three obsolete formats within the Digital Repository Service: Kodak PhotoCD, SMIL playlists and RealAudio. A first step is to determine the stakeholders and responsible parties, since a digital preservation project cannot begin without knowing the stakeholders.

A diagram of the Migration Workflow shows each step of the process from gathering documentation for initial analysis to ingest of the migrated content into the repository. A Migration Pathway diagram shows how content will be transformed by a migration. They hope that analyzing the technical and infrastructural challenges of each format and putting this into a template that can be adapted will help the digital preservation field.

Thursday, April 23, 2015

WebPreserver - Collect Web & Social Media as Legally Admissible Evidence

WebPreserver - Collect Web & Social Media as Legally Admissible Evidence. Website. Apr 08, 2015.
The software package, WebPreserver, uses a Chrome web browser plugin and a web-based platform to make it easy to collect authenticated snapshots of websites, blogs and social media accounts such as Facebook, Twitter and Google+. The screen captures, source code and metadata are authenticated with a 256-bit digital signature and time stamp to comply with the Federal Rules of Evidence and other regulatory requirements. Users can organize, tag, collaborate, search, print on demand or download the files as PDFs or WARC files.

Wednesday, April 22, 2015

Because digital preservation won’t just go away

Jisc Archivematica project update ...because digital preservation won’t just go away. Jenny Mitcham. Digital Archiving at the University of York blog. 17 April 2015.
The project is investigating research data management, recognizing that the tools used by digital archivists could have much to offer those who are charged with managing research data. The research will be based around the following questions:
  1. Why are we bothering to 'preserve' research data? What are the drivers here and what are the risks if we don't?
  2. What are the characteristics of research data?
  3. How might research data differ from other born digital data that institutions are archiving and preserving?
  4. What types of files are researchers producing?
  5. How would we incorporate the system into a wider technical infrastructure for research data management and what workflows would we put in place?
Previously they had conducted a survey that looked specifically at software packages used by researchers. They are investigating how existing digital preservation tools would handle these types of data and whether these formats appear in PRONOM.

University of Sheffield Selects Ex Libris Rosetta

University of Sheffield Selects Ex Libris Rosetta. Press Release. April 21, 2015.
The University has selected the Rosetta digital asset management and preservation solution to "ensure the sustainable preservation of the digitised objects created by the University Library’s Special Collections Department and the National Fairground Archive—a unique collection of videos, texts, audio files, and pictures about the culture and history of travelling fairs and entertainment—as well as the large collection of University’s born-digital material which includes scholarly research data, administrative records, and departmental publications."

“In line with the University’s strategy to establish an enduring digital archive, Rosetta will enable us to develop a sustainable digital preservation programme, underpinned by a full lifecycle infrastructure for the management and preservation of digital objects.”

Saturday, April 18, 2015

Digital Curation and Doctoral Research: Current Practice

Digital Curation and Doctoral Research: Current Practice. Daisy Abbott. International Journal of Digital Curation. 10 February 2015. [PDF]
More doctoral students are engaging in research data creation, processing, use, management, and preservation activities (digital curation) than ever before. Digital curation is an intrinsic part of the skills that students are expected to acquire.

Training in research skills and techniques is the key element in the development of a research student. The integration of digital curation into expected research skills is essential. Doctoral supervisors "should discuss and review research data management annually, addressing issues of the capture, management, integrity, confidentiality, security, selection, preservation and disposal, commercialization, costs, sharing and publication of research data and the production of descriptive metadata to aid discovery and re-use when relevant." Those supervisors may not necessarily have those skills themselves. And there is a gap in the literature about why and how to manage, curate, and preserve digital data as part of a PhD program.

While both doctoral students and supervisors can benefit from traditional resources on the topic, the majority of guidance on digital curation takes the form of online resources and training programs. In a survey,
  • over 50% of PhD holders consider long-term preservation to be extremely important. 
  • under 40% of students consider long-term preservation to be extremely important.
  • 90% of doctoral students and supervisors consider digital curation to be moderately to extremely important. 
  • Yet 74% of respondents stated that they had limited or no skills in digital curation and only 10% stated that they were “fairly skilled” or “expert”. 
And generally researchers were not aware of the digital curation support services that are available. The relatively recent emphasis on digital curation in research presents problems for supervisors. Developing the appropriate skills and knowledge to create, access, use, manage, store and preserve data should therefore be considered an important part of any researcher’s development. Efforts should be taken to
  • Ensure practical digital curation is understood
  • Encourage responsibility for digital curation activities in institutional support structures
  • Increase the discoverability and availability of digital curation support services

Tech talk in the archives: how can we redefine our processes & priorities in the digital age?

Tech talk in the archives: how can we redefine our processes and priorities in the digital age? Erin O’Meara. PASIG Presentation. March 12, 2015. [PDF]
Tech talk for archives usually revolves around workflow, data clean up and translation, infrastructure needs, and digital content pathways. Traditional archives have a physical focus, work is long term and ongoing, and priorities are established based on perceived use. For digital archives, the focus is on the digital objects, which tends to be more of a project based approach that needs dependency analysis before work is done. Prioritization is based on a grid of preservation needs, use and access, and an impact on the larger repository.

Get to know your complete digital holdings (servers as well as boxes of disks); the infrastructure, your capabilities for acquiring, processing and preserving digital holdings; staff and their training needs; and the gaps in all of this. Your processes need to change from "as-is" to "to-be".
  • Gain momentum and resources "building the ship while flying"
  • Get a business analysis and outsider perspective
  • Manage processes and know when to change
  • Continue building, and allow the archive to grow and mature
  • Allow time to review and reflect
  • Clarify and integrate roles between archive and tech staff

In determining your directions:
  • State the larger goal, then document it
  • Break it down into executable chunks
  • Engage and educate stakeholders and leadership
  • Talk to colleagues outside your workplace
  • Encourage curiosity and tinkering
  • Give your team a break and recognition

Friday, April 17, 2015

Trustworthiness of Preservation Systems

Trustworthiness of Preservation Systems. David Minor. PASIG Presentation. March 11, 2015. [PDF]
We all want to trust systems, especially preservation systems. Trust is an iterative process to verify and clarify. The principles of trust include:
  •  Institutional commitment to collections
  •  Infrastructure demands
  •  Technical system and staffing capabilities
  •  Sustainability (particularly funding, technology, collaboration)
  •  Identify and communicate risks to content, examining “what if” questions

There are three levels of auditing:
  •  "Basic certification" is a simple self assessment
  •  "Extended certification" represents a plausibility checked assessment
  •  "Formal certification" is an audit driven by external experts

Major auditing frameworks include:
  •  Data Seal of Approval (Basic)
  •  nestor (Extended)
  •  TRAC/ISO 16363 (Formal)
  •  DRAMBORA (Range)

DRAMBORA works through six stages:
  1.  Identify organizational context
  2.  Document policy and regulatory framework
  3.  Identify activities, assets, and their owners
  4.  Identify risks
  5.  Assess risks
  6.  Manage risks
In the future, we need to know how these audit frameworks apply to distributed digital preservation environments, and how flexible the questions and the audit models are.

Thursday, April 16, 2015

Sony and Memnon announce partnership to enhance digital preservation capabilities

Sony and Memnon announce partnership to enhance digital preservation capabilities. Press release. April 13, 2015.
The partnership is offering their technology and experience in delivering large-scale digital preservation projects involving audio, video and film content. Some of the existing customers include Danish Radio, the British Library, Bibliothèque Nationale de France, Indiana University, BBC Worldwide and Sony Pictures. The need for large-scale digital preservation in organizations has accelerated due to the continuous physical deterioration of media carriers and the need for interoperability, lower storage costs and a stable long-term digital storage format. The research suggests that only 21% of broadcasters have completed digitisation of their tape libraries, and the others average more than 100,000 legacy tapes. “As a result, many content owners have assets that are literally depreciating, yet simultaneously have increased opportunities for reusing and monetising their digital content, once it is made readily accessible.”

“The time to tackle this challenge is undoubtedly now, but any successful digital preservation project is reliant on proven technological and operational expertise. We believe that large-scale digitisation is a distinct discipline that requires industrial processes and methodologies for high efficiency and consistent quality.”

Wednesday, April 15, 2015

Tracking Digital Collections at the Library of Congress, from Donor to Repository

Tracking Digital Collections at the Library of Congress, from Donor to Repository. Mike Ashenfelder. The Signal, Library of Congress. April 13, 2015.
An interesting look at the processing of content by the Library of Congress specialists.
When a collection is first received the contents are reviewed and if digital media devices are found, they are transferred to the digital collections registrar, who then records that the materials were received, including the collection name, collection number, a registration number and any additional notes. The following tasks are performed:
  1. Physical inventory of the storage devices (and photograph of the medium)
  2. Write-protecting, documenting, and transferring the files using the BagIt tool, which packages them as a bag consisting of:
    1. a directory containing the file or files (data)
    2. a checksummed manifest of the files in the bag
    3. a “bagit.txt” file
  3. The content is cataloged, described, and inventoried. 
  4. Transfer of the files to the Library’s digital repository for long-term preservation.
If there are difficulties accessing the content, other tools can be used, such as the Forensic Recovery of Evidence Device (FRED), the Forensic Toolkit, or BitCurator. The final step is to shelve the original digital hardware and software for preservation.
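The three-part bag described in the task list above (a data directory, a checksummed manifest, and a “bagit.txt” file) can be illustrated with a short standard-library sketch. This is not the Library's tooling: the function names `make_bag` and `validate_bag` are hypothetical, and the choice of SHA-256 and of the `BagIt-Version: 1.0` declaration are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_files: list[Path], bag_dir: Path) -> Path:
    """Copy files into bag_dir/data and write a manifest and bagit.txt."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for source in source_files:
        target = data_dir / source.name  # flat layout; names must be unique
        shutil.copy2(source, target)
        digest = hashlib.sha256(target.read_bytes()).hexdigest()
        manifest_lines.append(f"{digest}  data/{source.name}")
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    return bag_dir


def validate_bag(bag_dir: Path) -> bool:
    """Check that every file listed in the manifest exists and matches."""
    manifest = bag_dir / "manifest-sha256.txt"
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        expected, relative_path = line.split(maxsplit=1)
        payload = bag_dir / relative_path
        if not payload.is_file():
            return False
        if hashlib.sha256(payload.read_bytes()).hexdigest() != expected:
            return False
    return True
```

The sketch reads each file fully into memory, which is fine for a demonstration but not for large media files; a production workflow would rely on an established BagIt implementation.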

Researchers visiting the Library of Congress can access copies of some of the digital collections but access depends on copyright and the conditions established by the collection donor. There are also technological challenges to serving up records. Access is currently available only onsite. Also, the Library does not have the software or drives to read every file format. Not all researchers require a perfect rendering of the original file. A lot of researchers “are just interested in the information. They don’t care what the file format is. They want the information.” For the Library, access and appraisal of digital collections is an ongoing issue.

Tuesday, April 14, 2015

Digital curation and quality standards for memory institutions: PREFORMA research project

Digital curation and quality standards for memory institutions: PREFORMA research project. Antonella Fresa, Börje Justrell, Claudio Prandoni. Archival Science. 25 Mar 2015.  [PDF]
Memory institutions are facing increasing volumes of content for long-term preservation. The intention of the PREFORMA project (PREservation FORMAts for culture information/e-archives) is to establish a long-term sustainable ecosystem around a range of practical tools with the stakeholders. According to the recent European Cultural Heritages study,
  • 3 % of institutions studied have a written digital preservation strategy (from 44 % for national libraries to 12–25 % for museums)
  • About a third of the institutions are included in a national preservation strategy
  • 40 % of national libraries say there is no national digital preservation strategy
  • 30 % of institutions are included in a national digital preservation infrastructure
There are barriers for digitization and curation, such as: 
  • cost of digitization
  • high costs of digital preservation, due to the use of separate solutions implemented by each memory institution
  • cultural content is complex
Digital preservation is defined by the Digital Preservation Europe project as “a set of activities required to make sure digital objects can be located, rendered, used and understood in the future”. To be used meaningfully in the future, digital objects should be preserved in a context that makes them usable and understandable for future users. The preservation of digital information is an ongoing action, to be periodically revised, in order to update data sets and metadata formats.

In the area of creation and appraisal of digital objects, there are three identifiable major work areas:

  1. Standardisation of the communication between the producer and the archive;
  2. Development of tools supporting generation and transformation of metadata;
  3. Development of tools for automated or semi-automated appraisal of data.
Important aspects for digital preservation during ingest are:
  1. File format
  2. Authenticity, integrity and provenance data
  3. Completeness of metadata accompanying digital objects
  4. Transformation of objects that may be necessary
The PREFORMA activities intend to give institutions control over the technical properties of preservation files through an open-source conformance checker and by creating an ecosystem around the implementation for specific file formats. The first activity is to develop an open-source toolset for conformance checking of digital files. The second activity is to establish a network of common interest in order to gain control over the technical properties of preservation files.

Preservation of and access to cultural heritage materials participate in the movement towards “unlocking the full value of scientific data”.

Saturday, April 11, 2015

Digital preservation as a service

Digital preservation as a service. Steve Knight. National Library of New Zealand. March 30th, 2015.
Digital Preservation as a Service (DPaaS) is a joint project of the National Library of New Zealand and Archives New Zealand to determine how to best approach digital preservation and leverage the government’s investment to date.

Digital preservation requires interaction with all the organisation’s processes and procedures and institutional support for appropriate resources. It is:
  • the active management of digital content over time to ensure ongoing access
  • a ‘series of managed activities necessary to ensure continued access to digital materials for as long as necessary’ despite ‘the obsolescence of everything’
Digital preservation is not:
  • backup and disaster recovery – these are short term business functions 
  • only about access or ‘open access’ 
  • ‘an afterthought’
We are trying to ensure against loss, against wasted time and money when systems are not built with long term needs in mind. We need a sustainable safekeeping model for digital assets – is a national level digital preservation service the answer? A nation-wide approach will:
  • ensure the long term safekeeping of a greater range of New Zealand’s social, cultural, scientific and economic digital assets
  • leverage investment to date
  • reduce duplicate investment
  • support a strategic response to issues related to data use and re-use 
By working at a national scale, we can provide the digital preservation capability and capacity that’s not currently available.

Friday, April 10, 2015

Cloud storage for preservation

Archiving On-Premise and in the Cloud. Joseph Lampitt, Oracle. PASIG Presentation. March 2015. [PDF]
Cloud Storage is storage accessed over a network via web services APIs. For digital preservation storage, one option is the 3-2-1 Rule (3 copies, 2 media types, 1 offsite).
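As a small illustration of the 3-2-1 Rule, the check below tests whether a set of stored copies meets all three conditions. The `StoredCopy` record, its field names, and the example locations are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredCopy:
    location: str   # free-text label for where the copy lives
    medium: str     # e.g. "disk", "tape", "cloud"
    offsite: bool   # True if held away from the primary site


def satisfies_3_2_1(copies: list[StoredCopy]) -> bool:
    """Return True when there are 3+ copies, 2+ distinct media, 1+ offsite."""
    enough_copies = len(copies) >= 3
    enough_media = len({copy.medium for copy in copies}) >= 2
    has_offsite = any(copy.offsite for copy in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    holdings = [
        StoredCopy("server room", "disk", offsite=False),
        StoredCopy("server room", "tape", offsite=False),
        StoredCopy("cloud provider", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(holdings))  # True
```

A check like this only counts copies; it says nothing about whether each copy is intact, so it would normally sit alongside regular fixity checking.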

Benefits of Cloud Storage
  • Limitless scalability
  • Custom metadata
  • Single namespace
  • Simplified management
Preservation Considerations with Cloud Storage include:
  • System / cloud performance
  • Security
  • Infrastructure and investment
  • Stability and longevity of the solution
  • Descriptive metadata 
  • Fixity and where that happens
  • System security and access control  
  • Audit Event Tracking (e.g. maintaining records of actions associated with an asset)
  • Version control so that originals are unchanged
There are trade-offs between on-site and cloud solutions. The business needs should drive the choice of solutions. It is reported that 90% of an organization's data is passive. The presentation includes charts comparing the cost of cloud storage to on-site storage. “Glacier is almost 10 times as expensive as an on premise tape system with support.”

Thursday, April 09, 2015

Digital Preservation: We Know What it Means Today, But What Does Tomorrow Bring?

Digital Preservation: We Know What it Means Today, But What Does Tomorrow Bring? Randy Kiefer's presentation.  UKSGLive. April 3, 2015.
Long-term preservation refers to processes and procedures required to ensure content remains accessible well into the future. Publishers want to be good stewards of their content. Digital Preservation is an "insurance policy" for e-resources. Commercial hosting platforms and aggregators are not preservation archives! They can remove discontinued content from their system, and commercial businesses may disappear, as with Metapress. There are global digital preservation archives, such as CLOCKSS and Portico, and regional archives, such as National Libraries.
The biggest challenges in the future are formats (especially presentation of content), and what to do with databases, datasets and supplementary materials. "Any format can be preserved, including video. The issue is that of space, cost and presentation (especially if the format is now not in use/supported)." There are legal issues with cloud-based preservation systems. There is no legal precedent with a cloud-based preservation system, and no protection with regards to security.

Monday, April 06, 2015

NYU Libraries to Team with Internet Archive to Preserve High Quality Musical Content on the Web

NYU Libraries to Team with Internet Archive to Preserve High Quality Musical Content on the Web. Christopher James. Press Release. March 27, 2015.
This collaboration is to ensure that the websites of musical composers can be collected, preserved, and made accessible in the future. The project will preserve objects with sound and visual quality at a significantly higher level than current web archiving standards. The project is funded with a grant from The Andrew W. Mellon Foundation. Since master-level recordings are rarely available on the Internet, the music is usually delivered in lower quality compressed formats, such as MP3. A specific aim of the project is to develop protocols for obtaining master-level recordings and integrating them into the archival copies of the websites.

Friday, April 03, 2015

Dutch digital developments

Dutch digital developments.  Digital Preservation Seeds. March 29, 2015.
There are national strategy plans to streamline and intensify initiatives concerning digital heritage and to focus on collaboration between all “cultural heritage organizations” in the Netherlands. Under this collaboration, the big organisations in specific areas would offer services and assistance to their colleagues from smaller organizations. Shared initiatives across types of institutions, such as museums and archives, would also make collected material more visible to the public.
The goals of the working groups are:
  1. Making digital heritage visible. Identify what the public expects from digital heritage and how they want to use it, and how to promote the digital collections to make them more visible.
  2. Making digital heritage usable. Look for ways to improve collections and their findability, and work together with researchers to improve search facilities.
  3. Preserving digital heritage for the long term. The infrastructure for digital preservation needs to be developed, making use of existing experience and facilities.
Achieving these goals will hopefully lead to an integrated approach to improve the access and preservation of our digital heritage. 

Thursday, April 02, 2015

Preservation Policy for Humans

Preservation Policy for Humans. Nick Ruest, Stephen Marks. Presentation, PASIG 2015. March 2015. [PDF].
Some questions to think about when creating the digital preservation policy:
  1. Where should I start?
  2. What do I care about?
  • Identify the aspects that are important and prioritize them.
  • Develop a collection development policy by getting input from your community. Identify your primary and secondary communities and determine what is important to them.
  • Determine what you can preserve, identify the scope of your efforts, and what you have the right to preserve.
  • Identify the size and type of objects that your infrastructure can preserve now, and what it may be able to support in the future.
  • When preserving objects, maintain the integrity, authenticity and usability of the objects over time. It is important for preservation that your actions are consistent with your plan.
The plan should address:
  1. The general approach
  2. The tools available
  3. The methods of applying the tools
With digital preservation there are different levels that can be implemented.
"There is no single right solution."

Digitization Challenges – A Discussion in Progress

Digitization Challenges – A Discussion in Progress. Merrilee Proffitt. Blog. OCLC Research. March 23, 2015.
There are challenges faced by libraries digitizing collections, such as dealing with born digital materials, storage and preservation, web harvesting, and others. Their recent discussion looked at these topics:
  • Metadata: Item-level description vs collection descriptions. The challenge is digitizing archival collections at the item or page level when the descriptions are at a collection level. How can we engage scholars to help with the description if the resources are outside the library?
  • Process management / workflow / shift from projects to programs. There are challenges to establish workflows to meet different needs. Some are transitioning from projects to programs.
  • Selection – prioritizing users over curators and funders. "Many institutions are still operating under a model whereby curators or subject librarians feed the selection pool" even though surveys indicate selection should move towards directly serving the needs of the users.
  • Audio/Visual materials. Making these available is a concern because of differing levels of interest, high costs, reformatting capacities, and the need for accompanying transcriptions.
  • Access: are we putting things where scholars can find them? Are collections discoverable from Google or the institution? What are the users' experiences in using the collection?

Wednesday, April 01, 2015

Mining the Archives: Metadata Development and Implementation

Mining the Archives: Metadata Development and Implementation. Martin White. Ariadne. 13 February 2015.
This is a review of articles in the Ariadne archives on metadata. Michael Day, Metadata Officer at UKOLN, contributed a short paper to Ariadne on the implications of metadata for digital preservation. He set out five important questions which still represent challenges for the profession:
  • Who will define what preservation metadata are needed?
  • Who will decide what needs to be preserved?
  • Who will archive the preserved information?
  • Who will create the metadata?
  • Who will pay for it?
The challenges of metadata development and implementation are substantial. A paper “Application Profiles: Mixing and Matching Metadata Schemas” talks about the roles of those making the metadata standards and those using them:
Both sets of people are intent on describing resources in order to manipulate them in some way. Standard makers are concerned to agree a common approach to ensure inter-working systems and economies of scale. However implementors, although they may want to use standards in part, in addition will want to describe specific aspects of a resource in a “special” way. Although the separation between those involved in standards making and implementation may be considered a false dichotomy, as many individuals involved in the metadata world take part in both activities, it is useful to distinguish the different priorities inherent in the two activities.