Tuesday, November 20, 2018

Audiovisual Metadata Platform Planning Project: Progress Report and Next Steps

Audiovisual Metadata Platform (AMP) Planning Project: Progress Report and Next Steps. Jon W. Dunn, et al. Indiana University. March 28, 2018.
This report covers a workshop held as part of a planning project for the design and development of an audiovisual metadata platform. "The platform will perform mass description of audiovisual content utilizing automated mechanisms linked together with human labor in a recursive and reflexive workflow to generate and manage metadata at scale for libraries and archives."

Libraries and archives hold massive collections of audiovisual recordings from a diverse range of timeframes, cultures, and contexts that are of great interest across many disciplines and communities. Galleries, libraries, archives, and museums (GLAM) struggle to provide access to their audiovisual collections because of high costs, the difficulty of managing the objects, and the lack of sufficiently granular metadata for audio/video content to support discovery, identification, and use. Text materials can use full-text indexing to provide some degree of discovery, but "without metadata detailing the content of the dynamic files, audiovisual materials cannot be located, used, and ultimately, understood". Metadata generation for audiovisual recordings relies almost entirely on manual description performed by experts in a variety of ways. The AMP will need to process audio and video files to extract metadata, and also to accept and incorporate metadata from supplementary documents. One major challenge is processing and moving large files, which is costly in both time and bandwidth.

The report goes into depth on the AMP business requirements, some of which are:
  • Automate analysis of audiovisual content and human-generated metadata in a variety of formats to efficiently generate a rich set of searchable, textual attributes
  • Offer streamlined metadata creation by leveraging multiple, integrated, best-of-breed software tools in a single workflow
  • Produce and format metadata with minimal errors 
  • Build a community of developers in the cultural heritage community who can develop and support AMP on an ongoing basis 
  • Scale to efficiently process multi-terabyte batches of content 
  • Support collaborative efforts with similar initiatives
The following formats are possible sources for AMP processing:
  • Audio (.mp3, .wav) 
  • Image (.eps, .jpg, .pdf, .png, .tif) 
  • Data (.xlsx, .csv, .ttl, .json) 
  • Presentation (.key, .pptx) 
  • Video (.mov, .mp4, .mkv, .mts, .mxf) 
  • Structured text (.xml, with or without defined schemas, such as TEI, MODS, EAD, MARCXML) 
  • Unstructured text (.txt, .docx)
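
As a rough illustration of how an ingest step might sort incoming files into the source categories listed above, here is a minimal Python sketch. It is not taken from the report: the category keys and the function names `classify_source` and `group_by_category` are hypothetical, and only the extensions come from the list above. A real system would probably inspect file contents rather than trust extensions alone.

```python
from pathlib import Path

# Extensions are taken from the list above; the category keys are
# illustrative names, not identifiers defined by the AMP report.
FORMAT_CATEGORIES = {
    "audio": {".mp3", ".wav"},
    "image": {".eps", ".jpg", ".pdf", ".png", ".tif"},
    "data": {".xlsx", ".csv", ".ttl", ".json"},
    "presentation": {".key", ".pptx"},
    "video": {".mov", ".mp4", ".mkv", ".mts", ".mxf"},
    "structured_text": {".xml"},
    "unstructured_text": {".txt", ".docx"},
}

# Invert the mapping once so each lookup is a single dictionary access.
EXTENSION_TO_CATEGORY = {
    ext: category
    for category, extensions in FORMAT_CATEGORIES.items()
    for ext in extensions
}


def classify_source(filename):
    """Return the category for a filename, or None if the extension is not listed."""
    return EXTENSION_TO_CATEGORY.get(Path(filename).suffix.lower())


def group_by_category(filenames):
    """Group filenames by category; unlisted extensions go under 'unsupported'."""
    groups = {}
    for name in filenames:
        category = classify_source(name) or "unsupported"
        groups.setdefault(category, []).append(name)
    return groups
```

Calling `group_by_category` on a batch of filenames would let a workflow route audio and video to analysis tools while passing spreadsheets or XML finding aids to a metadata import step, under the assumptions stated above.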
The report continues by looking at the Proposed System Architecture, functional requirements, and workflows.
Outcome: "The AMP workshop successfully gathered together a group of experts to talk about what would be needed to perform mass description of audiovisual content utilizing automated mechanisms linked together with human labor in a recursive and reflexive workflow to generate and manage metadata at scale for libraries and archives. The workshop generated technical details regarding the software and computational components needed and ideas for tools to use and workflows to implement to make this platform a reality."
