Showing posts with label bagit. Show all posts

Thursday, August 17, 2017

DPN: Metadata Considerations for Deposits

Metadata Considerations for Deposits. DPN. August 2017.
     The Digital Preservation Network working groups have provided an overview of the types of metadata to consider while preparing deposits for DPN. Several areas are addressed:
  1. DPN-specific metadata, especially DPN’s BagIt specification, tag directories and bag structure.
  2. DuraCloud-specific metadata. While they do not restrict metadata, they "indicate that local policies should be used to define metadata approaches". Each snapshot contains four DuraCloud-created files: two checksum files (md5 and sha256), a content properties file, and a collection-snapshot file.
  3. Core descriptive metadata records. The DPN Preservation Metadata Standards Working Group examined minimal metadata records from a variety of member institutions to find common metadata schemas. This resulted in a “core record,” or the "minimum level of information needed in order to understand digital assets at a later date," shown in a clear chart.
  4. Significant properties of content. "In order for digital files to be usable and accessible in the long-term, it is important to recognize the importance of significant properties and to ensure that the properties of your digital materials are being documented in some form." They list content types, with examples of common significant properties. 
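
The md5 and sha256 checksums mentioned in item 2 are fixity values that can be recomputed at any time to confirm a file has not changed. The following is a minimal sketch using only the Python standard library; the function name `file_checksums` is a hypothetical helper for illustration and is not part of DPN or DuraCloud tooling.

```python
import hashlib


def file_checksums(path, chunk_size=1024 * 1024):
    """Return the MD5 and SHA-256 hex digests of one file.

    Hypothetical helper for illustration. The file is read in chunks
    so that large preservation masters do not have to fit in memory.
    """
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}
```

Comparing the returned digests with the values recorded at deposit time is one simple way to detect silent corruption.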

Saturday, January 23, 2016

Exactly: A New Tool for Digital File Acquisitions

Exactly: A New Tool for Digital File Acquisitions. AVPreserve News. January 13, 2016.
     A new tool, Exactly, has been developed to help to acquire born digital content from donors and to start establishing provenance and fixity early in the acquisition process. The tool:
  • can remotely and safely transfer any born-digital material from a sender to a recipient 
  • uses the BagIt File Packaging Format
  • supports FTP and network transfers
  • can be integrated into sharing workflows using Dropbox or Google Drive
  • allows metadata templates to be created for the sender to fill out before submission
  • can send email notifications with transfer data and manifests when files have been delivered 

Wednesday, April 15, 2015

Tracking Digital Collections at the Library of Congress, from Donor to Repository

Tracking Digital Collections at the Library of Congress, from Donor to Repository. Mike Ashenfelder. The Signal, Library of Congress. April 13, 2015.
An interesting look at the processing of content by the Library of Congress specialists.
When a collection is first received the contents are reviewed and if digital media devices are found, they are transferred to the digital collections registrar, who then records that the materials were received, including the collection name, collection number, a registration number and any additional notes. The following tasks are performed:
  1. Physical inventory of the storage devices (and photograph of the medium)
  2. Write-protecting, documenting, and transferring the files using the BagIt tool, which packages them as a "bag" consisting of:
    1. a directory containing the file or files (data)
    2. a checksummed manifest of the files in the bag
    3. a “bagit.txt” file
  3. The content is cataloged, described, and inventoried. 
  4. Transfer of the files to the Library’s digital repository for long-term preservation.
If there are difficulties accessing the content, other tools can be used, such as the Forensic Recovery of Evidence Device (FRED), the Forensic Toolkit, or BitCurator. The final step is to shelve the original digital hardware and software for preservation.
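
The three bag components listed in step 2 can be illustrated in a few lines. This is only a sketch using the Python standard library, not the Library of Congress tool itself; `make_bag` is a hypothetical name, and the choice of SHA-256 and of BagIt version 1.0 in the declaration are assumptions made for the example.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir, bag_dir):
    """Copy source_dir into a new bag at bag_dir (hypothetical helper).

    Produces the three parts described above: a data directory,
    a checksummed manifest, and a bagit.txt declaration.
    """
    bag = Path(bag_dir)
    data = bag / "data"
    # copytree creates bag_dir and data/; it raises if data/ already exists.
    shutil.copytree(Path(source_dir), data)

    lines = []
    for path in sorted(p for p in data.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        # Manifest entries use paths relative to the bag root.
        lines.append(f"{digest}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")

    # Assumed declaration values for this sketch.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    return bag
```

For real transfers, an established BagIt implementation is the safer choice, since it also handles tag files and validation rules that this sketch omits.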

Researchers visiting the Library of Congress can access copies of some of the digital collections, but access depends on copyright and the conditions established by the collection donor. There are also technological challenges to serving up records. Access is currently available only onsite. Also, the Library does not have the software or drives to read every file format. Not all researchers require a perfect rendering of the original file. A lot of researchers "are just interested in the information. They don’t care what the file format is. They want the information." For the Library, access to and appraisal of digital collections are ongoing issues.
 

Friday, October 02, 2009

Digital Preservation Matters - 01 October 2009

Media Preservation Survey: A Report. Mike Casey. Indiana University Bloomington. 1 October 2009. [132 p. pdf]

An excellent and very detailed report from Indiana University Bloomington concerning the 560,000 audio and video recordings and reels of film on the campus. The report looks at the characteristics and condition of only one of the many groups of materials and the preservation challenges. This was a ten-month study by a team of archivists. The next step is developing a campuswide preservation plan. These historical "jewels" will be lost if not preserved soon. A few preservation activities exist on campus but they are too small to be effective or are not sustainable. They have 51 different media formats. They have over 180,000 digital files in the collection. "These formats require active preservation services from the moment of creation if their content is to survive."

Redundancy is a key strategy in saving the materials, but only 11% have a copy. "One copy is no copy." Preservation of audio and video objects requires transferring them to digital. Storing AV materials at the correct temperature and relative humidity is the "single most important factor in slowing the physical degradation of audiovisual media." At the current rate it will take 120 years to digitize the AV holdings. There is a very useful chart of Selected High and Medium Risk Formats in their collection.

Among their recommendations:

  • Appoint a campuswide taskforce to advise on preservation
  • Create a centralized media preservation and digitization center for the entire campus
  • Develop special funding to digitize the materials quickly
  • Create an appropriate and centralized physical storage space for the materials
  • Provide archival appraisal and control across campus
  • Develop cataloging services to accelerate research opportunities and improve access
  • Complete a digital preservation repository


Bagit: Transferring Content for Digital Preservation. Library of Congress. September 29, 2009.

Short video on YouTube about BagIt, a tool from The Library of Congress, California Digital Library and Stanford University. They have developed guidelines for creating, moving and verifying standardized digital containers, called "bags." BagIt requires a bag declaration, list of contents, and the actual content.
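
Verifying a bag amounts to recomputing each payload checksum and comparing it with the list of contents. The sketch below assumes a manifest named `manifest-<algorithm>.txt` whose lines hold a checksum followed by a path relative to the bag root; `verify_bag` is a hypothetical helper, not a function of the BagIt tools shown in the video.

```python
import hashlib
from pathlib import Path


def verify_bag(bag_dir, algorithm="sha256"):
    """Return the manifest paths that are missing or fail their checksum.

    Hypothetical helper for illustration. An empty list means every
    file named in the manifest matched its recorded digest.
    """
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    problems = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each entry is "<checksum><whitespace><relative path>".
        expected, rel_path = line.split(None, 1)
        target = bag / rel_path
        if not target.is_file():
            problems.append(rel_path)
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            problems.append(rel_path)
    return problems
```

This checks only the files the manifest names; it does not flag extra files in the data directory or validate the bag declaration.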



Archiving Is For E-discovery; Backup Is For Recovery. Mathew Lodge. The Metropolitan Corporate Counsel. September 01, 2009.

There are challenges with court requests for the discovery of information from backup tapes. If backup tapes are used for information retrieval, then they are accessible for e-discovery, but they were never designed for this. Many organizations are doing archiving, but the term means different things to different people. "Active archiving is different: it's a way of centrally managing the storage, retention and hold of information while ensuring 'live' (or active) access to any item." It means moving objects to a central repository and providing access to users.



Purple Cows and Fringy Propositions. Carol Minton Morris. D-Lib Magazine. September/October 2009.

Notes from the Fringe Festival. "At every stage of the Bodleian Library's development, Oxford changed practices and policies, and improved – first analog and later digital – technologies in response to changes in the world beyond the Library. Realization is still a catalyst for change." The most useful metaphor for the repository is the internet. Institutions like Oxford create institutional repositories as components of larger library service platforms, not stand-alone silos. Clifford Lynch said we may be better off with incremental, structured assessments rather than open-ended preservation commitments. We should aim at preserving digital objects "for the next 20 years with subsequent assessments instead of aiming to preserve them forever." Repositories should look at collecting works from scholars at the end of their careers to create a legacy. Repositories will move beyond educational organizations, so we should look at being involved in that.

Thursday, August 13, 2009

FW: Digital Preservation Matters - 13 August 2009

File Information Tool Set (FITS). August 6, 2009.

With the increase of digital projects that introduce new formats, it is increasingly important to have tools for file format identification, validation and metadata extraction. FITS, developed by Harvard, acts as a wrapper for some existing tools, including JHOVE, Exiftool, the National Library of New Zealand Metadata Extractor, DROID, and Ffident, plus two original tools: FileInfo and XmlMetadata. The tool can identify a file with a single result or, in the case of a conflict between the wrapped tools, can handle it in several ways. It is written in Java and can be run from a command line or an interface. It is available for download and has a user guide.


Research Data Preservation and Access: The Views of Researchers. Neil Beagrie, et al. Ariadne. July 2009.

Data is becoming more central to interdisciplinary projects and has grown in size and complexity. This study tries to assess the feasibility and costs of developing and maintaining a shared digital research data service. It shows, with text and graphs, the disciplines where research data issues were of greatest concern, the storage features that are needed most, the retention period for data once the projects have ended, and how the data is shared. University managers have serious concerns about the cost, scalability and sustainability of purely local solutions.


Library of Congress Digital Preservation Newsletter. August 2009.

LC has developed new tools (including BagIt) to transfer large quantities of digital content. BagIt and related transfer tools prepare data for transfer by packaging the collection in a directory with a manifest file that lists the contents. Specifications and other tools are on the tools and services page. More on this: 21st Century Shipping. D-Lib Magazine. Michael Ashenfelder. July/August 2009.

The California Digital Library has opened its Web Archiving Service collections. The service was created to support the Web-at-Risk project, and is funded by the NDIIPP and the University of California.

A workshop on photometadata aimed to help digital photographers use metadata when creating and distributing their work. The program demonstrated applications to embed metadata in photographs; it was stated that each digital photo can and should contain information about itself, its creator and its licensing conditions. Industry professionals told how metadata increased their business.


Online textbooks are gaining popularity, changing how students study. Dani Martinson. Missourian. August 6, 2009.

Online textbooks can provide additional information and resources for students, including direct links to audio and video. Digital textbooks are usually 50% cheaper than regular textbooks, though there is no buyback, and the books are often available only for a semester. Information can be updated more easily and more frequently. A study found that professors were more accepting of digital textbooks than students. They expect that demand will increase when the content is specifically designed for digital use, rather than being just a PDF version of the printed textbook.