Thursday, December 15, 2016

DPN and uploading to DuraCloud Spaces

DPN and uploading to DuraCloud Spaces. Chris Erickson. December 15, 2016.
     For some time we have been uploading preservation content into DuraCloud, which serves as our portal to DPN. DuraCloud can upload files by drag-and-drop, but a better way is the DuraCloud Sync Tool. (The wiki had helpful information on setting this up.) The sync tool can copy files from any number of local folders to a DuraCloud Space, and it can add, update, and delete files. I preferred running the GUI version in one browser window with the DuraCloud account open in another.
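To make the add, update, and delete behavior concrete, here is a minimal sketch in Python of how a one-way sync plan could be computed. It is not the Sync Tool's actual logic. The function names, the use of MD5 checksums for change detection, and the idea of a remote listing as a simple dictionary are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def local_inventory(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): md5_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def plan_sync(local: dict[str, str], remote: dict[str, str]) -> dict[str, list[str]]:
    """Compare local and remote listings (hypothetical format: name -> checksum).

    add:    present locally, missing remotely
    update: present in both, but checksums differ
    delete: present remotely, no longer present locally
    """
    return {
        "add": sorted(k for k in local if k not in remote),
        "update": sorted(k for k in local if k in remote and local[k] != remote[k]),
        "delete": sorted(k for k in remote if k not in local),
    }
```

A real run would build `local` with `local_inventory(Path(...))` for each watched folder and obtain `remote` from the Space's listing. How that listing is fetched is left out here on purpose.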

We have been reviewing all of our long-term collections and assigning Preservation Priorities, Preservation Levels, and the number of Preservation Copies. From this review we decided on three collections to add to DPN, and created a Space (which goes into an Amazon bucket) for each. Each Space will then be processed into DPN:
  1. Our institutional repository, including ETDs, which are now born digital, and research information. From our ScholarsArchive repository.
  2. Historic images that have been scanned; the original content is either fragile or not available. Exported from Rosetta Digital Archive.
  3. University audio files; the original content was converted from media that is at risk. Some from hard drives, others exported from Rosetta Digital Archive.
Some of the files were already in our Rosetta preservation archive, and some were in processing folders ready to be added to Rosetta. They all had metadata files with them. The sync tool worked well for uploading these collections: we configured the Rosetta folders as the source location and the corresponding DuraCloud Space as the target. Initially, the uploading was extremely slow, taking several days to load 200 GB. But DuraCloud support provided a newer, faster version of the sync tool, and we changed to a faster connection. The number of upload threads went from 5 to 26, and we uploaded the next TB in about a day.
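The difference between the two upload runs is easier to see as average throughput. The short Python sketch below does the arithmetic. The post only says "several days" for the first 200 GB, so the value of 3 days used here is an assumption, as are decimal units (1 GB = 1000 MB) and the function name.

```python
SECONDS_PER_DAY = 86_400


def throughput_mb_per_s(size_gb: float, days: float) -> float:
    """Average transfer rate in MB/s, using decimal units (1 GB = 1000 MB)."""
    return size_gb * 1000 / (days * SECONDS_PER_DAY)


# Assumption: "several days" is taken as 3 days for the first 200 GB.
slow = throughput_mb_per_s(200, 3)
# The next TB (1000 GB) took about a day.
fast = throughput_mb_per_s(1000, 1)

print(f"before: {slow:.2f} MB/s, after: {fast:.2f} MB/s, speedup: {fast / slow:.1f}x")
```

One TB in a day works out to roughly 11.6 MB/s, or about 93 Mbit/s sustained. Under the 3-day assumption, the earlier run averaged under 1 MB/s.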

We also had a very informative meeting with DPN and the two other universities in Utah that are DPN members, where Mary and Dave told us that the price per TB was now half the original cost, and that unused space could be carried over to the next year. This will be helpful in planning additional content to add. Instead of replicating our entire archive in DPN, we currently take a hierarchical approach, based on the number and location of copies, along with the priorities and preservation levels.
