Texas Digital Library Digital Preservation Services. Press release. Texas Digital Library, 5 March 2019. [PDF]
The Texas Digital Library now offers Digital Preservation Services to its members. The services help stewards of Texas cultural heritage and scholarship provide long-term access to their content through direct consulting, training, and workflow support, including a combination of technologies suited to each member's content. Content can be stored in multiple geographically dispersed locations, with fixity checking, in Chronopolis and Amazon through the DuraCloud interface.
This blog contains information related to digital preservation, long term access, digital archiving, digital curation, institutional repositories, and digital or electronic records management. These are my notes on what I have read or been working on. I enjoyed learning about Digital Preservation but have since retired and I am no longer updating the blog.
Showing posts with label Amazon Storage. Show all posts
Wednesday, March 06, 2019
Friday, September 02, 2016
TRAC Certified Long-term Digital Preservation: DuraCloud and Chronopolis for Institutional Treasures
TRAC Certified Long-term Digital Preservation: DuraCloud and Chronopolis for Institutional Treasures. Website. 1 September 2016.
"An institution’s identity is often formed by what it saves for current and future access. Digital collections curated by the academy can include research data, images, texts, reports, artworks, books, and historic documents help define an academic institution’s identity."
DuraSpace and the Chronopolis service at the University of California, San Diego have announced the DuraCloud Enterprise Chronopolis subscription plan for digital preservation. It stores digital content in Amazon and in the Chronopolis network. It provides geographic replication and synchronization of content between three storage locations, and offers content integrity monitoring in a dark storage option. Plan options are a combination of Amazon S3, Amazon Glacier, and SDSC.
Pricing and Plan details
DuraCloud Preservation Subscription Fee: $1,175 Storage: $700/TB
DuraCloud Preservation Plus Subscription Fee: $1,175 Storage: $825/TB
DuraCloud Enterprise Subscription Fee: $5,250 Storage: $500/TB
DuraCloud Enterprise Plus Subscription Fee: $5,250 Storage: $625/TB
DuraCloud Enterprise Plus Subscription Fee: $5,550 Storage: $1,200/TB (Option 2)
DuraCloud Enterprise Chronopolis Subscription Fee: $2,750 Storage: $500/TB (Ingest and retrieval fees extra)
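The plan list gives a subscription fee and a per-TB storage rate for each plan. As a rough sketch, and assuming the two are simply added together (the source does not say whether the fee already includes some storage, or what the billing period is), the plans can be compared for a given collection size. The `Plan` class and `estimated_cost` function are hypothetical names used only for illustration, and ingest and retrieval fees are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    subscription_fee: int  # dollars, as listed in the plan details
    storage_per_tb: int    # dollars per TB, as listed in the plan details


# Figures copied from the plan list above.
PLANS = [
    Plan("DuraCloud Preservation", 1175, 700),
    Plan("DuraCloud Preservation Plus", 1175, 825),
    Plan("DuraCloud Enterprise", 5250, 500),
    Plan("DuraCloud Enterprise Plus", 5250, 625),
    Plan("DuraCloud Enterprise Plus (Option 2)", 5550, 1200),
    Plan("DuraCloud Enterprise Chronopolis", 2750, 500),
]


def estimated_cost(plan: Plan, terabytes: int) -> int:
    """Assumed model: the subscription fee plus the per-TB rate for every TB."""
    if terabytes < 0:
        raise ValueError("terabytes must be non-negative")
    return plan.subscription_fee + plan.storage_per_tb * terabytes


if __name__ == "__main__":
    for plan in PLANS:
        print(f"{plan.name}: ${estimated_cost(plan, 10):,} for 10 TB")
```

Under this assumed model the lower subscription fee matters most for small collections, while the per-TB rate dominates as the collection grows.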
Monday, November 16, 2015
Background reading list for Designing Storage Architectures for Digital Collections
Background reading list. Designing Storage Architectures for Digital Collections. Library of Congress. September 9, 2015.
A list of items that may be representative of materials and projects related to the meeting topics. They might be useful to provide context for the meeting topics:
- Barr, J. (2015). Create Write-Once-Read-Many Archive Storage with Amazon Glacier.
- Degioanni, L. (2015). Visualizing AWS Storage with Real-Time Latency Spectrograms.
- Google Cloud Storage pricing.
- Google Memorial (abandoned Google projects).
- Meza, J.; Wu, Q.; Kumar, S.; Mutlu, O. (2015). A Large-Scale Study of Flash Memory Failures in the Field. Carnegie Mellon University.
- Rosenthal, D. (2015). EE380 talk on eBay storage.
- Shacklett, M. (2015). Breaking New Ground: 10 Key Improvements in Enterprise Storage.
- Turney, D. (2015). To Flash or Not to Flash in the Data Centre.
- The Tail at Scale: Achieving Rapid Response Times in Large Online Services (Dean) - RICON West 2013. (2013). Video.
Thursday, September 17, 2015
AWS Storage Update: New Lower Cost S3 Storage Option & Glacier Price Reduction
AWS Storage Update: New Lower Cost S3 Storage Option & Glacier Price Reduction. Jeff Barr. Amazon. 16 September 2015.
Changes in Amazon pricing and storage options: Amazon is adding a new storage class for infrequently accessed data, S3 Standard – Infrequent Access (Standard – IA), alongside Standard and Glacier. The new class has all of the existing S3 features: security and access management, data life-cycle policies, cross-region replication, and event notifications.
Prices for Standard – IA start at $0.0125 / gigabyte / month, with a 30 day minimum storage duration and a $0.01 / gigabyte charge for retrieval (plus transfer and request charges). Objects smaller than 128 kilobytes are charged for 128 kilobytes of storage. Data life-cycle policies can be defined to move data between Amazon S3 storage classes over time.
Also, the price of Amazon Glacier storage has decreased by up to 36%, depending on the region; it is now available for as little as $0.007/GB per month.
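The 128 kilobyte minimum means that a collection of many small files costs more in Standard – IA than its raw size suggests. The following is a minimal sketch of that arithmetic using the starting prices quoted above. The function names are hypothetical, binary units (1 GB = 1024³ bytes) are an assumption, and the 30 day minimum duration and the transfer and request charges are not modeled.

```python
MIN_BILLABLE_BYTES = 128 * 1024  # objects under 128 KB are billed as 128 KB
BYTES_PER_GB = 1024 ** 3         # assumption: binary gigabytes
STORAGE_RATE = 0.0125            # dollars per GB per month (starting price)
RETRIEVAL_RATE = 0.01            # dollars per GB retrieved


def billable_bytes(object_sizes):
    """Total billed size in bytes after applying the per-object minimum."""
    return sum(max(size, MIN_BILLABLE_BYTES) for size in object_sizes)


def monthly_storage_cost(object_sizes):
    """Estimated monthly storage charge in dollars for the given object sizes."""
    return billable_bytes(object_sizes) / BYTES_PER_GB * STORAGE_RATE


def retrieval_cost(bytes_retrieved):
    """Estimated retrieval charge in dollars, excluding transfer and requests."""
    return bytes_retrieved / BYTES_PER_GB * RETRIEVAL_RATE


if __name__ == "__main__":
    # A 1 KB file and a 200 KB file: the first is billed as 128 KB.
    sizes = [1024, 200 * 1024]
    print(billable_bytes(sizes), "billable bytes")
    print(f"${monthly_storage_cost([BYTES_PER_GB] * 100):.2f} per month for 100 GB")
```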
Monday, June 08, 2015
New Sources and Storage Options For Rosetta
Rosetta Users Group 2015: New Sources and Storage Options For Rosetta. Chris Erickson. June 3, 2015. [PDF slides]
This is my presentation from the Rosetta Users Group / Advisory Group meeting held this past week.
We installed Rosetta in March 2012 and have ingested a number of collections in the preservation repository. In addition to those sources and collections we have already set up to work with Rosetta, we have been working with some new areas. These include:
- University Academic electronic records in SharePoint. Rosetta harvest tool for SharePoint
- New Library repositories, such as Digital Commons
- Harvest Canon camera raw images and ingest into Rosetta
- Ingest University videos and digitized files from the Audio Digitization project
- Program to gather information from Unstructured folders of archival objects and ingest into Rosetta
One of the things that everyone is struggling with is the lack of sufficient storage. Conventional storage is expensive and limited. So we are investigating alternative long term storage possibilities:
- Develop a Proof of Concept Project to ingest Rosetta content in DPN
- Amazon S3 Cloud Storage from Rosetta. Easy to connect to Rosetta.
- Hitachi-LG Data Storage (HLDS) Optical Archive System
  - Long term storage system
  - Lowest storage cost
  - Single rack unit holds 1 PB of permanent storage; unlimited expansion
  - Currently testing with our Rosetta system
  - Reduce the need for refreshing or migrating content
  - Plan to use with Millenniata M-Discs in the archive
What we are looking for in preservation storage:
- Sufficient capacity for our ever increasing content
- Reasonable long term cost
  - Lower total costs of ownership
  - Reduced cost of refreshing or migration
- Reliable and recoverable
  - Archival media
  - Industry Storage Partner
  - Multiple copies, locations
  - Secure storage
- Accessible
  - Rosetta
  - Network
Thursday, February 19, 2015
ArchivesDirect hosted service
ArchivesDirect website. February 18, 2015.
ArchivesDirect is a web-based hosted Archivematica service offered by DuraSpace for creating OAIS-based digital preservation workflows, with content packages that are archived in DuraCloud and Amazon Glacier. It includes open source preservation tools and generates archival packages using microservices, PREMIS, and METS XML files. ArchivesDirect is intended for small to mid-sized institutions. DuraSpace is a partnership with DSpace, Fedora, and VIVO.
Pricing and subscription plans include:
ArchivesDirect Standard (System, training, 1 TB): $11,900
ArchivesDirect Digital Preservation Assessment: $4,500
Additional Storage in Amazon S3 and Glacier: $1,000/TB/year
Wednesday, February 18, 2015
Rosetta and Amazon Storage
Rosetta and Amazon Storage. Chris Erickson. February 2015.
In the search for more file storage, as well as more affordable file storage, we tried Amazon Simple Storage Service (Amazon S3). The plan was to connect the Rosetta Digital Preservation System to the Amazon cloud storage and evaluate it as a possible storage solution. There is a free trial: the Free Tier includes 5 GB of storage, 20,000 Get Requests, and 2,000 Put Requests.
Setup:
I tried various configurations, but decided on a single bucket for the files. I set up buckets for the IEs and metadata, but after trying it, decided to keep only the files on Amazon. That would keep the metadata local. I had tried nested folders, but couldn't figure out how to designate them in the storage rules and definition, so I created the folders by time period.
In the Rosetta Admin interface I created a File storage group using the S3 storage plugin, entered the Bucket name, Secret Access Key, and Access Key ID, and left the Maximum waiting time at the default. For the test, I set up a retention code for Amazon, and the storage rule used that code to determine what went to the Amazon storage. In a real storage instance, it would be better to use something that would not change, such as the producer.
It took a few tests to get everything in sync. The result was that Rosetta stored the content in Amazon just fine. I also tried adding content with a one day retention period, and the content was removed from Amazon after one day. A fixity check task was also able to work without a problem.
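For readers unfamiliar with fixity checking, the idea is to recompute a file's checksum and compare it with the value recorded at ingest. The sketch below is a generic illustration using Python's standard `hashlib`, not Rosetta's own fixity task. The function names are hypothetical, and SHA-256 is an assumed choice of algorithm.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_hex):
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of(path) == expected_hex.lower()
```

A file retrieved from cloud storage can be checked the same way against the checksum stored in the local metadata.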
This gives us another storage option, though we decided not to use it at present.
Pricing at the time of this comparison was:
Digital Storage Costs | 1 TB: Annual Cost | 1 TB: 20 Year Projected | 50 TB: Yearly Charge | 50 TB: 10 Year Projected | 50 TB: 20 Year Projected | 50 TB: 50 Year Projected |
Cloud Storage | | | | | | |
Amazon S3 - Regular | $360 | $7,200 | $17,706 | $177,060 | $354,120 | $885,300 |
Amazon S3 - Copy / Glacier | $480 | $9,600 | $23,706 | $237,060 | $474,120 | $1,185,300 |
Amazon S3 - Reduced & Glacier | $288 | $5,760 | $14,165 | $141,648 | $283,296 | $708,240 |
DuraSpace - Preservation | $1,800 | $36,000 | $36,100 | $361,000 | $722,000 | $1,805,000 |
DuraSpace - Dark copy/Glacier | $1,925 | $38,500 | $42,350 | $423,500 | $847,000 | $2,117,500 |
DuraSpace - Enterprise Plus | $5,625 | $112,500 | $64,425 | $644,250 | $1,288,500 | $3,221,250 |
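The projected columns in the table appear to be the yearly charge multiplied by the number of years, with no allowance for price changes or inflation (for example, $360 × 20 = $7,200). A minimal sketch of that flat-rate projection follows; the function name is hypothetical and the sample figures are copied from the 50 TB columns of the table.

```python
def projected_cost(yearly_charge, years):
    """Flat-rate projection: assumes the yearly charge never changes."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return yearly_charge * years


# Yearly charges for 50 TB, copied from the table above.
YEARLY_50TB = {
    "Amazon S3 - Regular": 17706,
    "Amazon S3 - Copy / Glacier": 23706,
    "DuraSpace - Preservation": 36100,
}

if __name__ == "__main__":
    for name, yearly in YEARLY_50TB.items():
        row = ", ".join(f"{n} yr: ${projected_cost(yearly, n):,}" for n in (10, 20, 50))
        print(f"{name}: {row}")
```

Because storage prices have historically changed over time, a flat projection like this is best read as a rough comparison between options rather than a forecast.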
More storage options will be considered.