Showing posts with label storage. Show all posts

Wednesday, May 29, 2019

Digital Data Storage Outlook 2019

Digital Data Storage Outlook 2019. SpectraLogic, May 2019. [Download]
    The fourth annual Data Storage Outlook report from Spectra Logic looks at the management, usage and storage of data. Some notes on data:
  • A 2018 IDC report predicts that the Global Datasphere will grow to 175 zettabytes (ZB) of digital data by 2025, though it projects that much of this data will never be stored, or will be retained only briefly. The amount of “stored” digital data is a smaller subset.
  • While there will be great demand for storage, slowing advances in any one technology, such as magnetic disk, will mean greater use of other storage media such as flash and tape.
  • Increasing scale, level of collaboration, and diverse workflows are moving users from traditional file-based storage to object / web storage. Rather than attempting to force all storage into a single model, a sensible combination of both is the best approach.
  • There is a need for project assets to be shared across a team so they can be the basis for future work. An example is video footage that needs to be used by teams of editors, visual effects, audio editing, music scoring, color grading, and more. 
  • The lifetime of raw assets is effectively forever, and they may be migrated across storage technologies many times.

Monday, March 11, 2019

Arctic World Archive receives more world treasures

Arctic World Archive receives more world treasures. Press release. 21 February 2019.
     Institutions and companies from around the world, including Utah Valley University, have deposited their digital content in the Arctic World Archive in Svalbard, Norway.  The Archive is a repository for world memory where the data will last for centuries.  The Archive is a collaboration between Piql, digital preservation specialists, and Store Norske Spitsbergen Kulkompani (SNSK), a state-owned Norwegian mining company based on Svalbard with vast experience and resources to build and maintain mountain vaults.

The top 10 items of cultural heritage, as nominated by the public, were also stored away for the future. These items include famous religious texts, paintings, architectural designs, science breakthroughs and popular contemporary music.



Wednesday, March 06, 2019

Texas Digital Library Digital Preservation Services

Texas Digital Library Digital Preservation Services. Press release. Texas Digital Library, 5 March 2019. [PDF]
     The organization now offers Digital Preservation Services to help its members, stewards of Texas cultural heritage and scholarship, provide long-term access through direct consulting, training, and workflow support, including the right combination of technologies for each member's content needs. Content can be stored in multiple geographically dispersed locations, with fixity checking, using Chronopolis and Amazon through the DuraCloud interface.

Saturday, August 19, 2017

IBM and Sony cram up to 330 terabytes into tiny tape cartridge

IBM and Sony cram up to 330 terabytes into tiny tape cartridge. Sebastian Anthony. Ars Technica UK. August 2, 2017.
     IBM and Sony have developed a new magnetic tape system capable of storing 201 gigabits of data per square inch, or approximately 330 terabytes in a single palm-sized cartridge. To achieve this density, Sony developed a new type of tape that has a higher density of magnetic recording sites, and IBM Research developed new heads and signal processing technology to process the data from the "nanometre-long patches of magnetism". The new cartridges and tape drives, "when eventually commercialised, will be significantly more expensive because of the tape's complex manufacturing process."
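The quoted figures can be sanity-checked with simple arithmetic. A minimal sketch, assuming half-inch tape width and decimal units (neither is stated in the article):

```python
# Back-of-the-envelope check of the quoted figures. Assumptions (not from
# the article): half-inch tape width, decimal terabytes/gigabits.
DENSITY_BITS_PER_SQ_IN = 201e9   # 201 gigabits per square inch
CAPACITY_BYTES = 330e12          # 330 TB per cartridge
TAPE_WIDTH_IN = 0.5              # assumed half-inch tape

# Recording area implied by capacity and areal density
area_sq_in = CAPACITY_BYTES * 8 / DENSITY_BITS_PER_SQ_IN
# Tape length implied by that area at the assumed width, in metres
length_m = area_sq_in / TAPE_WIDTH_IN * 0.0254

print(f"Recording area: {area_sq_in:,.0f} sq in")
print(f"Implied tape length: {length_m:,.0f} m")
```

Under these assumptions the cartridge would hold roughly 13,000 square inches of recording surface, on the order of 650–700 metres of tape.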


Saturday, March 04, 2017

What Do IT Decision Makers Want?

What Do IT Decision Makers Want? Tom Coughlin. Forbes. March 1, 2017.
     An article that looks at a study of over 1,200 senior IT decision makers in 11 countries. Some findings:

  • The vast majority of those surveyed have revised their storage strategy in the last 12 months because of frustrations with storage costs, performance, complexity and fragmentation of existing solutions. 
  • 60% say storage expenses are under increased scrutiny 
  • 95% are interested in the scalability and efficiency of software-defined storage. 
  • Digital storage is about 7% of the total IT budget.
  • Some concerns: 
    • High costs: 
      • 80% were concerned with the cost of their storage system
      • 92% worry about managing storage costs as capacity needs grow. 
      • On average 70% of IT budgets are allocated to data storage 
    • Performance: 
      • 73% are concerned with the performance of their existing storage solution. 
    • Growing complexity and fragmentation: 
      • 71% of respondents said storage systems were complex and highly fragmented.  
  • Software-defined storage [which involves separating the storage capabilities and services from the storage hardware]  is playing significant roles in improving the utilization of storage resources and stretching storage budgets.

Tuesday, January 31, 2017

20 TB Hard Disk Drives, The Future Of HDDs

20 TB Hard Disk Drives, The Future Of HDDs. Tom Coughlin. Forbes. January 28, 2017.
     Interesting article on the status and future of hard drives. It looks at the declining market and the trends for hard disk drives over the next few years. Overall drive shipments dropped about 9.4% in 2016, to 424 million drives. Of the total HDDs shipped in 2016:
  • Western Digital shipped 41% 
  • Seagate shipped 37%  
  • Toshiba shipped 22%.
"The long-term future of HDDs likely rests with high capacity HDDs, particularly in data centers serving cloud storage applications".  Seagate plans to ship 14 and 16 TB drives in the next 18 months, and possibly 20 TB drives in the next three years.

Thursday, October 20, 2016

Digital Storage In Space Rises Above The Cloud

Digital Storage In Space Rises Above The Cloud.  Tom Coughlin. Forbes. October 13,  2016.
     A start-up company, Cloud Constellation, plans to build an array of Earth-orbiting data center satellites: a space-based infrastructure for cloud service providers, offering a private network that communicates directly to and from the satellites via tight-beam radio, with no traffic (and no public data transmission headers) crossing the Internet. The company says that latencies will be lower than those of conventional Internet transmission.

The digital storage in these orbiting data centers will be solid-state drives, and the internal temperature inside the satellites will be kept at about 70 degrees Fahrenheit. The budget to build the initial phase of this satellite network is estimated at $400 M, much less than the cost of building an equivalent terrestrial global data center network with the same level of security. Data is encrypted on the way to the satellite chain, inside the satellite storage, and when transmitted back to earth. This should provide secure storage and transport of data without interruption or exposure on public networks. It could protect critical and sensitive data for potential clients, including university archives and libraries. The first phase is planned to be operational in 2018 or 2019. Soon many companies and organizations will have an option to store their data securely in outer space.

Monday, October 10, 2016

Secure cloud doesn’t always mean your stuff in it is secure too

Secure cloud doesn’t always mean your stuff in it is secure too. Gareth Corfield. The Register. 6 Oct 2016.
     Workflows are moving to the cloud and security technology is helping to build customer confidence. “Picking a secure cloud partner is not as trivial as it may seem. Don't assume that because the cloud is secure, your business within the cloud is secure."  The public cloud can provide  better security monitoring and analysis, management, redundancy and resilience. But you have to choose a secure cloud platform. Microsegmentation can help secure the platform against malware and other security threats. It helps to improve operational efficiency. The cloud provides many services, more than just storage.

Wednesday, October 05, 2016

How many copies are needed for preservation?

How many copies are needed for preservation? Chris Erickson. 4 October 2016.
     An important component for preservation is to have multiple copies. The specific questions are: how many copies, how should they be stored, and where should they be located. Many people advocate the 3-2-1 rule for digital storage: three copies, stored on two different media, and one copy located off-site, preferably in areas with different disaster threats. (NARA; Library of Congress) The NDSA levels also incorporate this rule in the storage section.

The copies we have been looking at are:
     Copy 1: Rosetta storage on spinning disk in the campus data center
     Copy 2: Tape copies of our archive in the Granite Mountain Record Vault.
                    Annual Tape archive plus incremental transactional backups
     Copy 3: Internet copy, with DPN or Amazon Glacier
     Copy 4: Access copy within Special Collections on M-Discs or our CMS
What we choose to put in DPN will affect the third copy. We need to determine if these copies are adequate, and if not, then find different storage methods that are cost effective and fit within our workflow.
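The 3-2-1 rule lends itself to a mechanical check of a copy inventory. A minimal sketch; the inventory mirrors the four copies listed above, but the field names and values are illustrative, not an actual Rosetta configuration:

```python
# Validate a copy inventory against the 3-2-1 rule:
# at least 3 copies, on 2+ distinct media, with 1+ offsite.
# The inventory below is illustrative, mirroring the four copies above.
copies = [
    {"name": "Rosetta copy on disk",  "medium": "disk",    "offsite": False},
    {"name": "Tape in mountain vault","medium": "tape",    "offsite": True},
    {"name": "DPN / Glacier copy",    "medium": "cloud",   "offsite": True},
    {"name": "M-Disc access copy",    "medium": "optical", "offsite": False},
]

def satisfies_3_2_1(copies):
    """True if the inventory meets the 3-2-1 rule."""
    media = {c["medium"] for c in copies}
    offsite = sum(1 for c in copies if c["offsite"])
    return len(copies) >= 3 and len(media) >= 2 and offsite >= 1

print(satisfies_3_2_1(copies))  # the four copies above satisfy the rule
```

A check like this can be re-run whenever a storage location is added or retired, for example when deciding what the DPN copy replaces.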




Thursday, September 22, 2016

Content Delivery Drives The Move To The Cloud

Content Delivery Drives The Move To The Cloud. Tom Coughlin. Forbes. Sep 13, 2016.
     The growing reliance on the Internet is also increasing cloud-based services for collaborative workflows and content delivery in the Media and Entertainment Industry. This is causing a shift from capital expenses to operating expenses for media and entertainment content storage. Cloud storage for the media and entertainment industry is projected to grow from $2.5 billion in 2016 to over $20 billion by 2021. Archiving and preservation is a large part of this, as a chart in the article shows.

Friday, September 02, 2016

TRAC Certified Long-term Digital Preservation: DuraCloud and Chronopolis for Institutional Treasures

TRAC Certified Long-term Digital Preservation: DuraCloud and Chronopolis for Institutional Treasures. Website. 1 September 2016.
     "An institution’s identity is often formed by what it saves for current and future access. Digital collections curated by the academy can include research data, images, texts, reports, artworks, books, and historic documents help define an academic institution’s identity."

DuraSpace and the Chronopolis service at the University of California, San Diego announce the DuraCloud Enterprise Chronopolis subscription plan for digital preservation. It stores digital content with Amazon and in the Chronopolis network, provides geographic replication and synchronization of content across three storage locations, and monitors content integrity in a dark storage option. Plan options combine Amazon S3, Amazon Glacier, and SDSC storage.

Pricing and plan details:
  Plan                                  Subscription Fee   Storage
  DuraCloud Preservation                $1,175             $700/TB
  DuraCloud Preservation Plus           $1,175             $825/TB
  DuraCloud Enterprise                  $5,250             $500/TB
  DuraCloud Enterprise Plus             $5,250             $625/TB
  DuraCloud Enterprise Plus (Option 2)  $5,550             $1,200/TB
  DuraCloud Enterprise Chronopolis      $2,750             $500/TB (ingest and retrieval fees extra)

Thursday, May 19, 2016

One Billion Drive Hours and Counting: Q1 2016 Hard Drive Stats

One Billion Drive Hours and Counting: Q1 2016 Hard Drive Stats. Andy Klein. Backblaze. May 17, 2016.
     Backblaze reports statistics for the first quarter of 2016 on 61,590 operational hard drives used to store encrypted customer data in its data center. The hard drives in the data center, past and present, total over one billion hours in operation to date. The data in these hard drive reports has been collected since April 10, 2013, and the website shows statistical reports of drive operations and failures for every year since then. The report lists the drives (and drive models) by manufacturer, the number in service, the time in service, and failure rates. The drives in the data center come from four manufacturers; most are from HGST and Seagate. Notes:
  • The overall Annual Failure Rate of 1.84% is the lowest quarterly number Backblaze has ever seen.
  • The Seagate 4 TB drive leads in “hours in service”.
  • The early HGST drives, especially the 2 TB and 3 TB models, have lasted a long time and have provided excellent service over the past several years.
  • HGST has the most hours in service overall.


IBM Scientists Achieve Storage Memory Breakthrough

IBM Scientists Achieve Storage Memory Breakthrough. Press release. 17 May 2016.
     IBM Research demonstrated reliably storing 3 bits of data per cell using phase-change memory. This technology doesn't lose data when powered off and can endure at least 10 million write cycles, compared to 3,000 write cycles for an average flash USB stick. This provides "fast and easy storage" to capture the exponential growth of data.


Monday, April 18, 2016

Calculating All that Jazz: Accurately Predicting Digital Storage Needs Utilizing Digitization Parameters for Analog Audio and Still Image Files

Calculating All that Jazz: Accurately Predicting Digital Storage Needs Utilizing Digitization Parameters for Analog Audio and Still Image Files. Krista White. ALCTS. 14 Apr 2016.

  The library science literature does not show a reliable way to calculate digital storage needs when digitizing analog materials such as documents, photographs, and sound recordings in older formats. "Library professionals and library assistants who lack computer science or audiovisual training are often tasked with writing digital project proposals, grant applications or providing rationale to fund digitization projects for their institutions." Digital project managers need tools to accurately predict the amount of storage for digital objects, and also to estimate the starting and ongoing costs for that storage. This paper provides two formulae for calculating digital storage space for uncompressed, archival master image and document files and sound files.

Estimates from earlier sources:
  • thirty megabytes of storage for every hour of compressed audio
  • one megabyte for a page of uncompressed, plain text (bitmap format)
  • three gigabytes for two hours of moving image media
  • 90 megabytes for uncompressed raster image files
  • 600 megabytes for one hour of uncompressed audio recording
  • “nearly a gigabyte of disk space” for one minute of uncompressed digital video
  • 100 gigabytes (GB) of storage for 100 hours of audio tape
  • These can be adjusted to alter both file size and quality, depending on the choice of digitization standard, the combination of variables used in a chosen standard, and the quantity of digital storage required.
Some additional notes from the article:
  • As the experiments demonstrate, the formulae for still image and audio recordings are extremely accurate. They will prove invaluable to digital archivists, digital librarians and the average user in helping to plan digitization projects, as well as in evaluating hardware and software for these projects. 
  • Digital project managers armed with the still image and audio formulae will be able to calculate file sizes using different standards to determine which standard will suit the project needs. 
  • Knowing the parameters of the still image and audio formulae will allow managers to evaluate equipment on the basis of the flexibility of the software and hardware before purchase. 
  • Using the still image and audio calculation formulae in workflows will help digital project managers create more efficient project plans and tighter grant proposals. 
  • The parameters of the audio formula: length of the original recording, sampling rate, bit depth, and number of audio channels.
  • The article also gives a formula for calculating file sizes of uncompressed, still images.

https://journals.ala.org/lrts/article/view/5961/7582

One of the tables in the article compares calculated file sizes against actual sizes.
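The standard arithmetic behind such formulae can be sketched in a few lines. The functions and example parameters below are illustrative, using the common uncompressed-raster and PCM-audio calculations rather than the article's exact notation:

```python
# Uncompressed file-size arithmetic (a sketch, not the article's notation).

def image_bytes(width_px, height_px, bit_depth):
    """Uncompressed raster image: total pixels x bits per pixel, / 8."""
    return width_px * height_px * bit_depth // 8

def audio_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Uncompressed PCM audio: rate x depth x channels x duration, / 8."""
    return sample_rate_hz * bit_depth * channels * seconds // 8

# An 8x10-inch photograph scanned at 600 ppi in 24-bit colour:
print(image_bytes(8 * 600, 10 * 600, 24))   # ~86.4 million bytes
# One hour of CD-quality audio (44.1 kHz, 16-bit, stereo):
print(audio_bytes(44_100, 16, 2, 3_600))    # ~635 million bytes
```

These example values land close to two of the earlier estimates listed above: roughly 90 megabytes for an uncompressed raster image and about 600 megabytes for an hour of uncompressed audio.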

Saturday, April 09, 2016

A DNA-Based Archival Storage System

A DNA-Based Archival Storage System. James Bornholt, et al. ACM International Conference. April 6, 2016.
    This paper presents an architecture for a DNA-backed archival storage system. "Demand for data storage is growing exponentially, but the capacity of existing storage media is not keeping up." All data worldwide is expected to exceed 16 zettabytes in 2017. For some, DNA is a possible storage medium because it is extremely dense. Most data today is stored on magnetic and optical media, but durability is another critical aspect of archiving: spinning disks are "rated for 3–5 years, and tape is rated for 10–30 years."

A DNA storage system must overcome several challenges:
  1. DNA synthesis and sequencing is far from perfect, with error rates on the order of 1% per nucleotide. Stored sequences can also degrade over time, compromising data integrity. 
  2. Randomly accessing data in DNA-based storage results in high read latency; existing work requires the entire DNA pool to be sequenced and decoded. 
  3. Current synthesis technology does not scale: data beyond the hundreds of bits therefore cannot be synthesized as a single strand of DNA, and isolating only the molecules of interest is non-trivial.
The authors believe DNA storage is worth serious consideration and envision it as "the very last level of a deep storage hierarchy, providing very dense and durable archival storage with access times of many hours to days." It has the potential to be the ultimate archival medium because it is extremely dense and durable, but it is not yet practical given the current state of DNA synthesis and sequencing.
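The first challenge above, the ~1% per-nucleotide error rate, can be illustrated with a simple calculation. This assumes independent errors, a rough model rather than the paper's error analysis:

```python
# Probability that a DNA strand decodes with no errors, assuming
# independent per-nucleotide errors (a rough model, not the paper's).
def p_error_free(n_nucleotides, per_nt_error=0.01):
    return (1 - per_nt_error) ** n_nucleotides

print(f"{p_error_free(100):.3f}")  # ~0.366 for a 100-nucleotide strand
```

At a typical strand length of around 100 nucleotides, only about a third of strands would decode error-free, which is why redundant encoding and error correction are essential to any DNA storage design.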

Thursday, March 31, 2016

Floppy disks and modern gadgets: Keeping a safe distance

Floppy disks and modern gadgets: Keeping a safe distance.  Isaiah Beard. Page2Pixel. March 25th, 2016.
     In preserving older, born-digital documents and data, a common situation is that people seek help migrating data from old floppy disks, and sometimes they are not careful about what they put next to the disks. People who used floppy disks and other magnetic media understood the need to keep them away from anything that generated magnetic fields. Data on a floppy disk is stored magnetically, which makes the disks very sensitive to magnetic and electromagnetic fields. Today's storage media, such as USB drives, memory cards, and optical discs, are not susceptible to magnetic fields, and cell phones, tablets, and other equipment often have strong magnets in them. It is "important to remind people that, should they come across an old floppy disk, and they would like to save the data, they must be careful where they place it, and what newer devices come into contact with it.  Old floppies should be kept as far away from strong magnets as possible.  And smartphones, tablets and even modern laptops shouldn’t come within 6 inches of floppies or any other magnetic media that could be easily erased."  In addition, 3.5" floppy drives are becoming harder to find.

Friday, March 11, 2016

How Digital Storage Is Changing the Way We Preserve History

How Digital Storage Is Changing the Way We Preserve History. Arielle Pardes. Vice. February 19, 2016.
     The article starts with an account of a digital diary platform called Oh Life; after the site was shut down, thousands of archives were deleted and years of personal history were gone. Digital disappearance like this is a warning sign to historians of problems to come with recording and preserving our history in the digital age. Digital storage is fragile, and files can easily be lost or locked up in encryption. Digital technology might not be around tomorrow, and many information storage platforms are owned by private companies, which makes it harder for archival institutions to save them. Abby Smith Rumsey has tried to work out how to store digital materials over the long term, and discusses concerns and possible future solutions for our digital age.
  • In the digital age, there's a lot circulating in the way of information, but none of it is kept very thoroughly. 
  • Technically, we don't know how to preserve it yet. Even more than that, what do we preserve? How do we know what's valuable?
Entire digital archives can vanish if the storage platform, technology, or software disappears. Many of the websites we use are owned by private companies and individuals do not own the content. We won't know for a while if the content we have saved is valuable in the future. "The more the mind can be freed of certain types of memory tasks, the freer the mind is to engage in other activities that machines cannot do for us."

Wednesday, February 17, 2016

Eternal 5D data storage could record the history of humankind

Eternal 5D data storage could record the history of humankind. Press Release. Optoelectronics Research Centre, University of Southampton. February 16, 2016.
     Scientists have developed recording and retrieval processes for five-dimensional (5D) digital data storage, using femtosecond lasers, that may be capable of storing digital data for billions of years on nanostructured glass. "The storage allows unprecedented properties including 360 TB/disc data capacity, thermal stability up to 1,000°C and virtually unlimited lifetime at room temperature (13.8 billion years at 190°C)". This encoding on the ‘Superman memory crystal’ is in five dimensions: the size and orientation of the nanostructures, in addition to their three-dimensional position.


Monday, February 08, 2016

Keep Your Data Safe

Love Your Data Week: Keep Your Data Safe. Bits and Pieces.  Scott A Martin. February 8, 2016.
     The post reflects on a 2013 survey of 360 respondents:
  • 14.2% indicated that a data loss had forced them to re-collect data for a project.  
  • 17.2% indicated that they had lost a file and could not re-collect the data.
If this is indicative of  the total population of academic researchers, then there is a lot of lost research time and money due to lost data. Some simple guidelines can greatly reduce the chances of catastrophic loss if steps are included in your own research workflow:
  1. Follow the 3-2-1 rule for backing up your data: store at least 3 copies of each file (1 working copy and 2 backups), keep them on 2 different storage media, and keep at least 1 copy offsite 
  2. Perform regular backups
  3. Test your backups periodically
  4. Consider encrypting your backups.  Just make sure that you’ve got a spare copy of your encryption password stored in a secure location!  
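Step 3 above, testing backups, can be sketched as a checksum comparison: record a SHA-256 digest when the backup is made, then re-hash later and compare. The function names and workflow here are illustrative, not a prescribed tool:

```python
# Minimal sketch of periodic backup testing via SHA-256 checksums.
# Record a digest when the backup is made; later, re-hash and compare.
import hashlib

def sha256_of(path):
    """Hash a file in chunks so large backups don't need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path, expected_digest):
    """True if the backup file still matches its recorded checksum."""
    return sha256_of(path) == expected_digest
```

In practice the recorded digests would live in a manifest stored alongside (and separately from) the backups, and verification would be scheduled rather than run by hand.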

Saturday, February 06, 2016

MRF for large images

MRF for large images. Gary McGath. Mad File Format Science Blog. January 21, 2016.
NASA, Esri speed delivery of cloud-based imagery data. Patrick Marshall. GCN. Jan 20, 2016.
     NASA and Esri are releasing to the public a jointly developed raster file format and compression algorithm designed to deliver large volumes of image data from cloud storage. The format, called MRF (Meta Raster Format), together with a patented compression algorithm called LERC, can deliver online images ten to fifteen times faster than JPEG2000. The MRF format breaks files into three parts which can be cached separately; the metadata files can be stored locally so users can "examine data on file contents and download the data-heavy portions only when needed". This helps minimize the number of files transferred. The compression gives users faster performance and lower storage requirements, and they estimate cloud storage costs would be about one-third those of traditional file-based enterprise storage. An implementation of MRF from NASA is available on GitHub, and an implementation of LERC is on GitHub from Esri.