Tuesday, December 29, 2015

Storage For The Next 5,000 Years

Storage For The Next 5,000 Years. Tom Coughlin. Forbes. Dec 15, 2015.
     We are creating as much information annually as mankind generated from the beginning of civilization to a few years ago. Some of the data is temporary while other data has longer-term value and may be useful in the future. As we generate and save more data, the question is whether we can actually keep the data for the long term in the face of hardware or format obsolescence. "But even if data is transferred from older formats/media to modern formats regularly natural processes driven mostly by thermal energy can destroy data over time. The longer the data is kept the greater the chance of data corruption".

Keeping data for a long time can be expensive and requires management and multiple copies of data on different hardware. While large organizations with valuable content can afford to protect their data, smaller organizations or consumers will find it difficult, though one way may be to move the data to managed cloud storage data centers where it can be managed by professionals. "Carrying data into the far future will require careful management of data to support multiple copies and continuous detection and elimination of data corruption". On-line archives may be able to provide access and archived data.
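
As an illustration of "multiple copies and continuous detection and elimination of data corruption", here is a minimal Python sketch that compares several copies of the same object by checksum and flags any copy that disagrees with the majority. The function names and the majority-vote rule are assumptions made for this example; they are not taken from the article.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a byte string as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_corrupt_copies(copies: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Flag copies whose digest disagrees with the majority digest.

    Returns (majority_digest, names_of_disagreeing_copies). If no digest is
    held by a strict majority of the copies, the result is (None, all names),
    because there is no basis for deciding which copy is the good one.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    digests = {name: sha256_hex(data) for name, data in copies.items()}
    digest, count = Counter(digests.values()).most_common(1)[0]
    if count * 2 <= len(copies):
        return None, sorted(copies)
    return digest, sorted(name for name, d in digests.items() if d != digest)


# Three copies held at hypothetical sites; one has a simulated flipped character.
copies = {
    "site_a": b"annual report 2015",
    "site_b": b"annual report 2015",
    "site_c": b"annual rep0rt 2015",
}
good_digest, suspect = find_corrupt_copies(copies)
print(suspect)
```

A copy flagged this way would then be replaced from one of the agreeing copies. With only two copies a mismatch can be detected but not resolved, which is one practical reason for keeping at least three.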

Wednesday, December 23, 2015

Personal Digital Archiving

Personal Digital Archiving.  Gabriela Redwine. DPC Technology Watch Report 15-01. December 2015. [PDF]
     This excellent report looks at some of the key challenges people face in managing and storing their digital files. It "stresses the importance of preserving personal files" that include physical, digitized and born-digital materials. The term ‘personal digital archiving’ or ‘Save your digital stuff!’ refers to how people keep track of their digital files, where they store them, and how the files are described and organized.

The report reviews the archiving issues and offers guidance and resources to help individuals be proactive and save their digital content. It also argues for the "importance and urgency of preserving personal files while also acknowledging the difficulty of managing digital media and files". Personal items increasingly exist only in digital format. "This brings a new understanding of what letters, photos and other sources look like in the digital age, and raises important questions about how to manage these personal items today and how to preserve them for future generations."

"Thinking of a personal collection of digital files as ‘archives’ places emphasis on the larger context within which those digital files exist. The records of people’s lives are intrinsically important and worth preserving." Social media archiving necessarily requires a considerable investment of resources so it is important to choose which social media services should be archived.  Some key threats to a personal digital archive:
  • old hardware and software
  • lack of secure storage and backup 
  • natural and man-made disasters 
  • neglect of content
  • loss of cloud-based host or service provider
  • lack of planning
  • death of an individual
The report lists recommendations (quick wins, more effort, maximum effort) for the threats listed. Some of these are:

Recommendations: addressing key threats to personal digital files
  • Choose software that is well supported and creates files that can be read by a variety of different programs.
  • Develop file naming conventions that are easy to remember and apply these consistently.
  • Create multiple back-up copies and store them in different geographical locations.
  • Test your back-up copies to make sure they are accessible and contain what you intend them to.
  • Transfer files to new media every 2 to 4 years.
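
The advice to test back-up copies can be partly automated. The following Python sketch, using only the standard library, builds a checksum manifest for an original folder and for its back-up, then reports files that are missing, changed, or unexpected in the back-up. The function names and the report layout are assumptions for illustration and do not come from the report.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_backup(original: Path, backup: Path) -> Dict[str, List[str]]:
    """Compare a back-up folder against the original folder."""
    src, dst = build_manifest(original), build_manifest(backup)
    return {
        "missing": sorted(set(src) - set(dst)),   # in original, not in back-up
        "changed": sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k]),
        "extra": sorted(set(dst) - set(src)),     # in back-up, not in original
    }
```

Calling `verify_backup(Path("photos"), Path("/mnt/backup/photos"))` (both paths are hypothetical) returns three lists; a back-up passes the check when "missing" and "changed" are both empty. Matching checksums show only that the bytes match, so it is still worth opening a few files to confirm they are readable.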

Recommendations: taking good care of a personal digital archive
  • Choose high-quality storage media and refresh it regularly.
  • Be proactive about refreshing storage media, replacing outdated equipment before it fails, and not relying exclusively on one service provider or storage solution.
  • Follow best practice when naming files.

With digital preservation, and especially with creating and maintaining a personal digital archive, the hardest part is deciding how to start. Start by making a back-up copy of your files, then address questions such as these:
  • Which files would you miss most if they suddenly disappeared?
  • What qualities about those files are most important – for example, does it matter if the formatting of a word-processing document changes if the text is still readable?
  • Do your digital photos include important descriptive or contextual information that you need to use a particular program to see?

Monday, December 21, 2015

OhioLINK Adopts Ex Libris Rosetta for Digital Preservation

OhioLINK Adopts Ex Libris Rosetta for Digital Preservation. Ex Libris. Press release. December 21, 2015.
     OhioLINK has selected the Ex Libris Rosetta digital management and preservation solution for 120 academic libraries plus the State Library of Ohio. Rosetta will ensure long-term access to the OhioLINK Electronic Journal Collection (EJC), Electronic Book Collection (EBC), Electronic Theses and Dissertations (ETD) Center, and Digital Resource Commons (DRC) collections. OhioLINK sought a preservation system based on the Open Archival Information System (OAIS) reference model that could integrate with its existing content management systems and support a wide range of processing workflows. As a large and complex consortium, OhioLINK required a solution that could be implemented and maintained in a way that suits a wide variety of content.

Friday, December 18, 2015

Digital preservation in 2016: 5 predictions

Digital preservation in 2016: 5 predictions. Jon Tilbury. ItProPortal. December 15, 2015.
     The author presents five trends that he sees in digital preservation:
  1. Old analog media and file formats will continue to become obsolete: Betamax, or "think of the floppy disk, the CD-ROM, Lotus 1-2-3 or WordStar." Digitize content to preserve it against obsolescence.
  2. Critical long-term and permanent digital records will move to the safety of secure and open archival repositories, where records can be "future-proofed for the long-term".
  3. Digital preservation will go mainstream. Cultural organizations have been managing and preserving digital content; many commercial and government organizations now understand the need for long-term digital preservation as well.
  4. Use of the cloud for preserving digital content will continue to increase.
  5. Technology refresh cycles will get faster. The Digital Dark Age debate has helped to move digital preservation to a higher level.

Thursday, December 17, 2015

The Future of the Humanities in a Digital Age

The Future of the Humanities in a Digital Age. SDSU News. December 15, 2015.
    In a preview of a January lecture, Vint Cerf was asked about his comment on a "digital dark age" in which storage formats could become incompatible with future hardware technologies. His response was "I am deeply concerned that people take 'digital preservation' to mean digitizing fixed text and imagery. What I worry about is that this format will prove to be unreliable if the software that interprets it is no longer available. We really need to figure out how to assure that digitized content can be preserved regardless of format."

Wednesday, December 16, 2015

5 Open Source Digital Preservation Tools to Assist Enterprise Archiving

5 Open Source Digital Preservation Tools to Assist Enterprise Archiving. Christopher J. Michael. Paragon Solutions. December 15, 2015.
     General article about digital preservation and some useful tools. "Digital archiving and preservation are needed to ensure the authenticity, integrity, and protection of electronic records despite limited resources and a constant stream of new complex technologies."
  • "Digital preservation is the foundation of enterprise archiving."
  • "Electronic records are archived when they have long-term retention needs in order to fulfil legal, business and regulatory requirements."
  • A digital archive is a repository to store collections of digital objects to provide long-term access to the information.
There are some useful tools to help with the challenges of archiving and obsolescence:
  1. Matchbox: software to identify duplicate images.
  2. DROID: identify and standardize file formats and extract metadata.
  3. Xena (XML Electronic Normalising for Archives): detect the file formats of objects and convert them into open formats.
  4. ePADD: supports the appraisal, ingest, processing, discovery, and delivery of email archives.
  5. Web Curator Tool: a tool for harvesting websites for archiving with descriptive metadata.
A clearly documented digital preservation policy that includes standard file formats and that is followed consistently will help ensure that objects in the archive will be available long term.
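
To make the idea of format identification concrete, the following Python sketch recognizes a handful of common formats from the leading bytes of a file (their "magic numbers"). It is a toy illustration of signature-based identification in general, not the implementation of DROID or Xena; the signature table is a small hand-picked subset and the function name is an assumption.

```python
from typing import Optional

# A few well-known leading byte signatures. Real tools use far larger,
# curated signature registries and also inspect bytes beyond the header.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_format(header: bytes) -> Optional[str]:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


print(identify_format(b"%PDF-1.4\n"))
```

Identifying a format from its content rather than its file extension matters for archives because extensions are easily wrong or missing. Note that a ZIP signature alone does not distinguish a plain ZIP file from formats packaged inside a ZIP container, such as DOCX, ODF, and EPUB.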

Tuesday, December 15, 2015

Building a Digital Preservation Strategy

Building a Digital Preservation Strategy. Edward Pinsent. DART Blog, University of London Computer Centre. 23 November 2015.
     A presentation on how to develop a digital preservation strategy. The blog and the slides included the following points:
  • Start small, and grow the service. Do it in stages
  • You already have knowledge of your collections and users – so build on that
  • Ask why you are doing digital preservation, who will benefit, and what are you preserving
  • Build use cases
  • Determine your own organisational capacity for the task
  • Reasons why metadata matters (intellectual control, manage and document
  • Determine your digital preservation strategies before talking to IT or vendors
The presentation also includes several scenarios that would address digital preservation needs incrementally and meet requirements for different audiences, such as archivists, records managers, and users:
  • Bit-level preservation (access deferred)
  • Emphasis on access and users
  • Emphasis on archival care of digital objects
  • Emphasis on legal compliance
  • Emphasis on income generation

Monday, December 14, 2015

Free OAIS Beginners Course – Update

Free OAIS Beginners Course – Update. Stephanie Taylor. DART Blog, University of London Computer Centre. 9 December 2015.
     An online course ‘A Beginners Guide to the OAIS Reference Model’ was launched in November for those interested in learning more about OAIS. The course remains open and free to anyone interested. "It’s been fantastic to see so much international engagement. We’ve also had a great cross-section of students in many roles from many kinds of organisations, including national memory institutions, higher education, cultural heritage, national and local government departments and the commercial sector." The blog has the link to sign up for the course.

Saturday, December 12, 2015

The FLIF format

The FLIF format. Gary McGath. Mad File Format Science blog. November 25, 2015.
     The post looks at a new image format, FLIF (Free Lossless Image Format), which claims to outcompress other formats for "any kind of image". "It’s still a work in progress, and any new image format faces an uphill battle" against the existing well-established and well-funded formats. More information about the format is available at the FLIF website. The format is said to be "completely royalty-free and it is not encumbered by software patents." There is still work to do on support for additional metadata and color spaces.

Friday, December 11, 2015

oldweb.today website

oldweb.today. Ilya Kreymer. Website. December 10, 2015.
     This is an interesting site that provides emulated versions of various web browsers for browsing historic web sites. The tool Netcapsule, which can be used on the website oldweb.today, is built with open source tools that communicate with web archives. It allows you to browse "old web pages the old way with virtual browsers"; the user can navigate by url and by time. When the page is loaded "the old browser is loaded in an emulator-like setup" that can connect to the archive. Any archive that supports the CDX or Memento protocol interfaces can be a source. Full source code is available on GitHub.
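
For readers unfamiliar with Memento (RFC 7089): a client asks an archive's TimeGate for a page as it existed near a given moment by sending an `Accept-Datetime` request header containing an HTTP-date in GMT. The Python sketch below only builds that header; it is not taken from the oldweb.today source code, the function name is an assumption, and the TimeGate URL to send it to depends on the archive.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def memento_headers(when: datetime) -> Dict[str, str]:
    """Build the request header used for Memento datetime negotiation.

    `when` should be a timezone-aware datetime; it is converted to UTC
    because Accept-Datetime is expressed as an HTTP-date in GMT.
    """
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


headers = memento_headers(datetime(1999, 12, 31, tzinfo=timezone.utc))
print(headers)  # {'Accept-Datetime': 'Fri, 31 Dec 1999 00:00:00 GMT'}
```

The resulting dictionary can be passed as request headers by any HTTP client. A Memento-compliant TimeGate typically answers with a redirect to the archived copy closest to the requested time.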

Thursday, December 10, 2015

The Digital Preservation Network (DPN) Explained

The Digital Preservation Network (DPN) Explained. DuraSpace.org. December 8, 2015.
     The DPN digital preservation service guarantees academic institutions that scholarly resources will survive into the “far-future”. DPN is "the only large-scale digital preservation service that is built to last beyond the life spans of individuals, technological systems, and organizations". Like insurance, DPN provides a guarantee that future access to scholarly resources will be available in the event of any type of change in administrative or physical institutional environments. This is possible by establishing a redundant and varied technical and legal infrastructure at multiple administrative levels. DPN is a scholarly “dark archive” which means that the content stored is not actively used or accessed, but that it can be made available for use at any time from multiple digital storage facilities.

Academic institutions require that key aspects of their scholarly histories, heritage and research remain part of the record of human endeavor. DPN members will begin adding digital assets to the network through DuraCloud Vault, a cooperative development between DPN, DuraSpace and Chronopolis which will serve as the primary ingest point beginning in January.


The digital data revolution: top 5 storage predictions for 2016

The digital data revolution: top 5 storage predictions for 2016. Posted by Ben Rossi, Sourced from Nik Stanbridge, Arkivum. Information Age. December 9, 2015.
    The need for storage and archiving services keeps growing.
  1. Video footage will continue to require a lot of storage. "The requirement will be for very large amounts of highly secure, incorruptible long-term storage."
  2. Momentum will grow for outsourcing. "In-house IT will ‘let go’ and realise that the benefits of outsourcing to specialty archive storage providers will far outweigh concerns about security, access and control. IT will be happy not to have to worry about buying too much storage too early, or being caught short with not enough. They’ll realise that predictable costs and outsourcing resource-intensive headaches like upgrades and system migration will make a lot of sense. The clue is in the name: service. Using a managed service, as in-house IT departments already do for so many other services, will be a burden removed."
  3. Many will still confuse data archiving with data backup.
  4. Scientific needs will outpace storage capacities.
  5. Digital preservation will require ultra-reliable storage. One of the fundamental tenets of digital preservation is that it’s for the long-term.
"With the rise of the Internet of Things, big data and personal data, there will be a huge and fundamental shift. And as organisations start to make things intelligent, this will become a major engine for creating new products and new services."


Wednesday, December 09, 2015

5 storage technologies I'm thankful for

5 storage technologies I'm thankful for.  Robin Harris. Storage Bits. November 27, 2015.
     The "basis of any civilization is the storage of its culture." Previously, culture was stored by physical means: books, art and such. As this changes to digital means, the article mentions several storage technologies that will help preserve culture long term:
  • Data encryption. "Because digital data is easy to copy and share, we need encryption to keep what is ours, ours alone."
  • The thousand year disc. The M-disc is the "only digital media with a lifespan as good as a well produced book."
  • Scale-out object storage. Files that can be easily accessed from multiple servers, such as cloud services.
  • Advanced archive storage. "As we collect and store more information, archiving - not backup - becomes the critical success factor."
  • Solid-state storage. This "has revolutionized mobile device and enterprise storage."

Tuesday, December 08, 2015

Digital Preservation Handbook: revision

Digital Preservation Handbook: revision. Digital Preservation Coalition. October 2015.
     The original version of the handbook was compiled by Neil Beagrie and Maggie Jones; the revised 2nd edition of the Digital Preservation Handbook is being updated and released in stages between October 2015 and March 2016. The contents include:
  • Getting started
  • Organisational activities: Creating digital materials, acquisition and appraisal, retention and review, storage, legacy media, preservation planning, access, and metadata and documentation
  • Technical Solutions and Tools: Fixity and checksums, file formats and standards, information security, cloud services, digital forensics, and persistent identifiers
  • Content-specific preservation: such as e-Journals, moving pictures and sound, and web-archiving
Sections to be added:
  • Digital preservation briefing
  • Institutional strategies
The revisions and additions will make this an even more valuable resource.

Monday, December 07, 2015

When the Technology Changes on You

When the Technology Changes on You.  Maha Bali. The Chronicle of Higher Education. November 9, 2015.
   An article about changing technology, ways to deal with it, and the effort and cost involved. The article was prompted by Twitter changing the favorite icon from stars to hearts. How do we handle unexpected changes to the technology we use for work? Some of the changes encountered include when:
  • a website disappears. The Internet Archive may help, but you may want to have contingency plans
  • a tool changes or loses features. A change in features, such as the Twitter hearts, can mean different things to different people and alter the way they work. "Tools distort what we are trying to express." Changes may require that we discuss the situation with others, which can be beneficial.
  • a tool is down or unavailable. We should ensure that we have alternatives or backups
Some alternative plans may include hosting your own tools so that you can control them if there are changes. However, this does not always work, and there are other costs to consider.

Saturday, December 05, 2015

Digital Curation Coordinator, OhioLINK

Digital Curation Coordinator, OhioLINK. Online posting. December 3, 2015.
     The Digital Curation Coordinator will manage OhioLINK’s implementation of the Rosetta digital preservation platform; assist in developing required policies and procedures related to digital collections and preservation; represent OhioLINK at digital curation events; and interact with OhioLINK members on issues related to digital curation. 

Friday, December 04, 2015

March 2015 PASIG Meeting Presentations and Recent Webinars

March 2015 PASIG Meeting Presentations and Recent Webinars. Preservation and Archiving Special Interest Group (PASIG). March 11-13, 2015.
  Recent presentations and webinars from PASIG and ASIS&T are available on the PASIG site. These include:
  • March 2015 PASIG Meeting Presentations 
  • Tiered Adaptive Storage for Big Data and Supercomputing. Jason Goodman
  • Video Surveillance: Consuming I.T. Capacity At Significant Rates. Jay Jason Bartlett
  • Archive and Preservation for Collections Leveraging Standards Based Technologies and the Cloud. Brian Campanotti
  • What Would an Ideal Digital Preservation Technical Registry Look Like?. Steve Knight and Peter McKinney
  • Three Critical Elements of Long-Term Storage in the Cloud. Amir Kapadia
  • Policy Based Data Management. Reagan Moore
  • Digital Forensics and BitCurator. Christopher (Cal) Lee
  • The Essential Elements of Intelligently Managed Tiered Storage Infrastructures. Raymond Clarke
  • Implementing Sustainable Digital Preservation.
  • How to Access Your Digital Value at Risk:  An Introduction to the Digital Value at Risk.
  • Building Communities and Services in Support of Data-Intensive Research. Stephen Abrams
  • Storage Technology Trends for Archiving. Tom Wultich and Bob Raymond
  • Stewarding Research Data with Fedora and Islandora. Mark Leggott
  • Challenges of Digital Media Preservation in an Active Archive. Karen Cariani, David W. MacCarn
  • An Introduction to the National Digital Information Infrastructure and Preservation Program (NDIIPP) and its Digital Preservation Initiatives. Leslie Johnston
  • Digital Preservation in Theory and Practice:  A Preservation and Archiving Special Interest Group (PASIG) Boot Camp Webinar. Tom Cramer

Thursday, December 03, 2015

The direction of computing is only going in one way: to the cloud

The direction of computing is only going in one way: to the cloud.  Rupert Goodwins. Ars Technica. Nov 14, 2015.
     "The cloud is well on its way to becoming the standard model for IT." The cloud has changed the economics and usability of providing and using services, including the many mobile applications and services.

The most common cloud model is a mix of public cloud and private infrastructure: for convenience called the hybrid cloud. "The increasing use of hybrid cloud tech is a reflection of the economic drivers that pull more and more IT, corporate and consumer, towards the public cloud. The most fundamental driver is good old economy of scale." Because of the scale, "companies save three to four dollars on internal IT for every dollar they spend on shifting infrastructure and services to cloud."

New cloud providers may find it difficult to compete since companies such as Amazon and Google had such a head start: the "biggest challenges have been access to scalable software to build public and private clouds and networking technologies to connect them."
Even so, there have been and still are some big problems with cloud computing: reliability, the safety of your data, and security. "Cloud adoption is highly susceptible to perceptions of trust." But the direction of computing is going towards the cloud. New opportunities are opening up and the constraints of pre-cloud computing are fading away.


Data-driven Decision Making at L. Tom Perry Special Collections

Data-driven Decision Making at L. Tom Perry Special Collections. Ryan Lee, Cory Nimer, Gordon Daines. Society of American Archivists. November 2015.
     Archivists are looking for new ways to identify materials to digitize. This case study looks at data-driven digitization and the decision process.  This study brings Web analytics and in-house use statistics together as a way to make more informed, data-driven decisions. "Digitizing and mounting materials publicly on the internet is a form of publishing, and success in publishing means knowing and targeting viewers." Unique Page Views "provide a sense of general interest, while circulation statistics suggest personal engagement with the materials themselves." These metrics provide a more accurate sense of the usefulness of the collections.

"The findings of this study reflect much of what Peter Hirtle suggests would happen as we digitize more and more of our special collections materials, when he stated that “[e]lectronic access will replace most uses of printed, paper copies, [and]… [t]he use of paper originals will decrease."  The study found that "digitization has often significantly reduced the use of originals in Perry Special Collections".


Wednesday, December 02, 2015

The Irony of Writing Online About Digital Preservation

The Irony of Writing Online About Digital Preservation. Meredith Broussard. The Atlantic. Nov 20, 2015.
     "Last month, The Atlantic published a lengthy article about information that is lost on the web. That story itself is in jeopardy." Article about the difficulties in preserving digital content long term, particularly from news sources that use a variety of methods and software to manage their information.  Some quotes from the article:
  • "There is no guarantee that we will be able to read today’s news on tomorrow’s computers. I’ve been studying news preservation for the past two years, and I can confidently say that most media companies use a preservation strategy that resembles Swiss cheese."
  • "News apps [interactive databases] aren’t being preserved because they are software, and software preservation is a specialized, idiosyncratic pursuit that requires more money and more specialized labor than is available at media organizations today."
  • “The challenges of maintaining digital archives over long periods of time are as much social and institutional as technological,” reads a report from 2003.
  • "When I started my research into news preservation, I thought there would be an easy technological solution. There isn’t. Every media company in the world grapples with the issue of digital archiving."
  • "Remember when Macromedia Flash was the new hot thing in journalism? Most of those elaborate Flash projects have disappeared now. They’re probably archived on Jaz drives in a storage room somewhere, next to boxes of color slides and piles of floppy disks and other outdated media. Future historians will likely lament this loss."
  • "The quantity and variety of information we now produce has outpaced our ability to preserve it for the future. Librarians are the only ones who are making sure that our collective memory is preserved. And they, along with small teams of digital historians elsewhere, are still trying to understand the scope of myriad challenges involved in modern preservation. If today’s born-digital news stories are not automatically put into library storehouses, these stories are unlikely to survive in an accessible way."
  • "The folks at the Internet Archive are thoughtful digital preservationists, and I am grateful every day for their work preserving our collective digital memory." "If I know exactly what web page I am looking for, the Internet Archive is very helpful."
  • "But if I don’t know exactly the web page that I want and exactly the day that the information appeared, I won’t be able to find the information in the Internet Archive."
  • "... we are losing digital history almost as soon as we make it."