Thursday, February 11, 2016

To ZIP or not to ZIP, that is the (web archiving) question

To ZIP or not to ZIP, that is the (web archiving) question. Kristinn Sigurðsson. Kris's blog. January 28, 2016.
     This post looks at the question: should you use uncompressed (W)ARC files? Many files on the Internet are already compressed, and there is "little additional benefit gained from compressing these files again (it may even increase the size very slightly)." For other files, such as text, compression can realize tremendous storage savings, usually reducing files to about 60% of their uncompressed size. Compression has an effect on disk or network access and on memory, but "the additional overhead of compressing a file, as it is written to disk, is trivial."

On the access side, the bottleneck is disk access, but "compression can actually help!" It can save time and money, and performance is barely affected. One exception may be HTTP range requests: to serve a range from a compressed WARC record, the entire payload must be decompressed until the requested item is found. A hybrid approach may be best: "compress everything except files whose content type indicates an already compressed format." This would also avoid a lot of unneeded compression and decompression.
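The hybrid write path described could be sketched in a few lines of Python; the content-type list and function name here are illustrative assumptions, not from the post:

```python
import gzip

# Content types that usually indicate an already compressed payload;
# re-compressing these gains little (illustrative list, not exhaustive).
ALREADY_COMPRESSED = {
    "image/jpeg", "image/png", "image/gif",
    "audio/mpeg", "video/mp4",
    "application/zip", "application/gzip",
}

def store_record(payload: bytes, content_type: str) -> bytes:
    """Gzip text-like payloads; pass already compressed formats through."""
    if content_type.lower() in ALREADY_COMPRESSED:
        return payload
    return gzip.compress(payload)

page = b"<html>" + b"lorem ipsum " * 2000 + b"</html>"
print(len(store_record(page, "text/html")) < len(page))  # True: text shrinks a lot
```

A lookup like this, done once per record at write time, also sidesteps the range-request penalty for the payloads that stay uncompressed.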

Wednesday, February 10, 2016

“High-res audio”

“High-res audio”. Gary McGath. Mad File Format Science Blog. February 8, 2016.
    High-res audio, sound digitized at 192,000 samples per second, is not necessarily better than the usual 44,100. We can hear sounds only in a certain frequency range, generally 20 to 20,000 Hertz.

"The sampling rate of a digital recording determines the highest audio frequency it can capture. To be exact, it needs to be twice the highest audio frequency it records." Delivering playback audio at a higher rate offers no benefit and may introduce problems. Another important aspect of audio is the number of bits per sample, usually 16; high-res audio often offers 24 bits. This "isn’t likely to cause any problems", but it doesn't necessarily provide a benefit. A bigger problem is "over-compressed and otherwise badly processed files". It is important not to skimp on quality.
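The Nyquist relationship McGath describes is simple enough to state in code (a toy illustration; the function name is mine):

```python
def highest_capturable_frequency(sampling_rate_hz: float) -> float:
    """Nyquist: a digital recording captures frequencies up to half
    its sampling rate."""
    return sampling_rate_hz / 2

# CD-quality audio already covers the ~20,000 Hz ceiling of human hearing:
print(highest_capturable_frequency(44_100))   # 22050.0
print(highest_capturable_frequency(192_000))  # 96000.0, far beyond what we can hear
```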

Monday, February 08, 2016

Keep Your Data Safe

Love Your Data Week: Keep Your Data Safe. Scott A. Martin. Bits and Pieces. February 8, 2016.
     The post reflects on a 2013 survey of 360 respondents:
  • 14.2% indicated that a data loss had forced them to re-collect data for a project.  
  • 17.2% indicated that they had lost a file and could not re-collect the data.
If this is indicative of the total population of academic researchers, then a lot of research time and money is lost due to lost data. Some simple guidelines, if included in your own research workflow, can greatly reduce the chances of catastrophic loss:
  1. Follow the 3-2-1 rule for backing up your data: store at least 3 copies of each file (1 working copy and 2 backups), on 2 different storage media, with at least 1 offsite copy 
  2. Perform regular backups
  3. Test your backups periodically
  4. Consider encrypting your backups.  Just make sure that you’ve got a spare copy of your encryption password stored in a secure location!  
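Step 3, testing backups, is often skipped because it seems laborious, but a checksum comparison between the working copy and a backup can be automated. A minimal sketch (the function names are my own, not from the post):

```python
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Stream a file through SHA-256 so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_backup(working_dir: Path, backup_dir: Path) -> list:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for f in working_dir.rglob("*"):
        if f.is_file():
            rel = f.relative_to(working_dir)
            copy = backup_dir / rel
            if not copy.is_file() or sha256(copy) != sha256(f):
                problems.append(str(rel))
    return sorted(problems)
```

Run on a schedule, something like this turns rule 3 from a good intention into a habit.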

New digital preservation solution from Arkivum

New digital preservation solution from Arkivum, shaped to grow with your data. Nik Stanbridge. Arkivum Press release. January 21, 2016.
     Arkivum is launching a new cloud-based digital preservation and archiving service with Artefactual Systems Inc. of Vancouver. "Arkivum/Perpetua is a cost-effective, comprehensive, fully hosted and managed digital preservation and public access solution that uses Archivematica and AtoM (Access to Memory) services in the cloud."

In a survey of archivists and data curators, 87% said "file format preservation and data integrity were important elements to their digital preservation workflow. And a third of respondents stated that they would be using a cloud-based solution for their digital preservation data."

Saturday, February 06, 2016

MRF for large images

MRF for large images. Gary McGath. Mad File Format Science Blog. January 21, 2016.
NASA, Esri speed delivery of cloud-based imagery data. Patrick Marshall. GCN. Jan 20, 2016.
     NASA and Esri are releasing to the public a jointly developed raster file format and a compression algorithm designed to deliver large volumes of image data from cloud storage. The format, called MRF (Meta Raster Format), together with a patented compression algorithm called LERC, can deliver online images ten to fifteen times faster than JPEG2000. The MRF format breaks files into three parts which can be cached separately. The metadata files can be stored locally so users can "examine data on file contents and download the data-heavy portions only when needed". This helps minimize the number of files that are transferred. The compression gives users faster performance and lower storage requirements, and they estimate the cloud storage costs would be about one-third of traditional file-based enterprise storage. An implementation of MRF from NASA is available on GitHub, and an implementation of LERC is on GitHub from Esri.

Friday, February 05, 2016

Developing a Born-Digital Preservation Workflow

Developing a Born-Digital Preservation Workflow. Jack Kearney, Bill Donovan. April 8, 2014.
     Presentation that looks at developing a systematic approach to preserving born-digital collections. The example from Boston College is the Mary O’Hara papers. This was an opportunity for a collaborative project involving the Digital Libraries, Archives, and the Irish Music Center.
Important elements of the workflow:
  • Chain of Custody
  • Digital Forensics
  • Computed initial checksums
  • File/folder names
  • Local Archival Copies, Distributed Digital Preservation
“Digital forensics focuses on the use of hardware and software tools to collect, analyze, interpret, and present information from digital sources, and ensuring that the collected information has not been altered in the process.” The presentation has some specific steps and procedures in ways to not alter the information, including multiple copies, write blockers, and such. In working with external drives, they would build and output an inventory taken with this Unix command:
     find directory-name -type f -exec ls -l {} \; > c:\data\MOH\inventory.txt

Local conventions regarding naming files and folders:
  • Use English alphabet and numbers 0 - 9
  • Avoid punctuation marks other than underscores or hyphens.
  • Do not use spaces.
  • Limit file/folder names to 31 characters, including the 3-character extension; prefer shorter names.
  • Decision: They may remediate folder and file names, but only for the working copies.
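Conventions like these are easy to enforce mechanically when making working copies. A rough sketch of a sanitizer implementing the rules above (my own illustration, not Boston College's actual tooling):

```python
import re

def sanitize_name(name: str, max_len: int = 31) -> str:
    """Apply the conventions above: English letters and digits, underscores
    and hyphens as the only punctuation, no spaces, at most 31 characters
    including the extension."""
    stem, dot, ext = name.rpartition(".")
    if not dot:                                  # no extension present
        stem, ext = name, ""
    stem = re.sub(r"\s+", "_", stem)             # spaces -> underscores
    stem = re.sub(r"[^A-Za-z0-9_-]", "", stem)   # drop other punctuation
    ext = re.sub(r"[^A-Za-z0-9]", "", ext)
    suffix = "." + ext if ext else ""
    return stem[: max_len - len(suffix)] + suffix

print(sanitize_name("My Report: Final (v2).docx"))  # My_Report_Final_v2.docx
```

In line with their decision above, a script like this would only ever be run against the working copies, never the originals.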
They also look for files that need actions taken:
  • Any files off-limits or expendable? (e.g., system files)
  • Personally Identifiable Information (PII)
  • Unsupported Formats (Can normalize using Xena)
They also use a variety of tools, such as FITS and JHOVE.
Important to keep track of digital preservation actions:
  • File migrations
  • Obsolete file formats
  • Proprietary file formats
  • Metadata changes

Wednesday, February 03, 2016

Policy Planning from iPres

Policy Planning from iPres. Alice Prael. Blog. November 5, 2015.
     Report on the Policy and Practice Documentation Clinic at iPres organized by Maureen Pennock and Nancy McGovern. SCAPE has a collection of published preservation policies that she is using to create a policy framework. The Policy and Practice Clinic showed the importance of taking the time to create a policy and not trying to do it all at once. Some notes from the post:
  • Create a Digital Preservation Principles document
  • Include key stakeholders when working with the principles document, early and often
  • Write what digital preservation actions are happening now
  • Start writing a Digital Preservation Plan. Nancy McGovern: A policy is ‘what we do’ and a plan is ‘what we will do.’
  • Create Procedure Documents to show how to follow the principles
  • Have the key stakeholders decide if the procedures are realistic
Some additional notes on the Policy and Practice Clinic:
  • If your institution doesn’t like the word ‘preservation’ then use ‘long term access.’ 
  • Do what is needed to get the buy-in from the stakeholders. 
  • Make the technology enforce the digital preservation policy for you.
  • People are much more likely to perform these preservation tasks if the system doesn’t give them a choice.

Tuesday, February 02, 2016

MDISC Archive Service

MDISC Archive Service. Website. Millenniata. January 29, 2016.
    The MDISC Archival Beta tool is now available. It automatically preserves all your photos and videos, past, present, and future, by engraving them on M-Discs. The service archives files that are in a designated internet service, currently Google Photos, and writes the files to M-Discs, which are then delivered to the owner. [This is a service that I have been trying out for a few months - Chris.]

The free limited-time beta service gives you three months to try out unlimited file archiving on M-Discs. "Ultimate peace-of-mind comes when you can hold your data in your hands.  Your photos can't be lost, corrupted, hacked, or erased, and they'll last forever."

Included on the website is a video that shows people opening digital files they had saved for 8 years or more. One-third of Americans have lost photos and video and don't even know it yet. The automated archival service makes it easier to archive the files.

Monday, February 01, 2016

Preserving and Emulating Digital Art Objects

Preserving and Emulating Digital Art Objects. Oya Rieger, et al. National Endowment for the Humanities White Paper. November 2015, posted December 11, 2015. 202pp. [PDF]
     This white paper describes the media archiving project's findings, discoveries, and challenges. The goal is the creation of a preservation and access practice as well as sustainable, realistic, and cost-efficient service frameworks and policies. The project was looking at new media art but it should also help inform other types of complex born-digital collections. It aims to develop scalable technical frameworks and associated tools to facilitate enduring access to complex, born-digital media objects.

Interactive digital assets are much more complex to preserve and manage than regular digital media files. A single interactive work can include a range of digital objects, dependencies, different types and formats, applications and operating systems.  The artwork can consist of "sound recordings, digital paintings, short video clips, densely layered audiovisual essays that the user navigates and explores with the clicks and movements of a computer mouse. Expansive and complex, the artwork may include many sections, each with its own distinct aesthetic, expressed through rich sound and video quality and intuitive but non-standard modes of interactivity." The interactive and technological nature of these assets poses serious challenges to digital media collections.

About 70 percent of the project artworks could not be accessed at all without using legacy hardware. The project team realized that operating system emulation could be a viable access strategy for those complex digital media holdings.

Project Goals
  1. Identify significant properties needed to preserve and provide access to new media objects.
  2. Define a metadata framework to support capture of technical and descriptive information for preservation and reuse.
  3. Create SIPs that can be ingested into a preservation repository.
  4. Explore resource requirements, staff skills, equipment needs, and associated costs.
  5. Help understand “preservation viability” for complex digital assets
The project team analyzed content to determine classes of material, and set up a digital forensics workstation using BitCurator and the AVPreserve Fixity tool to monitor the stability of directories. The final metadata structure consisted of a combination of MARCXML for the descriptive metadata, Digital Forensics XML (DFXML) for the technical metadata, PREMIS XML for the preservation metadata, and unstructured descriptive files.
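The directory-stability monitoring that tools like AVPreserve's Fixity perform boils down to re-computing checksums against a stored baseline. A simplified sketch of the idea (not the tool's actual implementation):

```python
import hashlib
from pathlib import Path

def checksum_manifest(directory: Path) -> dict:
    """Map each file's relative path to its SHA-256 digest."""
    return {
        str(f.relative_to(directory)): hashlib.sha256(f.read_bytes()).hexdigest()
        for f in sorted(directory.rglob("*")) if f.is_file()
    }

def compare_manifests(baseline: dict, current: dict) -> dict:
    """Report files that changed, disappeared, or newly appeared since baseline."""
    return {
        "changed": sorted(p for p in baseline if p in current and baseline[p] != current[p]),
        "missing": sorted(p for p in baseline if p not in current),
        "new": sorted(p for p in current if p not in baseline),
    }
```

The baseline manifest would be written to disk (e.g. as JSON) and the comparison re-run on a schedule, so any silent change to the holdings surfaces as a report rather than going unnoticed.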

"Emulation seems an excellent and flexible approach to providing fully interactive access to obsolete artworks at very reasonable quality." However there are issues with using emulation as an archival access strategy:
  • emulators must be preserved as well as the artworks.
  • creating archival identities for emulators is difficult and documentation tends to be inconsistent.
  • emulators will eventually become obsolete with new operating systems 
  • new emulators must be created
  • no emulator can provide a fully “authentic” rendering of a software-based artwork.
"The key to digital media preservation is variability, not fixity." It is important to find ways to capture the experience so that future generations can see how the digital artworks were created, experienced, and interpreted.

Artists have increasing access to tools for creating complex art exhibits and objects, but it is "nearly impossible to preserve these works through generations of technology and context changes." Digital curation is more important than ever. Access is the keystone of preservation. The appendices include Emulation Documentation, the Pre-Ingest Work Plan, and Artwork Classifications:
  • Structure of the classifications 
  • Browser-Based Works 
  • Virtual Reality Components 
  • Executables in Works 
  • Macromedia and Related Executables
  • HFS File System

Saturday, January 30, 2016

Digital Preservation: A Technologist's Perspective

Digital Preservation: A Technologist's Perspective. Matthew Addis. DPC conference. Arkivum. 22 January 2016.
     A presentation on the application of technology, what it can do, how to use it, and the importance of using technology as a tool to do preservation. It also looks at what the author wishes he had known before starting digital preservation. One thing is that technology is not the place to start: "digital preservation is primarily about people. It’s about people having the right skills. It’s about people having the right plans. It’s about people working as a team and doing something that’s more than they each could do on their own. Technology helps people do their job and people are the place to start."

It doesn't always help to look at what the large institutions are doing; they have more people and money to build complex digital preservation programs. Sometimes this turns into "preservation paralysis". Some quotes:
  • If you think that you’re not able to do enough or ‘do it properly’, then this can result in doing nothing because this feels like the next best thing. 
  • But doing nothing is the worst thing you can do. Delays cause digital data to become derelict. Neglect has serious consequences in the digital world – it’s not benign. A decision to do nothing or to delay action can be the equivalent of a digital death sentence. Or, if nothing else, it just increases the cost.
  • In the end, it’s people that are the biggest risk to digital content surviving into the future. People thinking that preservation is too hard, too expensive or tomorrow’s problem and not today’s.
  • "Digital preservation is an opportunity." It allows the content to be used by others and to become an asset. 
A frugal and minimal strategy to start with could be:
  1. Start with knowing in detail what digital content you have. 
  2. Decide what is important and store it in a safe place. The article uses the 3-2-1 rule: at least three copies in three separate locations with two online and one offline.
  3. Build a business case to get funding for preservation. If you don't have a budget, you can't take care of the content.  
  4. From there you can decide what else to do.
There are lots of tools, technology and guidance available. The best thing is to get started and not wait. Realize that digital preservation is an ongoing activity and doesn’t stop.

Tuesday, January 26, 2016

Cloud-supported preservation of digital papers: A solution for special collections?

Cloud-supported preservation of digital papers: A solution for special collections? Dirk Weisbrod. Liber Quarterly. January 2016.
     A problem for Special Collections is that in many cases digital media have replaced paper for many writers. Digital papers are "difficult to process using established digital preservation strategies, because of their individual and unique nature". The article suggests that document creators should be involved in the preservation process, and that special collections should look at the cloud as a way to solve the problem.

The relatively short durability of digital media is in contrast to the durability of paper. An example in the article shows that data were lost from an Atari computer after a short period of time. Both paper and digital media can be destroyed or damaged, but the potential loss of digital media is much higher since there are many software and hardware components that can fail. The computer skills of the writers can also influence the degree of preservation of the personal digital documents. "To minimize those risks is the task of digital preservation".

A writer’s archive of digital objects (documents, email correspondence, texts, photos, and such) may be scattered over a variety of social networks and web services. This will affect the acquisition of the content by an archive, which would have the problem of identifying and acquiring the digital objects, including accessing the online services, which may be password-protected.

Archives and special collections need to manage these processes for digital preservation and develop a preservation strategy that "matches with the characteristics of digital papers". This needs to change from a “custodial” to a “pre-custodial” view and work with writers and their lifelong personal archives. Writers should contribute to the digital preservation of their own works. Some approaches to consider:
  • Regular captures of the creators’ digital data by preservation specialists to be transferred directly into a managed digital repository.
  • The periodic transfer of data from old hardware and media to a special collection.
  • Have preservation specialists help writers maintain their digital materials
  • Provide self archiving of email archives
  • IT-supported self-archiving and automated data transfers. The solution could include services such as
    • email archive, like Mailbox
    • data storage, like Dropbox 
    • website hosting
These approaches could help solve the problem of ongoing archiving while the original objects remain on the creator’s computer and continue to be updated. Another potential problem if writers use cloud services is that accounts may be cancelled if inactive. Archives and Special Collections should consider the cloud not as a problem but as an opportunity to work with authors. "By establishing a cloud, special collections get an instrument that provides writers with a reasonable working environment and, at the same time, enables the preservation of their personal digital archives. The time span between an object’s creation and its preservation, this critical factor of digital preservation, reduces to a minimum."

Monday, January 25, 2016

Figshare Joins the Digital Preservation Network

Figshare Joins the Digital Preservation Network to ensure survival, ownership and management of research data into the future. Carol Minton Morris.  DuraSpace. January 20, 2016.
     "Figshare, a platform that supports the management of research content, is the first research data repository to join the DPN Federation". The research data from Figshare will be deposited in DPN through the DuraSpace DuraCloud Vault node and this will provide long-term access to scholarly resources.

Saturday, January 23, 2016

Exactly: A New Tool for Digital File Acquisitions

Exactly: A New Tool for Digital File Acquisitions. AVPreserve News. January 13, 2016.
     A new tool, Exactly, has been developed to help to acquire born digital content from donors and to start establishing provenance and fixity early in the acquisition process. The tool:
  • can remotely and safely transfer any born-digital material from a sender to a recipient 
  • uses the BagIt File Packaging Format
  • supports FTP and network transfers
  • can be integrated into sharing workflows using Dropbox or Google Drive
  • metadata templates can be created for the sender to fill out before submission
  • can send email notifications with transfer data and manifests when files have been delivered 
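The BagIt layout Exactly relies on is deliberately simple: payload files under data/ plus a checksum manifest. A stripped-down sketch of creating and validating such a bag (a simplification of the spec, not Exactly's own code):

```python
import hashlib
import shutil
from pathlib import Path

def make_bag(source: Path, bag_dir: Path) -> None:
    """Copy payload files under data/ and write a manifest-sha256.txt
    and bagit.txt declaration, following the BagIt layout (simplified)."""
    data = bag_dir / "data"
    data.mkdir(parents=True)
    lines = []
    for f in sorted(source.rglob("*")):
        if f.is_file():
            rel = f.relative_to(source)
            dest = data / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest)
            digest = hashlib.sha256(dest.read_bytes()).hexdigest()
            lines.append(f"{digest}  data/{rel.as_posix()}")
    (bag_dir / "manifest-sha256.txt").write_text("\n".join(lines) + "\n")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n")

def validate_bag(bag_dir: Path) -> bool:
    """Re-hash every payload file and compare against the manifest."""
    for line in (bag_dir / "manifest-sha256.txt").read_text().splitlines():
        if not line.strip():
            continue
        digest, rel = line.split("  ", 1)
        target = bag_dir / rel
        if not target.is_file() or hashlib.sha256(target.read_bytes()).hexdigest() != digest:
            return False
    return True
```

Because the manifest travels with the payload, the recipient can re-validate the bag on arrival, which is exactly how fixity gets established early in the acquisition process.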

Monday, January 18, 2016

Exploring the potential of Information Encapsulation techniques

Exploring the potential of Information Encapsulation techniques. Anna Eggers. PERICLES Blog. 30 November 2015.
     Information Encapsulation is the aggregation of information that belongs together and can be implemented at different states of the information life cycle. For Digital Preservation this usually means pairing a digital object with its metadata. The PeriCAT open-source tool provides encapsulation techniques and mechanisms that help ensure the information remains accessible even if the digital object leaves its creation environment. The tool supports the creation of self-describing objects and the long-term reusability of information.

The two main categories are Information Embedding and Packaging. Packaging refers to the aggregation of information entities, such as files or streams, stored as equals in an information container. Information embedding, in contrast, needs a carrier information entity in which the payload information will be embedded.

Packaging techniques add files containing the information into simple archive packages such as BagIt, zip, and tar. Metadata files, such as METS and OAI-ORE, can be added to the archive packages. This ensures that the packaged objects can be restored so that the restored objects are identical to the originals and can be verified by a checksum.
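That round trip, package an object with its metadata, restore it, and verify it is bit-identical, can be illustrated with a zip container and a checksum recorded in a metadata sidecar (an illustrative sketch, not PeriCAT's implementation):

```python
import hashlib
import json
import zipfile
from pathlib import Path

def package(obj_path: Path, metadata: dict, container: Path) -> str:
    """Store the object and a metadata sidecar together in one zip
    container; record the object's SHA-256 for later verification."""
    digest = hashlib.sha256(obj_path.read_bytes()).hexdigest()
    metadata = dict(metadata, sha256=digest)
    with zipfile.ZipFile(container, "w") as z:
        z.write(obj_path, obj_path.name)
        z.writestr(obj_path.name + ".metadata.json", json.dumps(metadata))
    return digest

def restore_and_verify(container: Path, name: str, out_dir: Path) -> bool:
    """Extract the object and confirm it is bit-identical to the original."""
    with zipfile.ZipFile(container) as z:
        z.extract(name, out_dir)
        meta = json.loads(z.read(name + ".metadata.json"))
    return hashlib.sha256((out_dir / name).read_bytes()).hexdigest() == meta["sha256"]
```

The object and its metadata travel as equal entities in one container, which is the defining property of the packaging approach as opposed to embedding.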

Embedding techniques distinguish between the carrier information, which is the format of the item, and the message information, which is embedded into the object itself. These include digital watermarking, steganography (hiding messages), and attaching files or text to objects.

PeriCAT (PERICLES Content Aggregation Tool) is a framework that allows the encapsulation and decapsulation of information. (Decapsulation is the process to separate encapsulated entities from each other.) Each of the techniques have different features; the technique to be used should be chosen based on the specified requirements.


Friday, January 15, 2016

DOTS: Almost a datalith

DOTS: Almost a datalith. Gary McGath. Mad File Format Science. December 29, 2015.
     "The notion that archivists will replace outdated digital media every decade or two through the centuries is a pipe dream. Records have always gone through periods of neglect, and they will in the future. Periods of unrest will happen; authorities will try to suppress inconvenient history; groups like Daesh will set out to destroy everything that doesn’t match their worldview; natural disasters will disrupt archiving." DOTS (Digital Optical Technology System), which is burned onto tape, can store digital images in any format and also allows them to be recorded as a visual representation. DOTS encodes data physically on an archival tape coated in a phase-change alloy, which is resistant to temperature extremes, electromagnetic pulses, and other common environmental hazards. The data, which may include words, images, and digital information, is written using a laser that changes the alloy’s index of refraction. "It’s essential to have something like this for reliable long-term data archives. The people who think data will reliably be passed evermore from curator to curator are pleasantly optimistic, but history has never worked that way."

Thursday, January 14, 2016

Files on nearly 200 floppy disks belonging to Star Trek creator recovered

Files on nearly 200 floppy disks belonging to Star Trek creator recovered. Megan Geuss.  Ars Technica.  Jan 4, 2016.
     Information from nearly 200 floppy disks that belonged to Gene Roddenberry has been recovered. Roddenberry, the Star Trek creator, used the 160KB disks in the 1980s to store his work and "to capture story ideas, write scripts and notes." The 5.25-inch floppy disks, found several years after Roddenberry's death, were used with two custom computers and a custom-built OS. The computers were no longer available for recovery use, so the floppies were sent to DriveSavers, which wrote software that could read the disks.

Wednesday, January 13, 2016

Now What You Put on the Internet Really Could Last Forever

Now What You Put on the Internet Really Could Last Forever. Ryan Steadman. Observer Culture. January 5, 2016.
     Digital art institution Rhizome has won a two-year grant from the Andrew W. Mellon Foundation to continue development of Webrecorder, a newly developed archiving tool for the web. The tool, which will be free to the public, provides the ability to capture and play back dynamic web content and thus improve digital social memory. An open source version of Webrecorder is already available, and users are invited to build their own archives. However, "further development is needed to make it into the comprehensive archive Rhizome would like to build."


Monday, January 11, 2016

Digital Preservation Decision Form

Digital Preservation Decision Form. Chris Erickson. Harold B. Lee Library. January 11, 2016. Updated.   
     This is the production version of our Digital Preservation Decision Form with the Instructions for completing the form. The form is used by subject specialists (curators, subject librarians, or faculty responsible for collections) to determine:
  • which materials should be included in our Rosetta Digital Archive; 
  • who can access content in the Digital Archive;
  • preservation metadata and updates;
  • preservation actions needed;
  • direction on format migration options;
  • whether or not the digital collection is a high preservation risk. 
Standard practices include creating three preservation copies: one in Rosetta, one on tape stored in the granite vaults, and one on M-Disc. The form was created to help subject specialists determine what should be preserved, even if they are unaware of digital preservation procedures and practices. In practice, we complete the form during an interview with new subject specialists. This version includes instructions to help complete the form.

Friday, January 08, 2016

World bought 143 exabytes of storage in Q3, mostly spinning rust

World bought 143 exabytes of storage in Q3, mostly spinning rust. Simon Sharwood. The Register. 4 Dec 2015.
The world bought 143 exabytes of storage in 2015's third quarter, which suggests about 500 exabytes for the year. Solid state disks accounted for 26.22 million drives. The article asks: how much of it is properly backed up?

Thursday, January 07, 2016

Digital Preservation: A Planning Guide for the Five Colleges

Digital Preservation: A Planning Guide for the Five Colleges. Five Colleges Consortium website. 2014. [PDF]
     This Digital Preservation Planning Guide is designed to help institutions that are starting their digital preservation activities. The first part of the Guide is a checklist of the six essential action items for starting a digital preservation program:
  1. Create a digital preservation policy
  2. Identify and document workflows, standards, and best practices
  3. Identify and document short-term data security practices
  4. Manage digital objects
  5. Identify and capture metadata necessary for preservation
  6. Develop a migration plan
The second part provides "explanations, examples, and advice for completing the six action items" to help with the planning and to find common ground for potential future collaboration.

Create a digital preservation policy which will help achieve several fundamental digital preservation goals:
  • Define digital preservation and the scope of preservation efforts.
  • Get administrator buy-in, which becomes a tool for ensuring institutional support and program sustainability.
  • Encourage the institution to review current digital programs and define the scope of future efforts.
General steps for creating the policy:
  • Advocate for the necessity of a digital preservation policy
  • Organize all stakeholders to form a policy committee
  • Develop the process for approving the policy
  • Review example policies
  • Draft the policy
  • Solicit feedback from the campus community
  • Schedule periodic review
The Guide breaks down the remaining action items in the same way; for example, "Manage digital objects" includes:
  • Identify stored digital objects and their components
  • Determine which components should be preserved
  • Ensure the integrity of digital objects regardless of the system used to store or provide access

Tuesday, December 29, 2015

Storage For The Next 5,000 Years

Storage For The Next 5,000 Years. Tom Coughlin. Forbes. Dec 15, 2015.
     We are creating as much information annually as mankind generated from the beginning of civilization until a few years ago. Some of the data is temporary, while other data has longer-term value and may be useful in the future. As we generate and save more data, the question is whether we can actually keep it for the long term in the face of hardware or format obsolescence. "But even if data is transferred from older formats/media to modern formats regularly, natural processes driven mostly by thermal energy can destroy data over time. The longer the data is kept the greater the chance of data corruption".

Keeping data for a long time can be expensive and requires management and multiple copies of data on different hardware. While large organizations with valuable content can afford to protect their data, smaller organizations or consumers will find it difficult, though one option may be to move the data to managed cloud storage data centers where it can be looked after by professionals. "Carrying data into the far future will require careful management of data to support multiple copies and continuous detection and elimination of data corruption". Online archives may be able to provide access to archived data.

Wednesday, December 23, 2015

Personal Digital Archiving

Personal Digital Archiving.  Gabriela Redwine. DPC Technology Watch Report 15-01. December 2015. [PDF]
     This excellent report looks at some of the key challenges people face in managing and storing their digital files. It "stresses the importance of preserving personal files" that include physical, digitized and born-digital materials. The term ‘personal digital archiving’ or ‘Save your digital stuff!’ refers to how people keep track of their digital files, where they store them, and how the files are described and organized.

The report reviews the archiving issues and offers guidance and resources to help individuals be proactive and save their digital content. It also argues for the "importance and urgency of preserving personal files while also acknowledging the difficulty of managing digital media and files". Personal items increasingly exist only in digital format. "This brings a new understanding of what letters, photos and other sources look like in the digital age, and raises important questions about how to manage these personal items today and how to preserve them for future generations."

"Thinking of a personal collection of digital files as ‘archives’ places emphasis on the larger context within which those digital files exist. The records of people’s lives are intrinsically important and worth preserving." Social media archiving necessarily requires a considerable investment of resources, so it is important to choose which social media services should be archived. Some key threats to a personal digital archive:
  • old hardware and software
  • lack of secure storage and backup 
  • natural and man-made disasters 
  • neglect of content
  • loss of cloud-based host or service provider
  • lack of planning
  • death of an individual
The report lists recommendations (quick wins, more effort, maximum effort) for the threats listed. Some of these are:

Recommendations: addressing key threats to personal digital files
  • Choose software that is well supported and creates files that can be read by a variety of different programs.
  • Develop file naming conventions that are easy to remember and apply these consistently.
  • Create multiple back-up copies and store them in different geographical locations.
  • Test your back-up copies to make sure they are accessible and contain what you intend them to.
  • Transfer files to new media every 2 to 4 years.
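The advice to test back-up copies can be partly automated by comparing the original and backup directory trees. A minimal sketch using Python's standard library (the report format is illustrative, not from the report itself):

```python
import filecmp
from pathlib import Path

def backup_differences(original: Path, backup: Path) -> list:
    """Report files missing from the backup or differing in content.

    filecmp.dircmp falls back to a content comparison when file
    stats differ, so copies with identical bytes compare equal.
    """
    cmp = filecmp.dircmp(original, backup)
    problems = [f"missing: {name}" for name in cmp.left_only]
    problems += [f"differs: {name}" for name in cmp.diff_files]
    return problems
```

An empty result means every file in the original is present and identical in the backup (subdirectories would need a recursive walk, omitted here for brevity).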

Recommendations: taking good care of a personal digital archive
  • Choose high-quality storage media and refresh it regularly.
  • Be proactive about refreshing storage media, replacing outdated equipment before it fails, and not relying exclusively on one service provider or storage solution.
  • Follow best practice when naming files.
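The file-naming advice above can be made concrete with a small helper. The convention below (ISO date prefix plus a lowercase ASCII slug, so names sort chronologically and survive moves between filesystems) is an illustration, not a rule from the report:

```python
import re
from datetime import date

def archive_name(title: str, created: date, ext: str) -> str:
    """Build a portable, sortable file name: YYYY-MM-DD_slug.ext."""
    # Collapse every run of non-alphanumeric characters to a hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{created.isoformat()}_{slug}.{ext.lstrip('.').lower()}"
```

For example, `archive_name("Family Photos: Summer!", date(2015, 7, 4), ".JPG")` yields `2015-07-04_family-photos-summer.jpg`.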

With digital preservation, and especially with creating and maintaining a personal digital archive, the hardest part is deciding how to start. Begin by making a back-up copy of your files, then address questions such as these:
  • Which files would you miss most if they suddenly disappeared?
  • What qualities about those files are most important – for example, does it matter if the formatting of a word-processing document changes if the text is still readable?
  • Do your digital photos include important descriptive or contextual information that you need to use a particular program to see?

Monday, December 21, 2015

OhioLINK Adopts Ex Libris Rosetta for Digital Preservation

OhioLINK Adopts Ex Libris Rosetta for Digital Preservation. Ex Libris. Press release. December 21, 2015.
     OhioLINK has selected the Ex Libris Rosetta digital management and preservation solution for 120 academic libraries plus the State Library of Ohio. Rosetta will ensure long-term access to the OhioLINK Electronic Journal Collection (EJC), Electronic Book Collection (EBC), Electronic Theses and Dissertations (ETD) Center, and Digital Resource Commons (DRC) collections. OhioLINK sought a preservation system based on the Open Archival Information System (OAIS) reference model that could integrate with its existing content management systems and support a wide range of processing workflows. As a large and complex consortium, OhioLINK required a solution that could be implemented and maintained in a way that suits a wide variety of content.

Friday, December 18, 2015

Digital preservation in 2016: 5 predictions

Digital preservation in 2016: 5 predictions. Jon Tilbury. ItProPortal. December 15, 2015.
     The article presents five trends that he sees in digital preservation from his point of view:
  1. Old analog media and file formats will continue to become obsolete. Like Betamax, "think of the floppy disk, the CD-ROM, Lotus 1-2-3 or WordStar." Digitizing content helps preserve it against obsolescence.
  2. Moving critical long-term and permanent digital records to the safety of secure and open archival repositories, where records can be "future-proofed for the long-term".
  3. Digital preservation will go mainstream. Cultural organizations have long been managing and preserving digital content; now many commercial and government organizations are understanding the need for long-term digital preservation.
  4. Use of the cloud for preserving digital content will continue to increase.
  5. Technology refresh cycles will get faster. The Digital Dark Age debate has helped to move digital preservation to a higher level.

Thursday, December 17, 2015

The Future of the Humanities in a Digital Age

The Future of the Humanities in a Digital Age. SDSU News. December 15, 2015.
    In a preview of a January lecture, Vint Cerf was asked about his warning of a "digital dark age", in which storage formats could become incompatible with future hardware technologies. His response was: "I am deeply concerned that people take 'digital preservation' to mean digitizing fixed text and imagery. What I worry about is that this format will prove to be unreliable if the software that interprets it is no longer available. We really need to figure out how to assure that digitized content can be preserved regardless of format."

Wednesday, December 16, 2015

5 Open Source Digital Preservation Tools to Assist Enterprise Archiving

5 Open Source Digital Preservation Tools to Assist Enterprise Archiving. Christopher J. Michael. Paragon Solutions. December 15, 2015.
     A general article about digital preservation and some useful tools. "Digital archiving and preservation are needed to ensure the authenticity, integrity, and protection of electronic records despite limited resources and a constant stream of new complex technologies."
  • "Digital preservation is the foundation of enterprise archiving."
  • "Electronic records are archived when they have long-term retention needs in order to fulfil legal, business and regulatory requirements."
  • A digital archive is a repository to store collections of digital objects to provide long-term access to the information.
There are some useful tools to help with the challenges of archiving and obsolescence:
  1. Matchbox: software to identify duplicate images.
  2. DROID: identifies file formats and extracts metadata.
  3. Xena (XML Electronic Normalising for Archives): detects the file formats of objects and converts them into open formats.
  4. ePADD: supports the appraisal, ingest, processing, discovery, and delivery of email archives.
  5. Web Curator Tool: a tool for harvesting websites for archiving with descriptive metadata.
A clearly documented digital preservation policy that includes standard file formats and that is followed consistently will help ensure that objects in the archive will be available long term.
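Format-identification tools like DROID work by matching files against registered byte signatures (DROID draws on the PRONOM registry). A toy illustration of the idea, with a hand-picked signature table standing in for the thousands of real registry entries:

```python
# A toy signature table: real tools such as DROID use the PRONOM
# registry, which holds thousands of signatures; these few are
# included only for illustration.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"PK\x03\x04": "ZIP container (also DOCX, EPUB, ...)",
    b"GIF89a": "GIF image",
}

def identify(data: bytes) -> str:
    """Match the file's leading bytes against known signatures."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"
```

Real identification is more involved (internal signatures, container inspection, version disambiguation), but the leading-bytes match is the core mechanism.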

Tuesday, December 15, 2015

Building a Digital Preservation Strategy

Building a Digital Preservation Strategy. Edward Pinsent. DART Blog, University of London Computer Centre. 23 November 2015.
     A presentation on how to develop a digital preservation strategy. The blog and the slides included the following points:
  • Start small, and grow the service. Do it in stages
  • You already have knowledge of your collections and users – so build on that
  • Ask why you are doing digital preservation, who will benefit, and what are you preserving
  • Build use cases
  • Determine your own organisational capacity for the task
  • Reasons why metadata matters (intellectual control, management and documentation)
  • Determine your digital preservation strategies before talking to IT or vendors
The presentation also includes several scenarios that address digital preservation needs incrementally and meet requirements for different audiences, such as archivists, records managers, and users:
  • Bit-level preservation (access deferred)
  • Emphasis on access and users
  • Emphasis on archival care of digital objects
  • Emphasis on legal compliance
  • Emphasis on income generation

Monday, December 14, 2015

Free OAIS Beginners Course – Update

Free OAIS Beginners Course – Update. Stephanie Taylor. DART Blog, University of London Computer Centre. 9 December 2015.
     An online course ‘A Beginners Guide to the OAIS Reference Model’ was launched in November for those interested in learning more about OAIS. The course remains open and free to anyone interested. "It’s been fantastic to see so much international engagement. We’ve also had a great cross-section of students in many roles from many kinds of organisations, including national memory institutions, higher education, cultural heritage, national and local government departments and the commercial sector." The blog has the link to sign up for the course.

Saturday, December 12, 2015

The FLIF format

The FLIF format. Gary McGath. Mad File Format Science blog. November 25, 2015.
     The post looks at a new image format, FLIF (Free Lossless Image Format), which claims to out-compress other formats for "any kind of image". "It’s still a work in progress, and any new image format faces an uphill battle" against the existing well-established and well-funded formats. More information about the format is available at the FLIF website. The format is said to be "completely royalty-free and it is not encumbered by software patents." There is still work to do on support for additional metadata and color spaces.

Friday, December 11, 2015 Ilya Kreymer. Website. December 10, 2015.
     This is an interesting site that provides emulated web browsers for viewing historic websites. The tool, Netcapsule, is built with open source components that communicate with web archives. It lets you browse "old web pages the old way with virtual browsers"; the user can navigate by URL and by time. When a page is loaded, "the old browser is loaded in an emulator-like setup" that connects to the archive. Any archive that supports the CDX or Memento protocol interfaces can be a source. Full source code is available on GitHub.

Thursday, December 10, 2015

The Digital Preservation Network (DPN) Explained

The Digital Preservation Network (DPN) Explained. December 8, 2015.
     The DPN digital preservation service guarantees academic institutions that scholarly resources will survive into the “far-future”. DPN is "the only large-scale digital preservation service that is built to last beyond the life spans of individuals, technological systems, and organizations". Like insurance, DPN provides a guarantee that future access to scholarly resources will be available in the event of any type of change in administrative or physical institutional environments. This is made possible by establishing a redundant and varied technical and legal infrastructure at multiple administrative levels. DPN is a scholarly “dark archive”, which means that the content stored is not actively used or accessed, but that it can be made available for use at any time from multiple digital storage facilities.

Academic institutions require that key aspects of their scholarly histories, heritage and research remain part of the record of human endeavor. DPN members will begin adding digital assets to the network through DuraCloud Vault, a cooperative development between DPN, DuraSpace and Chronopolis which will serve as the primary ingest point beginning in January.

The digital data revolution: top 5 storage predictions for 2016

The digital data revolution: top 5 storage predictions for 2016. Posted by Ben Rossi, Sourced from Nik Stanbridge, Arkivum. Information Age. December 9, 2015.
    The need for storage and archiving services keeps growing.
  1. Video footage will continue to require a lot of storage. "The requirement will be for very large amounts of highly secure, incorruptible long-term storage."
  2. Momentum will grow for outsourcing. "In-house IT will ‘let go’ and realise that the benefits of outsourcing to specialty archive storage providers will far outweigh concerns about security, access and control. IT will be happy not to have to worry about buying too much storage too early, or being caught short with not enough. They’ll realise that predictable costs and outsourcing resource-intensive headaches like upgrades and system migration will make a lot of sense. The clue is in the name: service. Using a managed service, as in-house IT departments already do for so many other services, will be a burden removed. "
  3. Many will still confuse data archiving with data backup.
  4. Scientific needs will outpace storage capacities.
  5. Digital preservation will require ultra-reliable storage. One of the fundamental tenets of digital preservation is that it’s for the long term.
"With the rise of the Internet of Things, big data and personal data, there will be a huge and fundamental shift. And as organisations start to make things intelligent, this will become a major engine for creating new products and new services."
