Email preservation: How hard can it be? Edith Halvarsson.
Digital Preservation at Oxford and Cambridge. 7 July, 2017.
The post summarises highlights of the Digital Preservation Coalition’s briefing on email preservation. What is email? It is "an object, several things and a verb", a heavily linked and complex object, like the web. "Retention decisions must be made, not only about text content but also about email attachments and external web links. In addition, supporting features (such as instant messaging and calendars) are increasingly integrated into email services and potential candidates for capture."
Email is also a cultural and social practice; capturing relationships and structures of communication is an additional layer to preserve.
What is being done, or can be done? Migration is the most common approach to email preservation. The most common target formats are EML and Mbox (the latter is a family of formats), and they take different approaches to storing content. Others choose to unpack content, which provides a way to display emails and normalise the content within them. The emulation approach provides access to content within the original operating environment. In addition, ePADD, an open source tool, provides functions for processing and appraisal of Mbox files, but also has other features.
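As a rough illustration of what a migration between these formats can involve, the sketch below splits an Mbox file (many messages stored in one file) into EML files (one message per file). It uses only Python's standard-library `mailbox` module. The function name `mbox_to_eml` and the output file naming scheme are hypothetical choices made for this example; they are not taken from the briefing or from ePADD, and a real workflow would also need fixity checks and metadata capture.

```python
import mailbox
from pathlib import Path


def mbox_to_eml(mbox_path, out_dir):
    """Write each message in an Mbox file to its own .eml file.

    Hypothetical helper for illustration only. Returns the list of
    paths written, in the order the messages appear in the Mbox file.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []

    # create=False: fail rather than silently create an empty mailbox
    # when the source path is wrong.
    box = mailbox.mbox(str(mbox_path), create=False)
    try:
        for index, key in enumerate(box.keys()):
            # get_bytes() returns the stored message without the
            # Mbox "From " separator line, which EML does not use.
            raw = box.get_bytes(key)
            target = out / f"message_{index:05d}.eml"
            target.write_bytes(raw)
            written.append(target)
    finally:
        box.close()
    return written
```

Calling `mbox_to_eml("inbox.mbox", "inbox_eml")` (both paths are placeholders) would leave the source file untouched and produce numbered `.eml` files alongside it. Copying the stored bytes, rather than parsing and re-serialising each message, is a deliberate choice here so that the trial changes as little of the original content as possible.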
There are still questions and issues to explore, particularly regarding web links. "Email archives may be more valuable to historians as they acquire critical mass". Some things that institutions can do are:
- Participate in the Email Preservation Task Force
- Share your workflows with the Email Preservation Task Force and the community
- Run trial migrations between different email formats such as PST, Mbox and EML and blog about your findings
- Support open source tools such as ePADD and make them sustainable!