This report examines the issues involved in preserving social media. Institutions collecting this type of media need new approaches and methods. The report looks at "preserving social media for long-term access by presenting practical solutions for harvesting and managing the data generated by the interactions of users on web-based networking platforms such as Facebook or Twitter." It does not consider blogs. "Helen Hockx-Yu defines social media as: ‘the collective name given to Internet-based or mobile applications which allow users to form online networks or communities’."
Web 1.0 media can be harvested by web crawlers such as Heritrix; Web 2.0 content, such as that on social media platforms, is more effectively archived through APIs. APIs allow developers to request raw data, content and metadata directly from the platform, all transferred together in formats like JSON or XML. Archiving social media is often an extension of an institution's existing web archiving. Transparency and openness will be important when archiving content.
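To illustrate what it means for content and metadata to arrive together, here is a minimal Python sketch that parses a JSON response and keeps the original payload alongside the extracted fields. The payload, its field names, and the `to_archive_record` helper are all hypothetical and do not reflect any real platform's schema or the report's own method.

```python
import json

# Hypothetical API response. The field names are illustrative only,
# not the schema of Facebook, Twitter, or any other platform.
raw_response = (
    '{"id": "12345", "text": "Example post", '
    '"created_at": "2016-02-01T12:00:00Z", '
    '"user": {"id": "42", "screen_name": "example_user"}, '
    '"reply_to_id": null}'
)


def to_archive_record(payload: str, harvested_at: str) -> dict:
    """Split a JSON payload into content and contextual metadata.

    The unmodified payload is stored too, so nothing the API returned is lost.
    """
    post = json.loads(payload)
    return {
        "content": post["text"],
        "metadata": {
            "post_id": post["id"],
            "author_id": post["user"]["id"],
            "created_at": post["created_at"],
            # Links between posts help preserve the interactive context.
            "reply_to_id": post.get("reply_to_id"),
            # Recorded by the archive, not supplied by the platform.
            "harvested_at": harvested_at,
        },
        "raw": payload,
    }


record = to_archive_record(raw_response, "2016-02-02T00:00:00Z")
```

Keeping the raw payload next to the parsed fields is one possible design choice: it lets an archive re-interpret the data later if its own metadata model changes.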
Maintaining long-term access to social media data faces a number of challenges: working with user-generated content, securing continued access to the data, privacy issues, copyright infringement issues, and preserving the linked, interactive nature of most social media platforms. There is also "the challenge of maintaining the meaning of the social media over time, which means ensuring that an archive contains enough metadata to provide meaningful context." Third-party services and self-archiving services are also available.
Social media is vulnerable to loss. The report quotes one study which looked at "the lifespan of resources shared on social media and found that ‘after the first year of publishing, nearly 11% of shared resources will be lost and after that we will continue to lose 0.02% per day’."
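The quoted rates can be turned into a rough estimate of cumulative loss. The Python sketch below assumes a simple linear reading: 11 percentage points lost by day 365, then a further 0.02 percentage points of the original set each day. That interpretation, the function name, and the refusal to estimate within the first year (the quote gives only the first-year total) are assumptions made for illustration.

```python
FIRST_YEAR_LOSS_PCT = 11.0   # "nearly 11%" lost after the first year
DAILY_LOSS_PCT = 0.02        # further loss per day after the first year
DAYS_IN_YEAR = 365


def estimated_loss_percent(days_since_publication: int) -> float:
    """Estimate the share of shared resources lost, in percent.

    Assumes the quoted rates are linear percentage points of the original
    set. The quote does not describe loss within the first year, so earlier
    dates are rejected rather than guessed.
    """
    if days_since_publication < DAYS_IN_YEAR:
        raise ValueError("the quoted figures only cover one year or more")
    extra_days = days_since_publication - DAYS_IN_YEAR
    loss = FIRST_YEAR_LOSS_PCT + DAILY_LOSS_PCT * extra_days
    # A loss above 100% is meaningless, so cap the estimate.
    return min(loss, 100.0)


two_year_loss = estimated_loss_percent(2 * DAYS_IN_YEAR)
```

Under these assumptions, the estimate after two years is about 18.3% (11 plus 365 × 0.02 = 7.3 percentage points).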
Some other quotes:
- Overall, the capture and preservation of social media data requires adequate context.
- Capturing data, metadata, and documentation may not provide enough context to convey user experiences with these platforms and technologies.
- When considering the big picture, however, the preservation of social media may best be undertaken by a large, centralized provider, or a few large centralized providers, rather than linking smaller datasets or collections from many different institutions.