The article recommends that content authors use robots.txt rules and noarchive HTTP response headers to keep sensitive information out of the archive. Accidentally archiving sensitive information can result in the loss of the other mementos stored in the same WARC file. Recommendations include:
- Use smaller storage devices to limit the damage if sensitive information is crawled;
- Develop a way to remove a sensitive memento from a WARC file;
- Identify high-risk vs. low-risk archival targets within the Intranet.
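
The robots.txt and noarchive signals mentioned above could be honored by a crawler-side check before a page is written to a WARC. The following is a minimal sketch, not the article's implementation: the function names, the `archive-bot` user agent, and the `intranet.example` URLs are hypothetical, and it assumes the noarchive directive arrives in an `X-Robots-Tag` response header as a plain comma-separated list (agent-scoped values such as `somebot: noarchive` are not handled).

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def has_noarchive(headers: dict) -> bool:
    """Return True if an X-Robots-Tag header lists the noarchive directive."""
    for name, value in headers.items():
        # Header names are case-insensitive, so compare in lower case.
        if name.lower() == "x-robots-tag":
            directives = [d.strip().lower() for d in value.split(",")]
            if "noarchive" in directives:
                return True
    return False


def may_archive(robots_txt: str, user_agent: str, url: str, headers: dict) -> bool:
    """Archive only when robots.txt allows the fetch and noarchive is absent."""
    return allowed_by_robots(robots_txt, user_agent, url) and not has_noarchive(headers)


# Hypothetical Intranet rules: everything under /hr/ is off limits.
ROBOTS_TXT = "User-agent: *\nDisallow: /hr/\n"

print(may_archive(ROBOTS_TXT, "archive-bot",
                  "http://intranet.example/news/index.html",
                  {"Content-Type": "text/html"}))
print(may_archive(ROBOTS_TXT, "archive-bot",
                  "http://intranet.example/hr/salaries.html",
                  {"Content-Type": "text/html"}))
```

Checking both signals before a record is written keeps the filter on the crawl side, which matters here because removing a memento after it has been stored in a WARC is listed above as an open problem.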
The case study and the proposed next steps will help archive corporate memory, improve information longevity, and help corporate archivists implement web archiving strategies.