PDF/A can serve not only as a file format but also as an OAIS Submission Information Package (SIP) container. The PDF/A open standards can "simplify digitization process, reduce digitization cost, improve production substantially and build more confidence for preservation and access." PDF/A can also be used as an Archival Information Package (AIP) container.
The three main goals of PDF/A are to:
- provide a way to present the appearance of documents independent of the tools and systems used
- provide a framework for recording the context and history of electronic documents in the metadata
- define a framework for representing the logical structure of electronic documents within conforming files
A typical SIP may consist of a directory containing the following information:
- Content:
  - Preservation master files (such as TIFF image files).
  - Access files (such as PDF or JPEG / JPEG 2000 files).
  - Other content (such as OCR data).
- Preservation description:
  - Preservation metadata in the TIFF header.
  - Other structural and technical metadata.
  - Checksum files.
- Packaging information:
  - Directory and file naming, structural metadata.
- Descriptive information:
  - Descriptive metadata saved in a digital management system, catalog, or textual/XML files.
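As a rough illustration of the "checksum files" item above, the following Python sketch walks a SIP directory and writes one SHA-256 manifest at its top level. The function names, the manifest file name `manifest-sha256.txt`, and the choice of SHA-256 are assumptions made for this example; the article does not prescribe a particular algorithm or layout.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(sip_dir, manifest_name: str = "manifest-sha256.txt") -> dict:
    """Checksum every file under sip_dir and write a manifest at its top level.

    The manifest name is a hypothetical convention, not taken from the article.
    Returns a mapping of relative POSIX-style paths to hex digests.
    """
    sip_dir = Path(sip_dir)
    manifest_path = sip_dir / manifest_name
    checksums = {}
    for path in sorted(sip_dir.rglob("*")):
        # Skip directories and the manifest itself so reruns stay stable.
        if path.is_file() and path != manifest_path:
            checksums[path.relative_to(sip_dir).as_posix()] = sha256_of(path)
    lines = [f"{digest}  {name}" for name, digest in checksums.items()]
    manifest_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return checksums
```

Calling `write_manifest("path/to/sip")` on a package directory produces one `digest  relative/path` line per file, which a receiving archive could recompute on ingest to detect corruption in transit.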
Master file formats should be non-proprietary, open, documented international standards that are in common use. The files should be unencrypted, and should be either uncompressed or losslessly compressed. The author of the article recommends PDF/A as the preferred file format for text and image files, and suggests it could also serve as an OAIS SIP container. The author argues that PDF/A is a better file format than the currently preferred TIFF or JPEG 2000 formats.
There are several issues with PDF/A naming and implementation. The most critical need is reliable open-source software for producing and validating PDF/A files.
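To make the validation problem concrete: a PDF/A file declares its conformance level in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties. The sketch below only reads that claim from raw bytes; it does not validate anything, which is exactly the gap a real validator has to fill. The function name is hypothetical, and the code assumes the XMP packet is stored unfiltered and UTF-8 encoded, which may not hold for every file.

```python
import re
from typing import Optional, Tuple

# XMP may express a property as an attribute (pdfaid:part="1")
# or as an element (<pdfaid:part>1</pdfaid:part>); both are matched here.
_PART = re.compile(rb"pdfaid:part\s*(?:=\s*[\"']|>)\s*(\d)")
_CONF = re.compile(rb"pdfaid:conformance\s*(?:=\s*[\"']|>)\s*([A-Za-z])")


def claimed_pdfa_level(data: bytes) -> Optional[Tuple[int, str]]:
    """Return the (part, conformance) a file claims, or None if no claim is found.

    This reports what the file says about itself; it is not a conformance check.
    The conformance letter is an empty string when the property is absent.
    """
    part = _PART.search(data)
    if part is None:
        return None
    conf = _CONF.search(data)
    letter = conf.group(1).decode("ascii").upper() if conf else ""
    return int(part.group(1)), letter
```

For example, `claimed_pdfa_level(open("scan.pdf", "rb").read())` would return something like `(1, "B")` for a file claiming PDF/A-1b. A file can carry this claim and still violate the standard (for instance by omitting embedded fonts), so the result should be treated as a hint for triage rather than as evidence of conformance.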