Repository managers often designate a smaller set of formats to simplify management; the chosen formats vary by institution. Many institutions follow a migration strategy that converts digital objects from the great multiplicity of formats used to create digital materials into a smaller, more manageable number of standard formats that can still encode the complexity of structure and form of the original.
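As a rough illustration of such a migration strategy, the following Python sketch sorts incoming files into those already in an accepted format, those to be migrated, and those needing manual review. The names `PRESERVATION_FORMATS`, `NORMALIZATION_MAP`, and `plan_migration` are hypothetical, and the format pairings are assumptions chosen only for the example; an actual policy would be set by the institution.

```python
from pathlib import Path

# Hypothetical set of formats an institution accepts for long-term storage.
PRESERVATION_FORMATS = {".tif", ".wav", ".pdf", ".txt"}

# Hypothetical normalization map: source extension -> target extension.
# These pairings are illustrative assumptions, not a recommended policy.
NORMALIZATION_MAP = {
    ".bmp": ".tif",
    ".aif": ".wav",
    ".doc": ".pdf",
    ".rtf": ".pdf",
}


def plan_migration(filenames):
    """Sort file names into (keep, migrate, review).

    keep    -- files already in an accepted preservation format
    migrate -- mapping of source file name to planned target file name
    review  -- files with no rule, left for a human decision
    """
    keep, migrate, review = [], {}, []
    for name in filenames:
        ext = Path(name).suffix.lower()
        if ext in PRESERVATION_FORMATS:
            keep.append(name)
        elif ext in NORMALIZATION_MAP:
            target = Path(name).with_suffix(NORMALIZATION_MAP[ext])
            migrate[name] = str(target)
        else:
            review.append(name)
    return keep, migrate, review


keep, migrate, review = plan_migration(
    ["scan.BMP", "master.tif", "memo.doc", "clip.xyz"]
)
```

The sketch only plans the migration; the conversion itself, and any check that the target still encodes the structure and form of the original, would be handled by separate tools.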
Open file formats are generally preferred to closed, proprietary formats because the way they encode content is transparent. On the other hand, the adoption of a proprietary file format by a broad community of content creators, disseminators, and users is often considered a reliable indicator of that format's longevity. Additional qualities such as complexity, the presence of digital rights management controls, and external dependencies are also relevant when assessing file formats for preservation. There is, however, no fail-safe formula for file format policy decisions. The formats most frequently mentioned in preservation policies are listed below.
The five most commonly occurring file formats in all policies:
- Tagged Image File Format (TIFF, TIF) (115),
- Waveform Audio File Format (WAV) (80),
- Portable Document Format (PDF) (74),
- JPEG (JPG, JPEG) (70), and
- Plain text document (TXT, ASC) (69).
- Tagged Image File Format (TIFF, TIF) (88),
- Plain text document (TXT, ASC) (52),
- Portable Document Format (PDF) (49),
- Waveform Audio File Format (WAV) (47), and
- Extensible Markup Language (XML) (47).
- Quicktime (MOV, QT) (47),
- Microsoft Excel (XLS) (39),
- Microsoft Word (DOC) (38),
- Microsoft Powerpoint (PPT) (38), and
- RealAudio (RAM, RA, RM) (35).