Thursday, June 18, 2015

File identification tools, part 5: JHOVE

JHOVE is a tool that identifies and validates AIFF, GIF, HTML, JPEG, JPEG2000, PDF, TIFF, WAV, XML, ASCII, and UTF-8 files. A file that matches none of these formats is reported as a “Bytestream.”

Key concepts in JHOVE are “well-formed” and “valid.” A file that is “well-formed but not valid” has errors, but not ones that should prevent rendering. JHOVE focuses on the semantics of a file rather than its content. It reports a profile only when the file fully conforms to it; if the file falls short of a profile, it won’t tell you why.
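The three possible outcomes (not well-formed, well-formed but not valid, well-formed and valid) can be sketched as a pair of booleans. The following is only an illustration, not part of JHOVE: the function names are hypothetical, and it assumes the text report carries a line of the form "Status: Well-Formed and valid", "Status: Well-Formed, but not valid", or "Status: Not well-formed". Check that wording against the output of your own JHOVE version before relying on it.

```python
from typing import Optional, Tuple


def extract_status(report: str) -> Optional[str]:
    """Return the text after 'Status:' in a JHOVE-style text report, or None.

    Assumption: the report has one line beginning with 'Status:'.
    """
    for line in report.splitlines():
        stripped = line.strip()
        if stripped.startswith("Status:"):
            return stripped[len("Status:"):].strip()
    return None


def classify_status(status: str) -> Tuple[bool, bool]:
    """Map a status string to (well_formed, valid).

    Assumption: the status wording matches the three forms named above.
    """
    s = status.strip().lower()
    if s.startswith("not well-formed"):
        return (False, False)
    if s.startswith("well-formed and valid"):
        return (True, True)
    if s.startswith("well-formed"):
        # Well-formed, but with errors that make the file not valid.
        return (True, False)
    raise ValueError(f"unrecognized status: {status!r}")
```

Keeping the two flags separate mirrors the distinction in the text: a file can be well-formed without being valid, but never valid without being well-formed.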

Download JHOVE from the Open Preservation Foundation (OPF); do not download it from SourceForge. Documentation is on the OPF website. A developer's guide is also available: JHOVE Tips for Developers.

JHOVE shouldn't be confused with JHOVE2, which does similar things but has a different code base.

