The File Discovery Tool. Chris Erickson. Brigham Young University. November 29, 2018.
We have created a File Discovery Tool that analyzes directories of objects and prepares a spreadsheet of all the files it discovers for preservation/ingest. This file allows the curators to discover and work with the materials, select those that need to be preserved, and then add collection and other metadata information. The tool fits our workflow, but the source code may be useful for others trying to accomplish a similar task.
A sample command to run the tool:
>> java -jar FileDiscovery.jar [path name of files to check] [output path name for saving the report]
For example:
>> java -jar C:\FileDiscovery\FileDiscovery.jar "R:\test\objects" C:\output\files
The spreadsheet that is created has the following column headings:
FILENAME, ITEM ID, FILEPATH, BYTESIZE, SIZE, COLLECTION, IE_LEVEL, DATE_CREATED, DATE_MODIFIED, TITLE, CREATOR, DESCRIPTION, RIGHTS_POLICY
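For readers who want to try something similar before looking at the source, the following is a minimal sketch of the general idea: walk a directory tree and write one CSV row per file. It is not the File Discovery Tool's actual code. The class name FileListingSketch, the helper names, the human-readable format used for SIZE, and the choice to treat the second argument as the report file path are all assumptions made for illustration. Only the columns that can be read from the file system are filled in; the curator-supplied columns (COLLECTION, TITLE, and so on) are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative sketch only; not the File Discovery Tool's implementation.
class FileListingSketch {

    // Subset of the spreadsheet columns that can be derived from the file system.
    static final String HEADER =
            "FILENAME,FILEPATH,BYTESIZE,SIZE,DATE_CREATED,DATE_MODIFIED";

    // Render a byte count in a human-readable form (assumed format for SIZE).
    static String humanSize(long bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return unit == 0
                ? bytes + " B"
                : String.format(Locale.ROOT, "%.1f %s", value, units[unit]);
    }

    // Quote a CSV field when it contains a comma, a quote, or a line break.
    static String csv(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }

    // Walk the tree under root and return the header plus one row per regular file.
    static List<String> listFiles(Path root) throws IOException {
        List<String> rows = new ArrayList<>();
        rows.add(HEADER);
        try (Stream<Path> walk = Files.walk(root)) {
            List<Path> files = walk.filter(Files::isRegularFile)
                    .sorted()
                    .collect(Collectors.toList());
            for (Path file : files) {
                BasicFileAttributes attrs =
                        Files.readAttributes(file, BasicFileAttributes.class);
                rows.add(String.join(",",
                        csv(file.getFileName().toString()),
                        csv(file.toAbsolutePath().toString()),
                        Long.toString(attrs.size()),
                        csv(humanSize(attrs.size())),
                        attrs.creationTime().toString(),
                        attrs.lastModifiedTime().toString()));
            }
        }
        return rows;
    }

    // Usage (assumed): java FileListingSketch <directory to scan> <report file>
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args[0]);
        Path report = Paths.get(args[1]);
        Files.write(report, listFiles(root)); // written as UTF-8, one row per line
    }
}
```

The resulting file opens directly in a spreadsheet program, where the remaining metadata columns can be added by hand.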
Metadata can be added as needed before ingesting the content into Rosetta. The files and the metadata can then be submitted to Rosetta using the CSV option in the Rosetta File Harvester tool, after adding a second row of Dublin Core names to map the columns. A standard template has been created to help prepare the file for ingest; it is available on the resources page: Rosetta File Ingest template for Excel (or PDF).
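The mapping row can also be added programmatically. The sketch below inserts a second row beneath the header, placing a Dublin Core name under each mapped column and leaving the other cells blank. The class MappingRowSketch and the specific names (dc:title, dc:creator, dc:description) are assumptions for illustration; the exact names and prefixes that Rosetta expects should be taken from the ingest template, not from this example. The sketch also assumes a simple header row without quoted commas.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; the mapping values are hypothetical.
class MappingRowSketch {

    // Hypothetical column-to-Dublin-Core mapping; check the template for real values.
    static Map<String, String> exampleMapping() {
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("TITLE", "dc:title");
        mapping.put("CREATOR", "dc:creator");
        mapping.put("DESCRIPTION", "dc:description");
        return mapping;
    }

    // Build the mapping row: one cell per header column, blank when unmapped.
    static String mappingRow(String headerRow, Map<String, String> mapping) {
        List<String> cells = new ArrayList<>();
        for (String column : headerRow.split(",", -1)) {
            cells.add(mapping.getOrDefault(column.trim(), ""));
        }
        return String.join(",", cells);
    }

    // Return a copy of the CSV lines with the mapping row inserted as row two.
    static List<String> withMappingRow(List<String> csvLines,
                                       Map<String, String> mapping) {
        if (csvLines.isEmpty()) {
            throw new IllegalArgumentException("CSV has no header row");
        }
        List<String> out = new ArrayList<>(csvLines);
        out.add(1, mappingRow(csvLines.get(0), mapping));
        return out;
    }
}
```

Keeping the mapping in one place makes it easy to adjust once the correct names have been confirmed against the template.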
The source is available at https://bitbucket.org/byuhbll/filediscovery