Showing posts with label ETD. Show all posts

Saturday, April 15, 2017

ETD+ Toolkit

ETD+ Toolkit. Dr. Katherine Skinner, et al. Educopia Institute. April 10, 2017.
     Very helpful website for dealing with ETDs. The Toolkit is an open set of six modules to help students create, store, and maintain their research outputs. It was designed to:
  • Help administrators understand the digital research outputs students are creating
  • Help administrators assess what to collect and care for as part of the institutional memory
  • Help students make sure that research outputs are in durable formats and on durable devices
  • Help students make informed decisions about file formats, documentation, and rights.
The Modules, which include "Learning Objectives, a one-page Handout, a Guidance Brief, a Slideshow with full presenter notes, and an evaluation Survey", are:
  1. Copyright: How can students gain appropriate permissions and how can students signal copyright for their own works?
  2. Data Organization: How can students structure, describe, store, and deposit data and other research files for reuse and/or future access?
  3. File Formats: How will the formats students choose make future access to their research easier or more difficult?
  4. Metadata: How can students store information describing their files to make sure they can tell what they are in the future?
  5. Storage: How can students make well informed choices about where to store their research materials?
  6. Version Control: What mechanisms can students use to make it easier to see the history of a file with multiple versions?
"In a 2014 survey of nearly 800 students across nine universities, students reported that non-PDF files - including research data, video, digital art, and software code - are either as important or more important than the Electronic Thesis and Dissertation (ETD) PDF as research outputs and evidence. Fully 80% of these students are producing non-PDF research outputs, most commonly tabular data (43%), digital images (38%), software code (29%), and digital text (28%)."
The ETD+ Toolkit provides introductory training for data curation and digital longevity techniques. It helps students identify and offset risks and threats to their digital research.

Saturday, October 08, 2016

Preserving & Curating ETD Research Data & Complex Digital Objects

Preserving & Curating ETD Research Data & Complex Digital Objects. Katherine Skinner, Sam Meister. ETDplus project, Educopia Institute. October 7, 2016.
     The ETDplus project is funded by IMLS and led by the Educopia Institute, in collaboration with many others.  The project helps ensure the longevity and availability of ETD research data and complex digital objects (e.g., software, multimedia files) that are part of student theses and dissertations. The project has just published a set of six Guidance Briefs to help students understand how to prepare, manage, and store the research files associated with their ETDs.

The Guidance Briefs are short “how-to” oriented briefs "designed to help ETD programs build and nurture supportive relationships with student researchers. These briefs are written for a student audience. They are designed to assist student researchers in understanding how their approaches to data and content management impact credibility, replicable research, and general long-term accessibility: knowledge and skills that will impact the health of their careers for years to come."

The Guidance Briefs can be downloaded at the site, and cover the following topics:
1. Copyright
2. Data Structures
3. File Formats
4. Metadata
5. Storage
6. Version Control

Institutions can adapt the guides to fit their local audiences. Each Brief includes information about the topic and a “Local Practices” section where an institution can highlight its own activities.

Tuesday, August 25, 2015

Current Issues and Approaches to Curating Student Research Data

Current Issues and Approaches to Curating Student Research Data. Andrew Creamer. Bulletin of the Association for Information Science and Technology. August 2015. [PDF May require subscription.]
     University libraries have collaborated with their colleges to archive and/or publish students’ electronic theses and dissertations online. But there is little consistency in how they archive, curate and publish the students’ research data and digital scholarship that underlie the ETDs. Most institutions have no policy on caring for ETD-related data. Others have said that “Dissertation datasets represent ‘low-hanging fruit’ for universities who are developing institutional data collections” yet few have addressed the issues. At a recent conference, it was stated that curation of students’ ETD data can be seen as a scale model of the scholarly communication lifecycle and that these are valuable collections that universities should pursue, archive and make available.

One presenter described three organization-level digital curation challenges that libraries need to address:
  1. people not knowing how to do the work,
  2. not enough time or incentive for people to learn and
  3. insufficient resources.
Academic library administrators have an unrealistic expectation that all digital responsibilities and expertise can be vested in just one employee. Aaron Collie at Michigan is working on a three-year strategic plan to create a "collaborative approach to digital curation that will put the policies, people and technologies in place to build organizational capacity." In his words: “I think digital preservation is a strategic direction that is too often operationalized as an individual responsibility and skill set.”

There are important questions to be asked about how to best curate and describe student ETD data.
  • Should there be more oversight over the documentation quality and quantity students provide with their datasets?
  • Should these digital objects receive their own record and metadata?
  • What are the best ways to show the relationships between these objects and the ETD?
  • Can we make the same archival/preservation commitments to supplementary data files that we do for the PDF file of the ETD?
A study of 93 ETDs with related data in OSU’s repository found:
  • 45% were Excel files (30% of which had macros, charts and/or links to other data);
  • 22% were image files;
  • 25% were document files;
  • the remainder included text, database and/or statistical software files, of which 23% were code (15% of these were executable files);
  • 12% of the files were metadata;
  • 30% were unknown, inoperable and/or obsolete; and
  • 3% of the ETDs were missing data files that were listed in their manifests.
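A repository that wants a similar picture of its own supplementary files could start with a simple tally by file extension. The following is a minimal sketch, not the method used in the OSU study: the `CATEGORIES` mapping and the `tally_formats` function are hypothetical names, and the extension list is an assumption that would need adjusting for a real collection.

```python
from collections import Counter
from pathlib import PurePath

# Hypothetical extension-to-category mapping; adjust to the local collection.
CATEGORIES = {
    ".xls": "spreadsheet", ".xlsx": "spreadsheet", ".csv": "spreadsheet",
    ".jpg": "image", ".png": "image", ".tif": "image",
    ".doc": "document", ".docx": "document", ".pdf": "document",
    ".py": "code", ".r": "code", ".m": "code",
}


def tally_formats(filenames):
    """Return the percentage share of each category among the given filenames."""
    counts = Counter(
        # Extensions are compared case-insensitively; anything unmapped is "unknown".
        CATEGORIES.get(PurePath(name).suffix.lower(), "unknown")
        for name in filenames
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


if __name__ == "__main__":
    sample = ["results.xlsx", "fig1.TIF", "analysis.R", "notes.docx", "model.dat"]
    for category, share in sorted(tally_formats(sample).items()):
        print(f"{category}: {share}%")
```

Extension-based tallies only approximate format identification; files with missing or misleading extensions land in the "unknown" bucket and would need closer inspection.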
The consensus was that student data collections are worth pursuing and have much value for the public and the research enterprise. Libraries interested in this data need to realize that the content may be more important than previously thought.

Monday, March 30, 2015

Digital Preservation Challenges with an ETD Collection: A Case Study at Texas Tech University

Digital Preservation Challenges with an ETD Collection: A Case Study at Texas Tech University. Joy M. Perrin, Heidi M. Winkler, Le Yang. The Journal of Academic Librarianship. January 2015.
The potential risk of loss seems distant and theoretical until it actually happens. The "potential impact of that loss increases exponentially" for a university when the loss is part of the research output. This excellent article looks at a case study of the challenges one university library encountered with its electronic theses and dissertations (ETDs).  Many institutions have been changing from publishing paper theses and dissertations to accepting electronic copies. One of the challenges that has not received as much attention is that of preserving these electronic documents for the long term.  The electronic documents require more hands-on curation.

Texas Tech University encountered difficulties with preserving their ETD collection. They hope the lessons learned from these data losses will help other organizations looking to preserve ETDs and other types of digital files and collections. Some of the losses were:
  1. Loss of metadata edits. Corrupted database and corrupted IT backups required a rebuild of the database, but the entered metadata was lost.
  2. Loss of administrative metadata: embargo periods. The ETD-db files imported into DSpace did not include the embargoed files. Plans were not documented and personnel changed before the problem was discovered. Some items were found accidentally on a personal drive years later.
  3. Loss of scanned files. The scanning server was also the location to store files after scanning. Human error beyond the backup window resulted in the deletion of over a thousand scanned ETDs, which were eventually recovered.
  4. Failure of policies: loss of embargo status changes. The embargo statement recorded in the ETD management system did not match what was published in DSpace.
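The embargo mismatch in item 4 is the kind of problem a periodic reconciliation report can surface early. Below is a minimal sketch, assuming embargo statuses can be exported from each system as a mapping from ETD identifier to status string. The function name, the identifiers and the status values are all hypothetical, and no real ETD-db or DSpace API is used.

```python
def find_embargo_mismatches(etd_system, repository):
    """Compare embargo statuses keyed by ETD identifier.

    Both arguments are dicts mapping an identifier to an embargo status string.
    Returns a dict of identifier -> (status in ETD system, status in repository)
    for every item whose statuses differ or that is missing from one side.
    """
    mismatches = {}
    for etd_id in sorted(set(etd_system) | set(repository)):
        expected = etd_system.get(etd_id)    # None means absent from that system
        published = repository.get(etd_id)
        if expected != published:
            mismatches[etd_id] = (expected, published)
    return mismatches


if __name__ == "__main__":
    # Hypothetical exports from the two systems.
    etd_system = {"etd-001": "embargoed", "etd-002": "open", "etd-003": "embargoed"}
    repository = {"etd-001": "open", "etd-002": "open"}
    for etd_id, (expected, published) in find_embargo_mismatches(etd_system, repository).items():
        print(f"{etd_id}: ETD system says {expected!r}, repository says {published!r}")
```

Treating an item that is absent from one system as a mismatch is deliberate here, since the case study describes embargoed files that never made it into DSpace at all.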
The library then began real digital preservation work for the ETD collection. Funds were set aside to increase the storage of the archive space and provide a second copy of the archived files. A digital resources unit was created to handle the digital files, which finally brought the entire digital workflow, from scanning to preservation, under one supervisor. The library joined the Digital Preservation Network (DPN) in hopes that it would yield a level of preservation far beyond what the university would be able to accomplish alone. The clean-up of the problems has been difficult and will take years to accomplish. Lessons learned:
  1. Systems designed for managing or publishing documents are not preservation solutions.
  2. System backups are not reliable enough to act as a preservation copy. Institutions must make digital preservation plans beyond backups.
  3. Organizations with valuable digital assets should invest in storing their items outside the display system, not only within it.
  4. Multiple copies of digital items must reside on different servers in order to guarantee that files will not be accidentally deleted or lost through technical difficulties. 
  5. All metadata, including administrative data, should be preserved outside of the display system. The metadata is a crucial part of the digital item.
  6. Digital items are collections of files and metadata.
  7. Maintaining written procedures and documentation for all aspects of digital collections is vital.
  8. Successful digital preservation requires collaboration between curators and the IT staff who maintain the software and hardware, as well as consistent terminology (e.g., a shared understanding of what "archived" means).
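Lessons 2 and 4 imply some way of confirming that a second copy really matches the first. The article does not prescribe a technique; one common approach, sketched here as an assumption, is to compare checksums of the files in two storage locations. The function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copies(primary_dir, secondary_dir):
    """Return relative paths that are missing or differ in the second copy."""
    primary_dir, secondary_dir = Path(primary_dir), Path(secondary_dir)
    problems = {}
    for source in sorted(primary_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(primary_dir)
        copy = secondary_dir / relative
        if not copy.is_file():
            problems[relative.as_posix()] = "missing"
        elif sha256_of(source) != sha256_of(copy):
            problems[relative.as_posix()] = "checksum mismatch"
    return problems
```

Calling `compare_copies("/archive/etd", "/mirror/etd")` (both paths are placeholders) returns an empty dict when every file in the first location has an identical counterpart in the second. The check only runs in one direction, so files that exist solely in the second copy are not reported.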
 "Even though this case study has primarily been a description of local issues, the grander lessons gleaned from these crises are not specific to this institution. Librarians are learning and re-learning every day that digital collections cannot be managed in the same fashion as their physical counterparts. These digital collections require more active care over the course of their lifecycles and may require assistance from those outside the traditional library sphere...."