Friday, September 25, 2015

Data Management Practices Across an Institution: Survey and Report

Data Management Practices Across an Institution: Survey and Report. Cunera Buys, Pamela Shaw. Journal of Librarianship and Scholarly Communication. 22 Sep 2015.
     Data management is becoming increasingly important to researchers in all fields. This study presents results from a survey of digital data management practices across all disciplines at a university. The results show that both short- and long-term storage and preservation solutions are needed. When asked, 31% of respondents did not know how much storage they would need, which makes it difficult to establish a correctly sized research data storage service. In the survey, 65% of faculty said it was important to share data, but fewer than half of them "reported that they 'always' or 'frequently' shared their data openly, despite their belief in the importance of sharing".

Researchers produce a wide variety of data types and sizes, but most create no metadata or do not use metadata standards. Most researchers were also uncertain about how to meet the NSF data management plan requirements (only 45% had a plan). A 2011 study of data storage and management needs across several academic institutions found that many researchers were satisfied with short-term data storage and management practices but not with long-term data storage options. Researchers in the same study did not believe their institutions provided adequate funds, resources, or instruction on good data management practices. When asked where their research data is stored, respondents reported:
  • 66% use computer hard drives
  • 47% use external hard drives
  • 50% use departmental or school servers
  • 38% store data on the instrument that generated the data
  • 31% use cloud-based storage services
    • Dropbox was the most popular service at 63%
  • 27% use flash drives
  • 6% use external data repositories.

Most researchers expected to store raw and published data “indefinitely”. Many respondents also selected 5-10 years, and very few said they keep data for less than one year. Responses from all schools suggest that data are relevant for long periods of time or indefinitely. Specific retention preferences by school were:
  • The college of arts and sciences prefers “indefinitely” for ALL data types
  • Published data: all schools prefer “indefinitely”, except:
    • The law school prefers 1-5 years for published data
  • Other data:
    • The school of medicine prefers 5-10 years for all other data types
    • The school of engineering prefers 1-5 years for all other data types
    • The college of arts and sciences prefers “indefinitely” for raw data
    • The school of management prefers “indefinitely” for raw data

Respondents gave the following reasons for keeping raw data / source material:
  • use in future or new studies (77 responses)
  • use in longitudinal studies (9 responses)
  • sharing with colleagues (6 responses)
  • replicating study results (10 responses)
  • responding to challenges of published results
  • the data would be difficult or costly to replicate
  • it is simply good scientific practice to retain data (4 responses)

When asked, 66% indicated they would need additional storage; most said 1-500 gigabytes or “don’t know.” When asked what services would be useful in managing research data, the top responses were:
  • long-term data access and preservation (63%)
  • services for data storage and backup during active projects (60%)
  • information regarding data best practices (58%)
  • information about developing data management plans or other data policies (52%)
  • assistance with data sharing/management requirements of funding agencies (48%)
  • tools for sharing research (48%)
Since most respondents said they planned to keep their data indefinitely, institutional storage solutions would need to accommodate "many data types and uncertain storage capacity needs over long periods of time". The university studied lacks a long-term storage solution for large data but has short-term storage available. Because many researchers store data on personal or laboratory computers, laboratory equipment, and USB drives, there is a greater risk of data loss. There appears to be a need to educate researchers on best practices for data storage and backup.

Researchers also appear to need education on the external data repositories that are available and on funding agencies’ requirements for data retention. The library decided to provide a clear set of funder data retention policies linked from the library’s data management web guide. Long-term storage of data is a problem for researchers, both because of the data itself and because of the lack of stable storage solutions, and this limits data retention and sharing.
