Showing posts with label institutional repositories. Show all posts

Saturday, August 05, 2017

Elsevier Acquires bepress

Elsevier Acquires bepress. Roger C. Schonfeld.  Society for Scholarly Publishing; The Scholarly Kitchen. Aug 2, 2017.
     Elsevier announces its acquisition of bepress. The move is entirely consistent with its strategy to pivot beyond content licensing to preprints, analytics, workflow, and decision-support, and it probably makes Elsevier the foremost single player in the institutional repository area. There is some concern that this acquisition will allow Elsevier to co-opt open access. The bepress product, Digital Commons, has more than 500 participating institutions, predominantly US colleges and universities.


bepress Joins Elsevier, with Exciting Potential for Growth. Press release. bepress. Aug 2, 2017.
bepress has joined Elsevier, the largest content provider in the world. The management is "confident that this is the right choice for bepress and for our community. Both parties are committed to sustaining the elements that make bepress bepress, and supporting your open access initiatives."


Tuesday, January 31, 2017

Digital Preservation and Archaeological data

Digital Preservation.  Michael L. Satlow. Then and Now. Jan 26, 2017.
     The post looks at the issue of preservation in relation to modern scholarly and artistic works. "The underlying problem is a simple one: most scholarly and creative work today is done digitally." Archaeological excavations generate reams of data, and like other scientific data, archaeological data are valuable. There is no single way that archaeologists record their findings. "Unlike scientists, many archaeologists and humanists have not thought very hard about the preservation of digital data. Scientists routinely deposit their raw data in institutional repositories and are called upon to articulate their digital data management and preservation plan on many grant applications. The paths open to others are less clear."

Institutional digital repositories provide a simple and inexpensive solution. When the project is complete, the data can be converted to XML and deposited. The data conversion would be the most involved part. The XML format would allow the data to be easily accessed and used. "It is time to think about digital preservation as a staple of our 'best practices'."
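As a rough illustration of the conversion step, here is a minimal Python sketch that turns flat records into XML before deposit. The record fields, the `excavation` and `find` tag names, and the sample values are all hypothetical; the post does not prescribe a schema, and a real project would follow whatever schema its repository accepts.

```python
import xml.etree.ElementTree as ET


def records_to_xml(records, root_tag="excavation", item_tag="find"):
    """Convert a list of flat dicts into an XML string.

    Assumes every dict key is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for record in records:
        item = ET.SubElement(root, item_tag)
        for field, value in record.items():
            child = ET.SubElement(item, field)
            child.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, for illustration only.
sample_finds = [
    {"id": "F001", "material": "ceramic", "locus": "12"},
    {"id": "F002", "material": "bronze", "locus": "14"},
]

xml_text = records_to_xml(sample_finds)
print(xml_text)
```

Keeping one element per field makes the file readable without the software that produced the original records, which is the property the post values in XML.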


Wednesday, October 12, 2016

Q&A with CNI’s Clifford Lynch: Time to re-think the institutional repository?

Q&A with CNI’s Clifford Lynch: Time to re-think the institutional repository?  Richard Poynder. Blog: Open and Shut? September 22, 2016.
     In 1999, a meeting was held to discuss scholarly archives and repositories and ways in which to make them interoperable and to avoid needlessly replicating each other’s content. This led to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). One notion was that the individual archives "would be given easy-to-implement mechanisms for making information about what they held in their archives externally available".  Open access advocates saw OAI-PMH as a way of aggregating content hosted in local archives, or institutional repositories. This would "encourage universities to create their own repositories and then instruct their researchers to deposit in them copies of all the papers they published in subscription journals."
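To make the harvesting idea concrete, here is a small Python sketch of an OAI-PMH exchange. It builds a `ListRecords` request URL and extracts identifiers and Dublin Core titles from a response. The base URL and the sample response are invented placeholders, and the sketch parses an embedded string rather than contacting a real repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and simple Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(response_xml):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    results = []
    for record in root.findall(".//oai:record", OAI_NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=OAI_NS)
        title = record.findtext(".//dc:title", namespaces=OAI_NS)
        results.append((identifier, title))
    return results


# Hypothetical response, shaped like what a repository would return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.edu:1234</identifier>
        <datestamp>2016-09-22</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical deposited paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = build_list_records_url("https://repository.example.edu/oai")
print(request_url)
print(extract_titles(SAMPLE_RESPONSE))
```

A harvester would send the request over HTTP and follow any resumption token for large result sets; both steps are left out here to keep the sketch self-contained.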

The interoperability promised by OAI-PMH has not really materialised, and author self-archiving "has remained a minority sport, with researchers reluctant to take on the task of depositing their papers in their institutional repository". Some believe the "IR now faces an existential threat". The interview and additional information are available in a separate PDF. This file looks at whether the IR will survive, be "captured by commercial publishers" or "the research community will finally come together, agree on the appropriate role and purpose of the IR, and then implement a strategic plan that will see repositories filled with the target content."

Friday, November 20, 2015

Developing Best Practices in Digital Library Assessment: Year One Update

Developing Best Practices in Digital Library Assessment: Year One Update. Joyce Chapman, Jody DeRidder, Santi Thompson. D-Lib Magazine. November 2015.
     While research and cultural institutions have increased focus on online access to special collections in the past decade, methods for assessing digital libraries have yet to be standardized. Because of limited resources and increasing demands for online access, assessment has become increasingly important. Library staff do not know how to begin to assess the costs, impact, use, and usability of digital libraries. The Digital Library Federation Assessment Interest Group is working to develop best practices and guidelines in digital library assessment. The definition of a digital library used is "the collections of digitized or digitally born items that are stored, managed, serviced, and preserved by libraries or cultural heritage institutions, excluding the digital content purchased from publishers."

They are considering two basic questions:
  1.     What strategic information do we need to collect to make intelligent decisions?
  2.     How can we best collect, analyze, and share that information effectively?
There are no "standardized criteria for digital library evaluation. Several efforts that are devoted to developing digital library metrics have not produced, as yet, generalizable and accepted metrics, some of which may be used for evaluation. Thus, evaluators have chosen their own evaluation criteria as they went along. As a result, criteria for digital library evaluation fluctuate widely from effort to effort." Not much has changed in the last 10 years with regard to digitized primary source materials and institutional repositories. "Development of best practices and guidelines requires a concerted engagement of the community to whom the outcome matters most: those who develop and support digital libraries". The article shares "what progress we have made to date, as well as to increase awareness of this issue and solicit participation in an evolving effort to develop viable solutions."

Tuesday, August 25, 2015

What is actually happening out there in terms of institutional data repositories?

What is actually happening out there in terms of institutional data repositories?  Ricky Erway. OCLC Research. July 27, 2015.
     Academic libraries are talking about providing data curation services for their researchers.  In most cases they offer just training and advice, but not actual data management services. While technical, preservation, and service issues can be challenging, the funding issues are probably the thing that inhibits this service most. This is an important service that supports the university research mission.

Of the 22 institutions that answered the survey:
  • stand-alone data repository: 8
  • combination institutional repository and data repository: 12
  • DSpace: 6
  • Hydra/Fedora systems: 6
  • locally developed systems: 4
  • Rosetta, Dataverse, SobekCM, and HUBzero: 1 each
For preservation services:
  • all provide integrity checks except 1
  • keep offsite backup copies: 17
  • provide format migration: 12
  • put master files in a dark archive: 10
For funding:
  • the library’s base budget covered at least some of the expenses: 18
  • the library budget was the only source of funding: 7
  • receive fees from researchers: 7
  • receive fees from departments: 4
  • receive institutional funding specifically for data management: 5
  • receive money from the IT budget: 4
  • receive direct funds from grant-funded projects: 1
  • receive indirect funds from grant-funded projects: 1

Saturday, November 22, 2014

Five steps to decide what data to keep

Five steps to decide what data to keep. Angus Whyte. Digital Curation Centre. 31 October 2014.
This guide aims to help UK Higher Education Institutions aid their researchers in making informed choices about what research data to keep.

It will be relevant to researchers making decisions on a project-by-project basis, or formulating departmental guidelines. It assumes that decisions on particular datasets will normally be made by researchers with advice from the appropriate staff (e.g. academic liaison librarians) and taking into account any institutional policy on Research Data Management (RDM) and guidance available within their own domain.

Step 1. Identify purposes that the data could fulfill
Step 2. Identify data that must be kept
Step 3. Identify data that should be kept
Step 4. Weigh up the costs
Step 5. Complete the data appraisal
 
The final step is to weigh the value of the data and any costs still to be incurred, "considering the long-terms aims, the qualities you identified, the time and money already invested in it and the risks of being unable to prepare any ‘must keep’ data for preservation."



    Sunday, May 12, 2013

    ZENODO. Research. Shared.

    ZENODO. Research. Shared. Website. May 12, 2013.
    ZENODO is a new open digital repository service that enables researchers, scientists, projects and institutions to share and showcase multidisciplinary research results (data and publications) that are not part of existing institutional or subject-based repositories. The repository is created by OpenAIRE and CERN, and supported by the European Commission. It promotes peer-reviewed openly accessible research; all items have a DOI, so they are citable. All formats are allowed. There is a size limit of 1 GB per file. Data files are versioned, but records are not. Files may be deposited under closed, open, embargoed or restricted access.
    It is named after  Zenodotus, the first librarian of the Ancient Library of Alexandria and father of the first recorded use of metadata, a landmark in library history. ZENODO is provided free of charge for educational and informational use.


    Saturday, March 23, 2013

    Adding Value to Electronic Theses and Dissertations in Institutional Repositories

    Adding Value to Electronic Theses and Dissertations in Institutional Repositories. Joachim Schöpfel. D-Lib Magazine. March/April 2013.

    This paper looks at how institutional repositories that contain electronic theses and dissertations (ETDs) differ, particularly regarding metadata, policy, access restrictions, representativeness, file format, status, quality and related services. The intent is to improve the "quality of content and service provision in an open environment, in order to increase impact, traffic and usage". This paper shows five ways in which institutions can add value to the deposit and dissemination of electronic theses and dissertations:
    1. Quality of content. A good IR not only defines a set of standards and criteria for the selection and validation of deposits but also communicates and promotes this editorial policy.
    2. Metadata. The description of the content and context of the ETD files will make a difference. 
    3. Format. The IR should contain full text, offer different file formats, and use deposit formats that are searchable, open, and appropriate for long-term preservation and use of the content.
    4. Repositories should network and interconnect.
    5. Provide needed services beyond basic searching, viewing, and downloading. Some possibilities are discussion forums, usage statistics and metrics, citations, Print On Demand in book format, copyright protection or Creative Commons licensing, and preservation. 
     Institutional Repositories must also be future-oriented and anticipate future transformation of scientific communication. "It is crucial for the success of a repository that the institution clearly defines its objectives in line with its scientific strategy and environment."

    Friday, May 18, 2012

    The CLIF Project: The Repository as Part of a Content Lifecycle

    The CLIF Project: The Repository as Part of a Content Lifecycle. Richard Green, Chris Awre, Simon Waddington. Ariadne. 9 March 2012.
    This was a joint project that did an extensive literature review and worked with  digital content creators to understand how to deal with the interaction of the authoring, collaboration and delivery of materials. At the heart of meeting institutional requirements for managing digital content is the need to understand the different operations through which content goes, from planning and creation through to disposal or preservation. Repositories must be integrated with the other systems that support other parts of this lifecycle to prevent them becoming yet another information silo within the institution.

    The CLIF software has been designed to try and allow the maximum flexibility in how and when users can transfer material from one system to another, integrating the tools in such a way that they seem to be natural extensions of the basic systems.  This open source software is available for others to investigate and use.

    The repository’s archival capability is regarded as one of its strongest assets, and the role of the repository within a University will be regarded very much in terms of what it can offer that other campus systems cannot.  It should not try to compete on all levels. There is a need to clarify better at an institutional level what functionality is offered by different content management systems, in order to better understand how different stages of the digital content lifecycle can be best enabled.


    Saturday, October 22, 2011

    Cite Datasets and Link to Publications

    Cite Datasets and Link to Publications. Digital Curation Centre. 18 October 2011.
    The DCC has published a guide to help authors / researchers create links between their academic publications and the underlying datasets.  It is important for those reading the publication to be able to locate the dataset.  This recognizes that data generated during research are just as valuable to the ongoing academic discourse as papers and monographs, and in many cases the data needs to be shared. "Ultimately, bibliographic links between datasets and papers are a necessary step if the culture of the scientific and research community as a whole is to shift towards data sharing, increasing the rapidity and transparency with which science advances."

    This guide has identified a set of requirements for dataset citations and any services set up to support them. A citation must uniquely identify the object cited, whether that is the whole dataset or a subset of it. The citation must be usable by people and software tools alike. There are a number of elements needed, but the "most important of these elements – the ones that should be present in any citation – are the author, the title and date, and the location. These give due credit, allow the reader to judge the relevance of the data, and permit access the data, respectively." A persistent URL is needed, and there are several types that can be used.
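The core elements named above can be sketched as a small Python data structure that checks for missing elements and renders a readable citation. The class name, the output layout, and the sample DOI are illustrative assumptions, not a format prescribed by the DCC guide.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """The elements the guide calls most important: author, title, date, location."""
    author: str
    title: str
    year: int
    location: str  # a persistent identifier, for example a DOI-based URL

    def missing_elements(self):
        """List any core elements that are empty."""
        names = ("author", "title", "year", "location")
        return [name for name in names if not getattr(self, name)]

    def as_text(self):
        """Render a human-readable citation (layout is an assumption)."""
        return f"{self.author} ({self.year}). {self.title} [dataset]. {self.location}"


# Hypothetical example; the DOI below is a placeholder, not a real dataset.
citation = DatasetCitation(
    author="Doe, J.",
    title="Hypothetical survey data",
    year=2011,
    location="https://doi.org/10.1234/example",
)
print(citation.as_text())
print(citation.missing_elements())
```

Holding the elements as separate fields, rather than as one formatted string, is what lets software tools as well as people use the citation.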

    Monday, October 17, 2011

    Research Librarians Consider the Risks and Rewards of Collaboration.

    Research Librarians Consider the Risks and Rewards of Collaboration. Jennifer Howard. The Chronicle of Higher Education. October 16, 2011.

    The Association of Research Libraries’ meeting discussed research and preservation projects like the HathiTrust digital repository and the proposed Digital Public Library of America, plans for which are moving ahead. Concerning the Digital Public Library of America: “Library” is a misnomer in this case, since it is more of a federation of existing objects. It wouldn’t own anything. The main contribution would be to set standards and link resources. “The user has to drive this.”

    They said that it’s almost three times more expensive to store materials locally than it is to store them with HathiTrust. Researchers now also create and share digital resources themselves via social-publishing sites such as Scribd. There is a need for collection-level tools that allow scholars and curators to see beyond catalog records.

    Also discussed was Recollection, a free platform built by NDIIPP and a company named Zepheira to give a better “collection-level view” of libraries’ holdings. The platform can be used to build interactive maps, timelines, and other interfaces from descriptive metadata and other information in library catalogs. So, for instance, plain-text place names on a spreadsheet can be turned into points of latitude and longitude and plotted on a map.

    “Rebalancing the Investment in Collections” argued that libraries had painted themselves into a corner by focusing too much on their collection budgets. Investing in the right skills and partnerships is most critical now. “The comprehensive and well-crafted collection is no longer an end in itself.”

    One person told librarians that they shouldn’t rush to be the first to digitize everything and invest in every new technology. “Everybody underestimates the cost of innovation,” he said. “Instead of rushing in and participating in a game where you don’t have the muscle, you want to stand back” and wait for the right moment.


    Sunday, September 04, 2011

    Institutional Repository and ETD Bibliography 2011

    Institutional Repository and ETD Bibliography 2011. Charles W. Bailey, Jr.  September 2011.
    This bibliography has over 600 English-language articles, books, and other works about institutional repositories and electronic theses and dissertations (ETDs). Among other things, it includes digital preservation issues, IR library issues, IR metadata strategies, and institutional open access mandates and policies. Most sources were published from 2000 through June 30, 2011. The bibliography includes links to freely available versions of included works. It is available as a PDF file.

    Monday, August 29, 2011

    Institutional Repositories and Digital Preservation: Assessing Current Practices at Research Libraries

    Institutional Repositories and Digital Preservation: Assessing Current Practices at Research Libraries. Yuan Li, Meghan Banach. D-Lib Magazine. May/June 2011.
    If the digital scholarly record is to be preserved, libraries need to establish new best practices for preservation. For their part, creators need to be more proactive about archiving their work. Institutional Repositories may provide some help in preserving digital materials, but some question whether IRs were intended to provide long-term preservation of digital scholarship.


    The most important roles that IRs play are to collect, manage, and disseminate the digital scholarship that their communities produce. Most content in an IR is deposited by author self-archiving, by a third party on behalf of the author, or by repository staff. Regardless of how content is deposited in the IR, the quality of deposited content should be examined before digital preservation actions are considered, since the quality of content can directly affect the success of digital preservation efforts. Problems may include format obsolescence, poor quality images, and insufficient metadata to manage and preserve the materials.

    While most report that their IRs are currently providing long-term digital preservation, a closer look shows they are really in a planning process to provide long-term preservation rather than providing it in a fully operational way. An increasing number of research libraries have started to move digital preservation programs ahead by developing preservation policies.

    Thursday, August 11, 2011

    Building a Sustainable Institutional Repository

    Building a Sustainable Institutional Repository. Chenying Li, et al. D-Lib Magazine. July/August 2011.
    Institutional Repositories are an increasingly important resource and service offered by libraries. Increasing the use of the content is a key to building a sustainable IR. Two organizational types:

    1. Structured Content Organization
    Organizing content according to its role in the University, which provides a more orderly process of content organization and more efficient metadata.

    2. Modular Content Publishing
    Creating modules as independent publishing units that work together as a complete and comprehensive publishing system. This uses themed publishing and metadata aggregation.

    It is becoming more important for libraries to provide users with the contents and services that are found in institutional repositories.  

    Friday, April 30, 2010

    Digital Preservation Matters - April 30, 2010

    Digital Preservation: An Unsolved Problem. Jonathan Shaw. Harvard Magazine. April 27, 2010.

    With the advantages of digital, why do libraries not embrace the digital future now? One of the main obstacles is the issue of preservation. For books: "the greatest risks to printed material are the environment, wear and tear, security, and custodial neglect." For digital: using data is one of the best ways to preserve it because you know it is usable; digital data must be read and checked constantly to ensure integrity. Another concern about digital is that current formats may not be readable in the future (reference to June 2009 New Yorker cover). Born digital materials are not as easy to save since they have many different formats. This is difficult for librarians keeping records of the university's intellectual life, because of both the legal and digital challenges. "We are in a period of unprecedented lack of documentation of academic output."

    ---

    Gutenberg 2.0. Harvard's libraries deal with disruptive change. Jonathan Shaw. Harvard Magazine. April 27, 2010.

    In the scientific disciplines, information, from online journals to databases, must be recent to be relevant. Books in libraries to some seem more like a museum. Some think that massive digital projects will make research libraries irrelevant. The future of libraries is clearly digital. "Yet if the format of the future is digital, the content remains data. And at its simplest, scholarship in any discipline is about gaining access to information and knowledge." Access to the information will mean different things and be done in different ways. In the meantime, "Who has the most scientific knowledge of large-scale organization, collection, and access to information? Librarians."

    How do we deal with large scale collections and the access to the information? "We ought to be leveraging that expertise to deal with this new digital environment. That's a vision of librarians as specialists in organizing and accessing and preserving information in multiple media forms, rather than as curators of collections of books, maps, or posters." The role of libraries isn't going away, but it is changing.

    The idea that libraries will be stewards of vast data collections raises very serious concerns about the long-term preservation of digital materials. The worry is that the longevity of the resources has not been tested. There are 3 copies of the 109 TB Harvard repository. It is in a constant process of checking and refreshing to make sure everything is readable.
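The checking process mentioned above can be illustrated with a short Python sketch that compares SHA-256 checksums across several copies and flags any copy that disagrees with the majority. This is an assumption about how such a check could work, not a description of Harvard's actual process; the copy names and contents are hypothetical, and the data is held in memory rather than read from storage.

```python
import hashlib
from collections import Counter


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of a byte string as hex."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(copies):
    """Return the names of copies whose checksum differs from the majority.

    `copies` maps a copy name to its content as bytes. If no checksum has
    a clear majority, the first most common one is treated as the reference.
    """
    digests = {name: sha256_of(data) for name, data in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, digest in digests.items() if digest != majority_digest)


# Hypothetical example: three copies, one of which has been corrupted.
copies = {
    "site_a": b"thesis content",
    "site_b": b"thesis content",
    "site_c": b"thesis c0ntent",
}
damaged = find_damaged_copies(copies)
print(damaged)
```

With three copies, a single corrupted copy can be identified and refreshed from either of the other two, which is one reason multiple copies are kept.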

    ---

    The Floppy is Dead: Time to Move Memories to the Cloud. Lance Ulanoff. PC Magazine. Apr 26, 2010.

    The decision by Sony to stop producing 3.5-inch disks marks an end to that format. The end of any popular format can have a ripple effect on the technology world. If the data is not migrated to later formats it could be "trapped on its obsolete format". All media will become obsolete eventually; it is the natural progression of technology. Since change is inevitable, the article suggests everyone consider cloud-based backup storage options, which it considers better than storing data on eventually-to-be-obsolete media.

    ---

    Google is not the last word in information. Lia Timson. Sydney Morning Herald. April 29, 2010.

    Interesting article concerning primary and secondary sources, what is on the internet and how it gets there, special collections, etc.

    • "Better still is the lesson and the realisation that information and history don't just appear on Google. Someone has to publish it onto the web, put it there in the first place."
    • "As educators we must ask that assignment bibliographies include more than just "three websites". We must insist on a variety of media as sources, including interviews with real people, be they witnesses, historians or surviving relatives, and even insist on trips to the local library."
    • … researching is much wider and deeper than searching online.

    ---

    A Gentle Reminder to Special-Collections Curators. Todd Gilman. The Chronicle of Higher Education. April 29, 2010.

    An article about a librarian's experience trying to use special collections. The "job is not to keep readers from your books but just the opposite: to facilitate readers' use of the collections."

    ---

    Friday, April 23, 2010

    Digital Preservation Matters - April 23, 2010

    National Archives Reports on Federal Agency Records Management Programs. NARA Press Release. April 19, 2010.

    NARA issued a mandatory records management self-assessment to 245 Federal cabinet-level agencies and related groups, and 91% responded. The goal was to determine how effective Federal agencies are in meeting the statutory and regulatory requirements for records management. The study showed that 79% of agencies are falling short in their responsibilities. The long-term success of the Open Government initiative and the ability to ensure access to the records of our government, hinges on the ability of each Federal agency to effectively manage its records.

    View the 93 page report.

    ---

    Library of Congress Digital Preservation Newsletter. Library of Congress. April 2010.

    The newsletter includes information about a number of digital preservation initiatives. Some of them are:

    • A new video "Why Digital Preservation is Important for Everyone" which also includes a transcript. The main theme is that digital materials, which can fail or be lost, require active management. The three minute video is worth watching.
    • The Federal Agencies Digitization Guidelines Initiative is helping government agencies preserve audio-visual information.
    • Links to The Blue Ribbon Task Force on Sustainable Digital Preservation and Access and their recent report, Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information.
    • Link to a podcast "Conversations about Digital Preservation" about the Library's challenges to build an efficient, scalable digital repository, how the Library's repository works and future plans for the repository.
    • A group of institutions have developed an automated way to preserve official e-mail records produced by Microsoft Outlook and capture the necessary long-term preservation metadata. This is part of the Persistent Digital Archives and Library System project (PeDALS) to develop a shared curatorial framework for preserving digital public records across multiple states.
    • May 10th will be the Personal Archiving Day at the Library of Congress.

    ---

    NEW Blog from the DuraSpace Preservation & Archiving Solution Community. Carol Minton Morris. DuraSpace Preservation & Archiving. April 21, 2010.

    A new blog has been set up by the Preservation and Archiving Solution Community. The blog is a vehicle for an open exchange of ideas and initiatives around preservation & archiving solutions. All are welcome to participate. It had started as a group using Fedora Commons, but is actually looking at all preservation issues, not just those for Fedora or DSpace.

    ---

    Digital Preservation and the Challenge. Ron Jantz. DuraSpace Preservation & Archiving. April 21, 2010.

    Institutions around the world are grappling with the technology, processes, and organizational structures that will result in digital preservation becoming a reality. The challenge of preserving information goes back centuries, to those who tried to preserve books; the post mentions an example from the Reformation, when the monasteries were dissolved and their books were not preserved. Can we demonstrate that we are preserving what we have now? We should be looking at self-assessment tools to see how we are doing with preservation.

    ---

    Crowdsourcing: How and Why Should Libraries Do It? Rose Holley. D-Lib Magazine. March/April 2010.

    Crowdsourcing is a new term referring to undefined groups of people in a community "taking tasks traditionally performed by an employee or contractor and outsourcing it to a group (crowd) of people or community in the form of an open call." It may be the "most useful tool a library can have in the future." The work can be done as a group or as an individual. Libraries already know about the first step of crowdsourcing: social engagement with individuals, but need to improve in the second step: defining and working towards group goals. This can bring benefits to libraries and users, especially by adding value to data by adding comments, tags, ratings, reviews. Some successful examples include collections at the National Library of Australia, FamilySearchIndexing and Latter Day Saints: Text transcription of records, Wikipedia, etc. These released their services 'quietly' with little or no advertising, but clear group goals. The article looks at the volunteer profile, motivational factors, types of acknowledgement and rewards, managing volunteers, and tips for successful crowdsourcing. "Freedom is actually a bigger game than power. Power is about what you can control. Freedom is about what you can unleash".

    Some of the tips:

    1. Have a transparent and clear goal on your home page.
    2. Have a transparent and visible chart of progress towards your goal.
    3. Make the overall environment easy to use, intuitive, quick and reliable.
    4. Make the activity easy and fun; it must be interesting.
    5. Keep the site active by addition of new content/work.
    6. Give volunteers options and choices.
    7. Make the results/outcome of your work transparent and visible.
    8. Let volunteers identify and make themselves visible if they want acknowledgement.
    9. Reward high achievers by having ranking tables and encourage competition.
    10. Give the volunteers an online team/communication environment to build a dynamic, supportive team environment.
    11. Treat your 'super' volunteers with respect and listen to them carefully.
    12. Assume volunteers will do it right rather than wrong.

    ---

    Friday, April 16, 2010

    Digital Preservation Matters - April 16, 2010

    State Of America's Libraries Report 2010. American Library Association. April 11, 2010.

    Interesting report about libraries. As the recession continues, Americans turn to libraries in ever larger numbers for access to resources for employment, continuing education, and government services. The local library has become a lifeline of resources, training and workshops. Even in the age of Google, academic libraries are being used more than ever. During a typical week in fiscal 2008, academic libraries in the United States had more than 20.3 million visits, answered more than 1.1 million reference questions, and made more than 498,000 presentations to groups attended by more than 8.9 million students and faculty, increases over the previous years. Over 43% of libraries provide access to locally produced digitized collections.

    ---

    A National Conversation on the Economic Sustainability of Digital Information. Blue Ribbon Task Force on Sustainable Digital Preservation and Access. April 1, 2010. [Silverlight video.]

    This page has the agenda and video presentations from A National Conversation on the Economic Sustainability of Digital Information, a recent meeting hosted by the Blue Ribbon Task Force on Sustainable Digital Preservation and Access.

    BRTF's Featured Agenda and Presentations:

    • Research Data, Daniel E. Atkins, Wayne Clough
    • Scholarly Discourse, Derek Law, Brian Schottlaender
    • Economics of Collectively-Created Content, George Oates, Timo Hannay
    • Commercially-owned Cultural Content, Chris Lacinak, Jon Landau
    • Economics of Digital Information, William G. Bowen, Hal R. Varian, Dan Rubinfeld
    • Summary by Clifford Lynch.

    ---

    How Tweet It Is!: Library Acquires Entire Twitter Archive. Matt Raymond. Blog. Library of Congress. April 14, 2010.

    The Library of Congress is digitally archiving every public tweet made since Twitter started in 2006. "Expect to see an emphasis on the scholarly and research implications of the acquisition." Amazing to think what we can "learn about ourselves and the world around us from this wealth of data. And I'm certain we'll learn things that none of us now can even possibly conceive." The Library of Congress has been archiving information from the web since 2000. It now has more than 167 terabytes of web-based information, including legal blogs and political websites.

    ---

    Library of Congress: We're archiving every tweet ever made. Nate Anderson. Ars Technica. April 16, 2010.

    Comments about the Library of Congress archiving tweets:

    • "There's been a turn toward historicism in academic circles over the last few decades, a turn that emphasizes not just official histories and novels but the diaries of women who never wrote for publication, or the oral histories of soldiers from the Civil War, or the letters written by a sawmill owner. The idea is to better understand the context of a time and place, to understand the way that all kinds of people thought and lived, and to get away from an older scholarship that privileged the productions of (usually) elite males."
    • "Digital technologies pose a problem for the Library and other archival institutions, though. By making data so easy to generate and then record, they push archives to think hard about their missions and adapt to new technical challenges."

    ---

    Aligning Investments with the Digital Evolution: Results of 2009 Faculty Survey Released. Roger C. Schonfeld, Ross Housewright. Ithaka. April 07, 2010. [37p. PDF]

    An excellent report, especially for academic libraries. Faculty Survey 2009: Strategic Insights for Librarians, Publishers, and Societies looks at faculty attitudes towards the academic library, information resources, and the scholarly communications system. A few quotes from the report:

    • Faculty most often turn to network-level services, including both general purpose search engines and services targeted specifically to academia.
    • Of all disciplines, scientists remain the least likely to utilize library-specific starting points;
    • Network-level services are increasingly important for discovery, not only of monographs and journals but archival resources and other primary source collections.
    • The library must evolve to meet these changing needs.
    • 90% of faculty members view the library buyer role as very important, 71% and 59% now view the archive and gateway roles as very important, respectively. Archiving is the 2nd highest role.
    • Despite the reported declines in importance of all the library's roles other than as a buyer, the 2009 study saw a slight rise in perceived dependence on the library
    • The declining visibility and importance of traditional roles for the library and the librarian may lead to faculty primarily perceiving the library as a budget line, rather than as an active intellectual partner.
    • Faculty members most strongly support and appreciate the library's infrastructural roles, in which it acquires and maintains collections of materials on their behalf.
    • Faculty members' sense of the significance of long-term preservation of electronic journals has steadily increased over time.
    • Effective and sustainable models for the preservation of electronic journals must be developed
    • Scholars, regardless of field, indicate a general preference that digital materials be preserved.
    • Less than 30% of faculty members have deposited any scholarly material into a repository; nearly 50% have not deposited but hope to do so in the future
    • Faculty attitudes and practices are at the strategic core. Greater engagement with and support of trailblazing faculty disciplines may help develop the roles and services to serve faculty needs into the future. The institutions that serve faculty must also anticipate them, both to ensure that the 21st century information needs of faculty are met and to secure their own relevance for the future.

    Friday, April 09, 2010

    Digital Preservation Matters - April 9, 2010

    Blu-ray Disc Association Announces Additional Format Enhancements. Press Release. April 3, 2010.

    The Blu-ray Disc Association announced two new media specifications:

    • The BDXL specification, targeted at broadcasting, medical and document imaging needs, has write-once discs of 100GB and 128GB capacity, and rewritable capability on 100GB discs. The discs use three to four recordable layers. A consumer version of BDXL is also expected sometime.
    • The Intra-Hybrid Blu-ray Disc has both a single 25GB read-only (BD-ROM) layer and a single 25GB rewritable (BD-RE) layer, so both needs can be met with one disc.

    The two new types of discs require newly-designed hardware to record and play back.

    ---

    Effort Will Help Libraries Put Academic Papers in Data 'Cloud'. Jeff Young. April 5, 2010

    Some librarians are hoping that cloud computing will help their efforts to build institutional repositories, university-wide collections of research papers. A new project sponsored by DuraSpace (a merger of DSpace and Fedora Commons) is called DuraCloud. This project plans to make it easier for librarians to put their repositories in off-site data storage. "A key design feature of DuraCloud is to leave the basics of pure storage to those who do it best (storage providers)." The project is now in the pilot phase, but should be available by the fall of 2010. "The biggest draw of the approach: It can be much cheaper than building new data centers to run on campuses."

    ---

    Submission Policy Recommendations. Chris Prom. Practical E-Records. March 24, 2010.

    Here are some great policy documents that are an essential first step toward creating an active digital preservation plan. There are links on this page to several documents:

    • E-Records Deposit Policy
    • Preservation/Access Plan
    • Transfer Guidelines
    • E-record Survey Form
    • Submission Agreement Form

    There is also a link to the do-it-yourself TDR (Trusted Digital Repository). The Preservation/Access Plan is especially helpful because it looks at supported formats (both access and preservation formats), access tools for the formats, and migration paths.

    ---

    iPRES 2009: the Sixth International Conference on Preservation of Digital Objects. University of California. March 30, 2010.

    The proceedings and videos from iPRES 2009 (held in San Francisco on October 5-6, 2009) are now available online. The proceedings are available through the California Digital Library’s eScholarship site. The conference program, presentations, and videos are available at this link. There are many excellent resources here.

    ---

    Tuesday, March 02, 2010

    Digital Preservation Matters - March 2, 2010

    A Guide to Distributed Digital Preservation. Katherine Skinner, Matt Schultz. Educopia Institute. February 2010. [156 p. PDF]

    Excellent guide created by MetaArchive, which developed the first private LOCKSS network in 2004. This work examines distributed digital preservation, successful strategies, and new models. It will help others to join or establish a private LOCKSS network. It discusses the network architecture, technical and organizational considerations, content selection and ingest, administration, and copyright practices in the network. A distributed digital preservation system must preserve, not just back up. The preservation process of contributing, preserving, and retrieving content depends upon the institution’s diligence. Ingested content is preserved not just through replication: the caches also run a set of polling, voting, and repairing processes on it. Distributed digital preservation, by definition, requires communication and collaboration across multiple locations and between numerous staff.

    The software provides bit-level preservation for digital objects of any file type or format, but it can also provide a set of services to make the preserved files usable in the future, such as normalizing and migrating. The MetaArchive network is a dark archive with no public interface; communication between caches is secure. Organizations collaborating on preserving digital content must examine the roles and responsibilities of members, address essential management, policy, and staffing questions, develop standards, and define the network’s sphere of activity. Ingest, monitoring, and recovery of content are critical steps for preserving the content.
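    The polling, voting, and repairing idea can be illustrated with a deliberately simplified sketch. This is not the actual LOCKSS protocol, which is considerably more elaborate; the function names and the in-memory "caches" dictionary are hypothetical. Each cache reports a SHA-256 digest of its copy, the majority digest wins, and any dissenting cache is repaired from a copy that agrees with the majority.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one cache's copy."""
    return hashlib.sha256(content).hexdigest()


def poll_and_repair(caches: dict) -> list:
    """Compare copies across caches and repair any that disagree.

    `caches` maps a (hypothetical) cache name to the bytes it holds.
    Returns the names of the caches that were repaired.
    """
    # Poll: every cache "votes" with the digest of its own copy.
    votes = Counter(digest(content) for content in caches.values())
    winning_digest, count = votes.most_common(1)[0]

    # Require a strict majority before trusting any copy.
    if count <= len(caches) / 2:
        raise ValueError("no majority agreement; manual review needed")

    # Pick any copy that matches the majority digest as the repair source.
    good_copy = next(c for c in caches.values() if digest(c) == winning_digest)

    # Repair: overwrite each dissenting copy with the agreed content.
    repaired = []
    for name, content in list(caches.items()):
        if digest(content) != winning_digest:
            caches[name] = good_copy
            repaired.append(name)
    return repaired
```

    With three caches where one copy has been damaged, the damaged cache is the only one repaired. This is why replication alone is not enough: copies must be compared regularly for damage to be detected at all.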

    Some interesting quotes from the guide:

    • Paradoxically, there is simultaneously far greater potential risk and far greater potential security for digital collections.
    • Many cultural memory organizations are today seeking third parties to take on the responsibility for acquiring and managing their digital collections. The same institutions would never consider outsourcing management and custodianship of their print and artifact collections.
    • A great deal of content is in fact routinely lost by cultural memory organizations as they struggle with the enormous spectrum of issues required to preserve digital collections.
    • A true digital preservation program will require multi-institutional collaboration and at least some ongoing investment to realistically address the issues involved in preserving information over time.
    • One of the greatest risks we run in not preserving our own digital assets for ourselves is that we simultaneously cease to preserve our own viability as institutions.

    ---

    Encouraging Open Access. Steve Kolowich. Inside Higher Ed. March 2, 2010.

    Conversations about open access to journal articles currently revolve around policy, not technology: about whether the content should be made available, not how. “Without content, an IR is just a set of empty shelves.” A new model of repository focuses on giving researchers an online “workspace” within the repository where they can upload and preserve different versions of an article they are working on. The idea is to make publishing articles to the open repository a natural extension of the creative process. This is based on a survey in which professors said they wanted:

    • to be able to work with co-authors easily,
    • to keep track of different versions of the same document, and
    • to make their work more visible
    • all while doing as little extra work as possible.

    ---

    In the digital age, librarians are pioneers. Judy Bolton-Fasman. The Boston Globe. February 10, 2010.

    Book review of This Book Is Overdue: How Librarians and Cybrarians Can Save Us All by Marilyn Johnson.

    • Among information professionals, Johnson notes there are librarians and archivists: “Librarians were finders [of information]. Archivists were keepers.” But the information revolution is affecting both.
    • The digital age is making possible the creation of searchable databases of archives, but it’s also making information, especially on the Internet, more ephemeral and harder to collect.
    • Information archivists are “capturing history before it disappears because of a broken link or outdated software.”
    • In a world where technology moves life at a breathtaking pace, “where information itself is a free-for-all, with traditional news sources going bankrupt and publishers in trouble, we need librarians more than ever” to help point the way to the best, most reliable sources.

    ---

    Installing OAIS Software: Archivematica. Chris Prom. Practical E-Records. February 1, 2010.

    One of several reports on open source tools the blog author is evaluating to help with ingest, storage, and access processes in archives. This post looks at Archivematica, and he likes its supportable model for facilitating archival work with electronic records. It is an Ubuntu-based virtual appliance which can exist alongside preservation tools on other systems. It can be installed locally and in a variety of ways. Worth looking into.

    ---

    IBM announces massive NAS array for the cloud. Lucas Mearian. Computerworld. February 11, 2010.

    IBM has announced SONAS, an enterprise-class network-attached storage array capable of scaling from 27TB to 14 petabytes under a single name space. It is designed to provide access to data anywhere, any time. The policy-driven storage automation software allows an institution to predefine where data is placed when it is created, where and when it moves in the storage hierarchy, where it is copied for disaster recovery, and when it will eventually be deleted.

    Monday, February 22, 2010

    Digital Preservation Matters - February 22, 2010

    Appraisal Actions and Decisions. Chris Prom. Practical E-Records. February 15, 2010.

    Most development work on digital repositories focuses on the requirements of the OAIS reference model. But OAIS doesn’t say how records should be selected for deposit. While each archive has a different focus, selecting records for inclusion in an archive is heavily debated. The appraisal process requires careful and intelligent decision making by a person. When appraising electronic records, several tools are needed:

    • examine, identify, compare, delete, rename, and reorganize records
    • manage information concerning records surveys/assessments.
    • manage submission agreements
    • ensure that appraisal actions are documented.

    A set of tools is needed to examine, characterize, delete, and possibly reorder records quickly. This would make it easier to decide whether the records are within the scope of the archive's policy, and then to take appropriate actions concerning them.

    ---

    E-Library Economics. Steve Kolowich. Inside Higher Ed. February 10, 2010.

    Two studies from the Council on Library and Information Resources examine the implications of libraries changing to digital collections. Libraries seem to be headed in the direction of primarily digital infrastructures, but the journey is slow going. Digital standards, such as those for eBooks, are still changing. “While they enjoy the searchability of electronic documents and databases, academics still prefer holding a book in their hands to read it.” The studies point to an average cost of $4.26 per book per year to keep a book on the shelf. The cost for digital is much less; the digital media repository HathiTrust stores five million volumes at between $0.15 and $0.40 per volume, per year. Books in high-density storage facilities cost only $0.86 per year to keep in usable condition. “The administrators who provide library budgets may be reluctant to fund new facilities to house print collections and may question large expenditures to support both print and electronic formats. Library directors must consider not only the immediate expectations of faculty, but also the long-term goals for the library.”
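    To put the per-volume figures above side by side, here is a small arithmetic sketch. The dollar amounts are the ones quoted in the article; the function name and the choice of a five-million-volume collection for every storage model are assumptions made only for comparison.

```python
# Annual cost per volume, in US dollars, as quoted in the article.
COST_PER_VOLUME_PER_YEAR = {
    "open stacks": 4.26,
    "high-density storage": 0.86,
    "digital (low estimate)": 0.15,
    "digital (high estimate)": 0.40,
}


def annual_costs(volumes: int) -> dict:
    """Return the yearly cost of keeping `volumes` items under each model."""
    return {
        model: round(volumes * cost, 2)
        for model, cost in COST_PER_VOLUME_PER_YEAR.items()
    }


if __name__ == "__main__":
    # Hypothetical comparison: the same five million volumes in each model.
    for model, cost in annual_costs(5_000_000).items():
        print(f"{model}: ${cost:,.0f} per year")
```

    For five million volumes this works out to roughly $21.3 million per year on open stacks, $4.3 million in high-density storage, and between $0.75 million and $2 million in digital form.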

    ---

    Studies Cite Argument for, Resistance to Increased Digital Library Collections. Library Journal. February 11, 2010.

    A reaction to the E-Library Economics article. The keys to success are to communicate with students and faculty and educate them about why the changes are important; to emphasize the preservation of resources, security, and the benefits; and to make the electronic resources available without barriers. One concern: the “move to electronic collections requires certainty about access to digital collections and their persistence.” Also, removing books would not change the fixed costs of the building. The report authors also acknowledge “that the business model for ebooks remains unsettled and that print plays an important role for resources that don't yet work so well in digital format.”

    ---

    Using DROID for Appraisal. Chris Prom. Practical E-Records. February 17, 2010.

    DROID is a tool to help archivists identify file formats. It may also be valuable in the appraisal process, helping an archivist understand the components of a records series. By running DROID and analyzing the reports, it is possible to identify particular file formats outside of the proposed collection scope, which is especially useful if they are deep in a directory structure. Specific examples and processes used are outlined.
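    As a rough illustration of that kind of report analysis (not the process described in the post), the sketch below reads a format-identification report exported as CSV and lists the files whose format identifier falls outside an agreed scope. The column names FILE_PATH, TYPE, and PUID are assumptions modeled on DROID-style CSV exports and may need adjusting for a given DROID version; the function name and sample rows are hypothetical.

```python
import csv
import io


def out_of_scope(report, in_scope_puids, path_col="FILE_PATH",
                 type_col="TYPE", puid_col="PUID"):
    """List (path, puid) pairs for files whose format is not in scope.

    `report` is any open text file or file-like object containing CSV.
    Column names are assumptions; pass different ones if the export differs.
    """
    flagged = []
    for row in csv.DictReader(report):
        # Folders carry no format identification, so skip them.
        if (row.get(type_col) or "").strip().lower() == "folder":
            continue
        puid = (row.get(puid_col) or "").strip()
        if puid not in in_scope_puids:
            # Files with no identification at all are flagged too.
            flagged.append((row.get(path_col, ""), puid or "unidentified"))
    return flagged


if __name__ == "__main__":
    # Hypothetical sample report; identifiers are illustrative only.
    sample = io.StringIO(
        "FILE_PATH,TYPE,PUID\n"
        "/series/minutes.pdf,File,fmt/18\n"
        "/series/old,Folder,\n"
        "/series/old/misc/song.mp3,File,fmt/134\n"
    )
    for path, puid in out_of_scope(sample, {"fmt/18"}):
        print(path, puid)
```

    Because the check runs over every row of the report, out-of-scope files are found no matter how deep they sit in the directory tree.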

    ---

    Film Institute launches first digital archive in Wales. BBC News. 9 February 2010.

    The British Film Institute has launched its first "digital jukebox" in Wales, allowing people to access its archive. The Mediatheque is already available in England. The system allows people to watch films and TV programs, currently 1,500 titles, from the national archive free of charge; 85% of the titles had not been released on DVD or online.

    ---

    Innovation: We can't look after our data – what can? Tom Simonite. New Scientist. 11 February 2010.

    Anyone worried about the fragility of digital data and civilization’s chances to survive would do well to look to their own data stores first. “Most of us today are blithely heading for our own personal data disasters” because of benign neglect. Data is often lost more from disorganization than from a technological catastrophe, though that happens too. Two possible approaches are mentioned: the Self Archiving Legacy Toolkit (SALT); and the Pergamum project. We are in need of tools to help with diverse, disorganized digital archives which are becoming the norm.

    ---

    Court Finds E-Mails Stored on Old Archiving System Reasonably Accessible; Costs Exaggerated. Kroll Ontrack. Recent ESI Court Decisions. February 2010.

    A court case where the defendant argued that e-mails archived on the company's "cumbersome" old system were not reasonably accessible. “The court found that the plaintiff should not be disadvantaged since the defendant, a "sophisticated" company, chose not to migrate the e-mails to the now-functional archival system and thus determined that the e-mails were reasonably accessible.”