Data repositories and archives

Digital data repositories, data archives and data centres accept, preserve and disseminate research data, often for a particular community. Repositories may be organised by subject (e.g. structural chemistry data, gene sequence data, social science data) or by organisation, such as a research funder.

Research data are typically submitted to the repository by the data creator or owner. The data repository then takes responsibility for preserving the data, managing any access restrictions and making information about the data (metadata) discoverable.

How to find a suitable data repository

A growing number of data repositories and databases archive research data from many subject areas. Unfortunately, coverage varies between disciplines: whilst the social sciences and biosciences are well supported, relatively few data repositories accept engineering data.

To help you find a suitable data repository, a number of lists have been compiled:

To archive the data created by projects they support, some funders either run data centres or provide lists of recommended data repositories:

A number of journals support the use of Dryad, Figshare and Zenodo for data underlying scientific and medical literature. Nature's Scientific Data journal also maintains a list of recommended data archives.

How to assess the suitability of an external data repository

There are a number of things to consider when selecting a suitable repository to archive and publish your research data:

  • What type of data does the repository accept and what is its subject focus?
  • Does the data repository already have a good reputation in your field and is it recommended by your funder or journal?
  • Will the repository provide enough metadata to enable your data to be discovered and cited by other researchers?
  • Will the repository issue your data with a persistent identifier, such as a Digital Object Identifier (DOI) or an accession number, that you can include in your data access statement? A search for archives in re3data allows you to tick a box restricting results to those that provide persistent identifiers.
  • Are access restrictions or embargoes permitted? Will the archive ensure that confidential or personal data are secured if that is required?
  • Do the archive's terms and conditions fit with the University's Intellectual Property policy (intranet link; login required)? For example, does the archive require that you assign any copyright in the data to the archive? We recommend avoiding archives that require a transfer of rights.
  • What licences are available and do they comply with the University's Research Data Management Policy?
  • Is the archive established and well funded, so that you can rely on it still preserving your data in 10 years' time?
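
If you are comparing several candidate repositories, it can help to record your answers to the questions above in a consistent form. The following is only a minimal sketch in Python, not an official tool: the class name, field names and example answers are hypothetical labels invented for illustration, and each question is simplified to a yes/no answer.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class RepositoryAssessment:
    """Yes/no answers to the checklist for one candidate repository.

    Field names are hypothetical labels chosen for this sketch; they
    follow the order of the questions in the list above.
    """
    accepts_data_type: bool
    good_reputation_or_recommended: bool
    sufficient_metadata: bool
    persistent_identifier: bool
    access_controls_adequate: bool
    no_rights_transfer_required: bool
    licences_comply_with_policy: bool
    long_term_sustainability: bool


def unmet_criteria(assessment: RepositoryAssessment) -> list[str]:
    """Return the names of checklist items answered 'no', in checklist order."""
    return [f.name for f in fields(assessment) if not getattr(assessment, f.name)]


# Invented example: a repository that meets every criterion except that
# it asks depositors to transfer copyright.
example = RepositoryAssessment(
    accepts_data_type=True,
    good_reputation_or_recommended=True,
    sufficient_metadata=True,
    persistent_identifier=True,
    access_controls_adequate=True,
    no_rights_transfer_required=False,
    licences_comply_with_policy=True,
    long_term_sustainability=True,
)

print(unmet_criteria(example))  # ['no_rights_transfer_required']
```

An empty list means that every question was answered "yes". Any item that is returned is a point to investigate further or to raise when asking for advice.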

If you are considering using an external data archive and need advice on its suitability, please contact [email protected].