Data access statements, also known as data availability statements, are used in publications to describe where data directly supporting the publication can be found and under what conditions they can be accessed.
Data access statements are required for all publications arising from publicly funded research. They are required by many funders' data policies and by the RCUK Policy on Open Access. Including a data access statement is also recommended for publications reporting other research.
Some funders have indicated that they now check for the inclusion of data access statements in publications that acknowledge their support. In particular, the requirement applies to all papers that acknowledge EPSRC funding with a publication date after 1 May 2015.
The aim of the data access statement is discoverability: the data referenced by the statement do not have to be openly available. There are many legitimate reasons why access to data may need to be restricted. If you are unsure whether you should publish your data openly, please contact [email protected] for advice.
Where to provide the data access statement
Some journals now provide a separate section in articles for the data access statement. Where no such section exists, we suggest that you include the data access statement with the acknowledgement of funder support.
A formal data citation can also be included either with the main references or in a specified data citation section.
If you're unsure where to provide your data access statement, please contact [email protected] for help.
What to include in the data access statement
The following are recommendations for what to include in the data access statement:
- If data are openly available the name(s) of the data repositories should be provided, as well as any persistent identifiers or accession numbers for the dataset.
- If there are justifiable legal or ethical reasons why your data cannot be made available, these should be included in the data access statement.
- If the data themselves are not openly available, the data access statement should direct users to a permanent record that describes any access constraints or conditions that must be satisfied for access to be granted.
- It is important that any links to the data are persistent. Digital Object Identifiers (DOIs) are a type of persistent identifier, resolvable as a URL, that many specialist data archives provide for datasets.
- If you did not collect the research data yourself but instead used existing data obtained from another source, this source should be credited.
A simple direction to interested parties to contact the author would not normally be considered sufficient.
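As a rough illustration of the point about persistent links, the Python sketch below normalises an identifier to a DOI resolver URL before it is quoted in a statement. The function name `doi_to_url` and the pattern are illustrative assumptions, not part of any funder or publisher requirement. The pattern is only a format check and does not confirm that a DOI is registered or resolves.

```python
import re

# Heuristic shape of a DOI: "10.", a 4-9 digit registrant code, "/", then a suffix.
# Assumption: this checks the format only, not whether the DOI actually exists.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Common ways a DOI is written before the bare "10.xxxx/suffix" part.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(identifier: str) -> str:
    """Return an https://doi.org/ link for a bare DOI, a 'doi:' form, or a doi.org URL."""
    doi = identifier.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"Not a DOI-like identifier: {identifier!r}")
    return f"https://doi.org/{doi}"
```

An ordinary web address, such as a page on a project website, would be rejected by this check, which is a prompt to look for a repository-issued identifier instead.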
The data access statement should be included in submitted manuscripts, even if identifiers have not yet been issued. The statement should be updated to include any persistent identifiers or accession numbers as they become available, typically when the manuscript is accepted for publication.
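For authors who keep manuscript text in scripts or templates, the following minimal Python sketch shows one way to draft a statement at submission and regenerate it once an identifier has been issued. The function `draft_statement`, its parameters, and the placeholder wording are hypothetical; journals and funders may expect different wording.

```python
from typing import Optional

# Hypothetical placeholder text used until the repository issues an identifier.
PENDING = "[identifier to be added when issued]"


def draft_statement(
    repository: str,
    identifier_url: Optional[str] = None,
    conditions: Optional[str] = None,
) -> str:
    """Compose a one-sentence data access statement.

    repository     -- name of the data repository, e.g. "the UK Data Service"
    identifier_url -- persistent link to the dataset, or None if not yet issued
    conditions     -- access conditions, e.g. "subject to registration", or None if open
    """
    where = identifier_url or PENDING
    if conditions is None:
        access = "openly available"
    else:
        access = f"available, {conditions},"
    return f"All data supporting this study are {access} from {repository} at {where}."


# At submission, before an identifier exists:
submitted = draft_statement("Figshare")
# On acceptance, once the identifier has been issued (example URL is not genuine):
accepted = draft_statement("Figshare", "http://doi.org/10.15125/12345")
```

Keeping the repository name and conditions fixed and changing only the identifier means the accepted version differs from the submitted one in a single, easily checked place.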
Data access statements can also be combined with formal data citations, particularly where a publication is supported by multiple datasets archived in different locations. In this situation it may be more appropriate to cite each dataset separately, providing the persistent identifier in the citation, and direct users to the references from the data access statement. An example statement is given below and DataCite provides examples of data citations.
Example data access statements
Please note that, to avoid biasing metrics that monitor DOI resolutions, the URLs used in these examples are not genuine.
Depending on the nature of your data you may wish to combine information from different examples. Please contact [email protected] for help with structuring your data access statement.
Openly available data
"All data created during this research are openly available from [add in appropriate data archive eg Figshare] at http://doi.org/10.15125/12345."
"All data supporting this study are provided as supplementary information accompanying this paper."
"All data are provided in full in the results section of this paper." "Expression data are openly available from ArrayExpress (Accession E-MTAB-01234 at https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-01234/). Crystal structures are available from the Cambridge Crystallographic Data Centre (Identifier BATHRS) at http://doi.org/10.15125/010203. Microscopy images are openly available from Dryad at http://doi.org/10.15125/01234."
Citation of multiple datasets
"This publication is supported by multiple datasets, which are openly available at locations cited in the reference section."
Secondary analysis of existing data
"This study was a re-analysis of existing data that are publicly available from EMBL at http://doi.org/10.15125/12345. Further documentation about data processing are available from the University of Bath data archive at http://doi.org/10.15125/12345."
"The study brought together existing data obtained upon request and subject to licence restrictions from a number of different sources. Full details how these data were obtained are available in the documentation available at http://doi.org/10.15125/12345."
"Anonymised interview transcripts from participants who consented to data sharing, plus other supporting information, are available from the UK Data Service, subject to registration, at http://doi.org/10.15125/12345."
"Due to ethical concerns, supporting data cannot be made openly available. Further information about the data and conditions for access are available at the: http://doi.org/10.15125/12345."
"Due to the (commercially, politically, ethically) sensitive nature of the research, no interviewees consented to their data being retained or shared. Additional details relating to other aspects of the data are available from the at http://doi.org/10.15125/12345."
"Supporting data are available to bona fide researchers, subject to registration, from the UK Data Service at http://doi.org/10.15125/12345."