Proper citation of data sources has both immediate and long-term benefits for users and producers of data. “Data citation is the practice of referencing data products used in research. A data citation includes key descriptive information about the data, such as the title, source, and responsible parties” (https://www.usgs.gov/data-management/data-citation).
Citing data is very similar to citing publications; there are many "correct" formats to use, but we suggest including the following important information:
The order of the information is not as important as having sufficient information to find the data set(s) used. Consider the style guidelines of the research domain or lab group, data source, or preferred publisher (see "Related information" below).
Some publishers specify a suggested citation format that requires additional information (e.g., resource type, retrieval date, funder/sponsor). They may also request citation of related publication(s) along with the data. Be sure to review citation style guides carefully. When a citation format is not specified, you can follow your discipline's scholarly citation style. The next section provides examples of common repository styles, as well as APA/MLA/Chicago styles.
| Style | Example(s) | More information |
| --- | --- | --- |
| APA (6th edition) | Smith, T.W., Marsden, P.V., & Hout, M. (2011). General social survey, 1972-2010 cumulative file (ICPSR31521-v1) [data file and codebook]. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. doi: 10.3886/ICPSR31521.v1 | IASSIST guidelines |
| Chicago | Smith, Tom W., Peter V. Marsden, and Michael Hout. 2011. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center. Distributed by Ann Arbor, MI: Inter-university Consortium for Political and Social Research. doi:10.3886/ICPSR31521.v1 | IASSIST guidelines |
| DataCite | Barclay, Janet Rice (2013) Stream Discharge from Harford, NY. Cornell University Library eCommons Repository. http://hdl.handle.net/1813/34425 | DataCite guidelines |
| DataCite | Malekjani, Shokoufeh (2012) Microstructural response of nanocrystalline Al to cyclic loading. Deacon Research Online. http://hdl.handle.net/10536/DRO/DU:30045928 | DataCite guidelines |
| DRYAD | Yannic G, Pellissier L, Dubey S, Vega R, Basset P, Mazzotti S, Pecchioli E, Vernesi C, Hauffe HC, Searle JB, Hausser J (2012) Data from: Multiple refugia and barriers explain the phylogeography of the Valais shrew, Sorex antinorii (Mammalia: Soricomorpha). Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.2jj36325 | DRYAD guidelines |
| ESIP | Cline, D., R. Armstrong, R. Davis, K. Elder, and G. Liston. 2003. CLPX-Ground: ISA snow depth transects and related measurements ver. 2.0. Edited by M. A. Parsons and M. J. Brodzik. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5060/D4MW2F23. Accessed 2008-05-14. | ESIP guidelines |
| ICPSR | Jacob, Philip, and Henry Teune. International Studies of Values in Politics, 1966. ICPSR07006-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 1978. doi:10.3886/ICPSR07006.v1 | ICPSR guidelines |
| Figshare | Rodriguez, Tommy (2013): 17,170 Base Pair Alignment of Thirteen Time-Extended Lineages [data: (complete) mtDNA; format: ClustalW]. figshare. https://dx.doi.org/10.6084/m9.figshare.815894 Retrieved: 16:26, Jan 04, 2016 (GMT) | Figshare guidelines |
| MLA (7th edition) | Smith, Tom W., Peter V. Marsden, and Michael Hout. General Social Survey, 1972-2010 Cumulative File. ICPSR31521-v1. Chicago, IL: National Opinion Research Center [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2011. Web. 23 Jan 2012. doi:10.3886/ICPSR31521.v1 | IASSIST guidelines |
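The same descriptive elements recur across the examples above: authors, year, title, version or study number, publisher or distributor, and a persistent identifier. If you keep dataset metadata in structured form, a citation string can be assembled from it. The following is a minimal Python sketch under stated assumptions: the `DatasetCitation` class, its field names, and the output layout are illustrative and do not implement APA, Chicago, MLA, or any repository's official style.

```python
from dataclasses import dataclass


@dataclass
class DatasetCitation:
    """Core descriptive elements of a data citation (hypothetical field names)."""
    authors: list[str]
    year: int
    title: str
    version: str
    publisher: str
    doi: str


def doi_url(doi: str) -> str:
    """Return a DOI as a resolvable https://doi.org/ URL, whatever prefix it arrived with."""
    doi = doi.strip()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    return f"https://doi.org/{doi}"


def format_citation(c: DatasetCitation) -> str:
    """Assemble a generic author-date citation; adapt the layout to your required style."""
    authors = "; ".join(c.authors)
    return (
        f"{authors} ({c.year}). {c.title}. {c.version}. "
        f"{c.publisher}. {doi_url(c.doi)}"
    )


gss = DatasetCitation(
    authors=["Smith, Tom W.", "Marsden, Peter V.", "Hout, Michael"],
    year=2011,
    title="General Social Survey, 1972-2010 Cumulative File",
    version="ICPSR31521-v1",
    publisher="Inter-university Consortium for Political and Social Research",
    doi="doi:10.3886/ICPSR31521.v1",
)
print(format_citation(gss))
```

Keeping the fields separate, rather than storing one pre-formatted string, makes it easier to reorder them when a publisher or repository asks for a different style.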
The Digital Curation Centre (DCC) provides additional guidance on how to cite datasets and link them to publications.
The Data Citation Principles cover purpose, function and attributes of citations. These principles recognize the dual necessity of creating citation practices that are both human understandable and machine-actionable.
These citation principles are not comprehensive recommendations for data stewardship. And, as practices vary across communities and technologies will evolve over time, we do not include recommendations for specific implementations, but encourage communities to develop practices and tools that embody these principles.
The principles are grouped so as to facilitate understanding, rather than according to any perceived criteria of importance.
Importance: Data should be considered legitimate, citable products of research. Data citations should be accorded the same importance in the scholarly record as citations of other research objects, such as publications.
Credit and Attribution: Data citations should facilitate giving scholarly credit and normative and legal attribution to all contributors to the data, recognizing that a single style or mechanism of attribution may not be applicable to all data.
Evidence: In scholarly literature, whenever and wherever a claim relies upon data, the corresponding data should be cited.
Unique Identification: A data citation should include a persistent method for identification that is machine actionable, globally unique, and widely used by a community.
Access: Data citations should facilitate access to the data themselves and to such associated metadata, documentation, code, and other materials, as are necessary for both humans and machines to make informed use of the referenced data.
Persistence: Unique identifiers, and metadata describing the data, and its disposition, should persist, even beyond the lifespan of the data they describe.
Specificity and Verifiability: Data citations should facilitate identification of, access to, and verification of the specific data that support a claim. Citations or citation metadata should include information about provenance and fixity sufficient to facilitate verifying that the specific timeslice, version and/or granular portion of data retrieved subsequently is the same as was originally cited.
Interoperability and Flexibility: Data citation methods should be sufficiently flexible to accommodate the variant practices among communities, but should not differ so much that they compromise interoperability of data citation practices across communities.
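One common way to record the fixity information mentioned in the principles above is a cryptographic checksum stored alongside the citation metadata. The principles themselves do not prescribe a mechanism, so the following Python sketch is only an illustration: the choice of SHA-256 and the function names `sha256_of_file` and `matches_cited_checksum` are assumptions, not part of the Joint Declaration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_cited_checksum(path: str, cited_sha256: str) -> bool:
    """Return True if the retrieved file has the checksum recorded with the citation."""
    return sha256_of_file(path) == cited_sha256.strip().lower()
```

A reader who later retrieves the cited file can call `matches_cited_checksum` with the published digest to confirm that the bytes are the same as those originally cited. A mismatch indicates that the file differs, but not how or why.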
For further information, please refer to these examples.
Data Citation Synthesis Group: Joint Declaration of Data Citation Principles. Martone M. (ed.) San Diego CA: FORCE11; 2014 https://doi.org/10.25490/a97f-egyk