
Data Management

Resources to help you manage, store, and share your data.

Why Store your Data?

Properly storing your data will:

  • allow you to maintain access to your research over time – keep in mind, digital files degrade or can easily become corrupted if not maintained!

  • allow you to share your data with the broader research community.

  • support the overall authenticity of any given work.


Storage options

For storage and processing of your data during the active phase of research, there are several options:

  • CPP students and faculty have access to Microsoft OneDrive, but only for as long as you are at CPP; once you graduate, you will lose access to your account.
  • Researchers working with very large data sets can use the High-Performance Computing Cluster on campus. Reach out to the HPC directly for support.

 

Back up your data

  • Make 3 copies (e.g. original + external/local + external/remote)
  • Copies should be geographically distributed (local vs. remote)
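As a rough illustration of the three-copy approach above, the following Python sketch copies an original file into a local backup folder and into a second folder standing in for a remote location. The function name make_backup_copies and both folder arguments are hypothetical; in practice the remote copy would usually live on a mounted network drive or a cloud-synced folder rather than a local path.

```python
import shutil
from pathlib import Path


def make_backup_copies(original, local_dir, remote_dir):
    """Copy `original` into two backup folders and return the new paths.

    `local_dir` and `remote_dir` are placeholders for wherever your
    local and remote backups actually live (an assumption for this sketch).
    """
    original = Path(original)
    copies = []
    for target_dir in (Path(local_dir), Path(remote_dir)):
        # Create the backup folder if it does not exist yet
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 copies the file contents and tries to preserve timestamps
        copies.append(Path(shutil.copy2(original, target_dir / original.name)))
    return copies
```

Together with the original, the two returned paths make up the three copies. The script does not make the copies geographically distributed on its own; that depends on where the second folder is physically stored.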

 

Data backup options

  • Personal computer hard drives, external hard drives, departmental or university servers, when available.
  • CDs or DVDs aren’t recommended, because they fail frequently.
  • Cloud storage: there are multiple commercial options available (Google Drive, Dropbox, Box, Tresorit, etc.); each has different requirements, encryption, and storage fees.

 

Security

  • Unencrypted storage is ideal, because it lets you and others read your data easily. If encryption is required because the data are sensitive:
    • Keep passwords and keys on paper (2 copies) and in a PGP (Pretty Good Privacy) encrypted digital file, or within a secure password manager (like LastPass or KeePass).
    • Don’t rely on third-party encryption alone.
  • Uncompressed storage is also ideal. If you need to compress files to conserve space, limit compression to your third backup copy.

 

To make sure your backup system is working properly, test your system periodically. Try to retrieve data files and make sure you can read them.
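One possible way to automate part of this check is to compare checksums of each backup copy against the original. The Python sketch below assumes you can reach every copy as a file path; the function names sha256_of and find_mismatched_copies are hypothetical and are not part of any campus tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so large data files do not fill memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(original, copies):
    """Return the backup copies whose contents differ from the original."""
    expected = sha256_of(original)
    return [Path(copy) for copy in copies if sha256_of(copy) != expected]
```

An empty result only shows that the copies are byte-for-byte identical to the original. It is still worth opening a few retrieved files in the software you normally use to confirm that they are readable.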

The above information was adapted from MIT's Data Management Resource Guide.

Data Preservation Basics

In addition to storing your data, repositories can help preserve it:

What is data preservation?

  • Data preservation involves converting your data to a preservation format and storing it within a special preservation repository where it will be actively stewarded—managed and migrated as formats change.
  • When you deposit a tabular data file (e.g., CSV, Excel, SPSS, RData) in the Harvard Dataverse, the repository will automatically convert your data into a format that increases the likelihood that it can be preserved.

What can I do to ensure my data can be preserved?

  • The most important step you can take is to deposit and share your data in a data repository. Most repositories will convert your data to sustainable formats.
  • During the planning stage of your project, you can also identify which of your data are stored in proprietary file formats and migrate the data to more sustainable file formats before depositing it into a repository. Find more information on recommended preservation formats at the Library of Congress Sustainability of Digital Formats.
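To make the planning step above more concrete, here is a small Python sketch that walks a project folder and lists files whose extensions suggest a proprietary format. The extension table is purely illustrative and is an assumption of this example, as is the function name flag_proprietary_files; consult the Library of Congress guidance for actual format recommendations.

```python
from pathlib import Path

# Illustrative assumption only: extension -> a possible open alternative.
# Replace this table with formats recommended for your discipline.
SUGGESTED_FORMATS = {
    ".xls": ".csv",
    ".sav": ".csv",
    ".doc": ".txt",
}


def flag_proprietary_files(root, suggestions=SUGGESTED_FORMATS):
    """Return sorted (path, suggested_format) pairs for files under `root`."""
    flagged = []
    for path in Path(root).rglob("*"):
        # Compare extensions case-insensitively so ".SAV" matches ".sav"
        suffix = path.suffix.lower()
        if path.is_file() and suffix in suggestions:
            flagged.append((path, suggestions[suffix]))
    return sorted(flagged)
```

The script only produces an inventory. Converting each flagged file is a separate step and is usually best done by exporting from the software that created it.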

What are my responsibilities for my data once my research has concluded?

  • Determining which data and records are safe to dispose of can be complicated. Make sure you understand any applicable contractual or regulatory requirements that may govern data disposition.

Finding a Home for Your Data

The above video was created by Biomedical Data Management at Harvard Medical School. Some aspects of the video are institutionally specific and will not apply to your research at CPP – please contact a librarian for campus-based support.

Are you looking for more resources to guide your storage process? Check out the following resources, including the Store & Manage Checklist, from the Longwood Medical Area Research Data Management Working Group at Harvard Medical School.

Repositories


Institutional Repository

  • Bronco Scholar: Cal Poly Pomona's institutional repository is a place to deposit your research or other creative projects for long-term access and preservation. 

Select Disciplinary Repositories

  • arXiv is an open-access repository for Computer Science, Physics, Mathematics, Nonlinear Sciences, Quantitative Biology, and Statistics.
  • CiteSeer, focused on Computer and Information Science, is a search engine for research papers as well as a repository.
  • Cogprints accepts self-archived papers in any area of Psychology, Neuroscience, and Linguistics, as well as many areas of Computer Science, Philosophy, Biology, Medicine, Anthropology, and other sciences related to the study of cognition.
  • RePEc is a repository for research in Economics and related sciences. Researchers can deposit working papers, journal articles, books, book chapters, and software components.