Data Management

What is Data Management?

Most of us have experienced the headache that comes with mismanaged data. Maybe your computer crashes before you have backed up the paper you are writing, or you misplace a laboratory notebook, or you spend hours digging through disorganized files to find a document from several years ago that you never expected to use again. Data management is the practice of taking care of your research data so that it is easier to use, easier to recover and easier to share. Because research data is valuable, many grant funders now require researchers to have a data management plan in place to protect their investment in the research.

Data management consists of practices that occur at all phases of the research cycle: planning for data management before the project begins; documenting, organizing and securing data during the project; and archiving data after the research is completed. Good data management makes it possible to recover and share data for future research, completing the “data lifecycle.”

Best Practices and Guidelines

While most research projects do not require a formal data management plan, implementing some basic data management principles can save you time down the road and may even improve the quality of your research.

Documentation and Description

  • When beginning a project, create a file that describes the personnel, funding sources, methods or techniques, software and references used in the research.
  • Keep raw data separate from derived or analyzed data, and describe the method you used for the analysis.
  • Use metadata (“data about data”) to explain your data set, perhaps in a “readme” text file or a database. Some disciplines have established metadata standards.
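
As an illustration of the "readme" approach, the following minimal Python sketch writes a plain-text description file for a project. The function name write_readme, the field labels and the file name README.txt are assumptions made for this example, not an established metadata standard; if your discipline has a standard, follow it instead.

```python
from pathlib import Path


def write_readme(project_dir, title, personnel, funding, methods, software, references):
    """Write a plain-text README describing a project and return its path.

    All field names here are illustrative, not a formal metadata standard.
    """
    lines = [
        f"Project: {title}",
        "Personnel: " + "; ".join(personnel),
        "Funding: " + "; ".join(funding),
        f"Methods: {methods}",
        "Software: " + "; ".join(software),
        "References: " + "; ".join(references),
    ]
    path = Path(project_dir) / "README.txt"
    # Plain UTF-8 text keeps the description readable without special software.
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path
```

Calling write_readme once at the start of a project, and updating the file when personnel or methods change, keeps the description next to the data it explains.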

Organization

  • Choose a consistent filing scheme (folder structure and naming convention) that would make sense to someone else looking at your data five years from now.
  • Assign file names that describe relevant and meaningful aspects of your study. For example, name your file “DOLInterview_DoeJane_20061207” rather than “myData”.
  • Use capital letters or underscores between words, rather than spaces.
  • Do not rely on the directory hierarchy to provide critical information about the file contents, since context will be lost when files are copied elsewhere.
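
The naming advice above can be applied consistently with a small helper. This is only a sketch: the function descriptive_name and its parameters are hypothetical, and the pattern (project, then surname and given name, then a YYYYMMDD date, joined by underscores) is inferred from the single example file name given above.

```python
import re
from datetime import date


def descriptive_name(project, surname, given_name, when, extension=""):
    """Build a file name such as Project_SurnameGiven_YYYYMMDD.

    Spaces and punctuation are stripped from each part so that only
    underscores separate the parts. Note that this simple pattern also
    strips non-ASCII letters; adjust it if your names need them.
    """
    parts = [project, surname + given_name, when.strftime("%Y%m%d")]
    cleaned = [re.sub(r"[^A-Za-z0-9]", "", part) for part in parts]
    return "_".join(cleaned) + extension
```

For example, descriptive_name("DOLInterview", "Doe", "Jane", date(2006, 12, 7)) returns the file name used in the example above, and passing extension=".txt" appends the file type.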

Storage and Backup

  • Back up your data frequently – automatically if possible. Follow the “Rule of 3”: keep a working copy, a second local copy (e.g., on an external hard drive), and a remote copy (e.g., in cloud storage).
  • Use flash drives only for transferring data, not for storing it, since they are easily lost.
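
One possible way to automate the "Rule of 3" is sketched below: the working file is copied to each backup location, and a checksum confirms that every copy matches the original. The function names and the idea of treating the remote copy as a locally synced folder are assumptions for illustration; uploading to a particular cloud service is not covered here.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(working_file, backup_dirs):
    """Copy working_file into each directory and verify every copy.

    backup_dirs might hold, for example, an external drive folder and a
    folder that a cloud client synchronizes (both are assumptions).
    """
    working_file = Path(working_file)
    expected = sha256_of(working_file)
    copies = []
    for directory in backup_dirs:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / working_file.name
        shutil.copy2(working_file, target)  # copy2 also keeps timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Backup copy differs from original: {target}")
        copies.append(target)
    return copies
```

Running a script like this on a schedule (for instance with your operating system's task scheduler) is one way to make backups automatic rather than relying on memory.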

Preservation

  • For long-term storage and preservation, use a managed database or archive.
  • Convert data to open, stable formats (ascii, txt, csv, pdf) instead of proprietary formats (xls, doc, psd).
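
Before archiving, it can help to find the files that still need conversion. The sketch below lists files whose extensions match the proprietary formats named above; the function name find_proprietary and the choice to match on file extension alone are assumptions, and the set of extensions should be extended to suit your own project.

```python
from pathlib import Path

# Extensions taken from the proprietary examples above; extend as needed.
PROPRIETARY = {".xls", ".doc", ".psd"}


def find_proprietary(root):
    """Return a sorted list of files under root with a proprietary extension."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in PROPRIETARY
    )
```

The function only reports files; the conversion itself is usually done by exporting from the original software, which also lets you check that nothing was lost in the new format.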

Writing a Data Management Plan

Some projects may call for a formal data management plan (DMP). This may be true if your granting agency requires such a plan, if you plan to publicly share your data, or if your project will involve large, complex data sets. The following resources give some guidance on creating a formal DMP:

Other Links and Resources