DataSHIELD is an R library that enables the remote and non-disclosive analysis of sensitive research data.

DataSHIELD was born of the requirement in the biomedical and social sciences to co-analyse individual patient data (microdata) from different sources without disclosing identity or sensitive information. Under DataSHIELD, raw data never leave the data provider, and the researcher cannot see any microdata or disclosive information. The analysis is taken to the data, not the data to the analysis. DataSHIELD provides a flexible, modular, open-source solution that is well placed to grow a broad user and development community.
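
The idea of taking the analysis to the data can be illustrated with a minimal sketch. This is not DataSHIELD's API and is written in Python purely for illustration; the names `SiteSummary`, `site_summary`, `pooled_mean` and the threshold `MIN_COUNT` are hypothetical. Each data provider releases only a count and a sum, and the researcher combines these aggregates into a pooled mean without ever seeing an individual record.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical disclosure threshold, chosen only for this illustration.
MIN_COUNT = 5


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics a site releases; contains no individual rows."""
    n: int
    total: float


def site_summary(values: Sequence[float], min_count: int = MIN_COUNT) -> SiteSummary:
    """Runs at the data provider: returns only a count and a sum."""
    if len(values) < min_count:
        # Refuse to answer when the group is small enough to risk disclosure.
        raise ValueError("too few records to release a summary safely")
    return SiteSummary(n=len(values), total=float(sum(values)))


def pooled_mean(summaries: List[SiteSummary]) -> float:
    """Runs at the researcher's side: combines site-level aggregates."""
    total_n = sum(s.n for s in summaries)
    return sum(s.total for s in summaries) / total_n


# Two simulated sites; the raw lists stand in for data that never leave a site.
site_a = site_summary([1.0, 2.0, 3.0, 4.0, 5.0])
site_b = site_summary([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
print(pooled_mean([site_a, site_b]))
```

The pooled result equals the mean that would be obtained from the combined records, yet only two numbers per site cross the boundary between data provider and researcher.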

DataSHIELD circumvents key obstacles preventing or limiting the open analysis of digital datasets. Irrespective of discipline, data access and analysis barriers result from a range of considerations:

  • ethical-legal restrictions surrounding confidentiality and the sharing of, or access to, disclosive data;
  • implications of extensive professional investment, intellectual property issues or licensing conditions constraining unconditional access to raw data;
  • the physical size of the data, which can itself be a limiting factor.