Research Data Workshops: Sensitive Data Challenges and Solutions

This workshop at the Bioquarter was attended by 27 research staff representing all three colleges, with a majority from Medicine and Veterinary Medicine. It began with an introductory presentation from Robin Rice covering the new Data Safe Haven facility of the Research Data Service, and was followed by brief presentations from Lynne Forrest (Research Support Officer on the Scottish Longitudinal Study); Fiona Strachan (Clinical Research Manager, Centre for Cardiovascular Science); and Jonathan Crook (Professor of Business Economics). Each speaker shared their experiences of both conducting research using sensitive data and supporting other researchers. Although they work with very different types of data, it was easy to identify certain common requirements:

  • Easy access to secure data storage and analysis platforms;
  • Consistent & comprehensive training and guidance on working with sensitive data;
  • Support to meet the necessary requirements to gain access to the data they need.

In the discussion groups that followed, participants were asked about their experiences of working with sensitive data, the requirements they needed services such as data safe havens to fulfil, and the ramifications of the cost recovery model with regard to including costs in grant proposals.

The major themes that emerged were training, data governance, and the cost of protecting sensitive data. There was a strong feeling that more and better training was required for all those working with sensitive data. There was also confusion about the number, location, and criteria of the different Data Safe Havens now available, with no single place to find clear information on these.

When talking specifically about the Data Safe Haven offered by IS for UoE researchers, the biggest concern was around cost. The standard price was considered high for the majority of grants, which are either small or need to be highly competitive. In some disciplines grant funding is not common, so it is unclear how the costs could be met. The Research Data Service representatives encouraged people to get a bespoke quote and discuss requirements with the team as early as possible, as flexibility on both cost and build specifications (e.g. high performance computing) is built in.

Some specific points arising from the discussions were:

  • One negative experience of working with sensitive data is the length of time needed to get data approvals (e.g. from NHS bodies). Participants wondered if the University could help to speed those up.
  • More training in sensitive data management was desired, along with better ways to structure training for students.
  • Learning outcomes need to focus on changing behaviour, with an emphasis on local procedures.
  • One participant felt that schools need a researcher portfolio system, some way of keeping track of who has what data. A suggestion was made to have an asset manager in the University, similar to the one in the NHS.
  • Less than optimal security practices can be observed, such as leaving a clinical notebook in a coffee room. More training is needed, but this is not fully covered in either clinical practice courses or ethics courses.
  • There were concerns around data governance: how to set up gatekeepers for research projects using the Data Safe Haven, and how long to store things in the DataVault. ACCORD was pointed to as having a good data governance structure.
  • Long-running projects (e.g. ten years) would have trouble meeting the annual costs.
  • Projects are invested in locally run services and expertise, so value-added centralised services need to be low-cost.

Overall, researchers were in favour of having a Data Safe Haven available for projects that need it, but they would also like support to correctly anonymise and manage their data so that they could continue to use standard data storage and analysis platforms. This would mean that only those with the most sensitive data would need to rely upon the UoE DSH to conduct their research.

Those with a University log-in may read the full set of notes on the RDM wiki.

Kerry Miller
Research Data Support Officer
Library & University Collections

University of Edinburgh Data Safe Haven: a new facility for sensitive data

Information Services has implemented a remote-access “Safe Haven” environment to protect data confidentiality, satisfy concerns about data loss and reassure Data Controllers about the University’s secure management and processing of their data in compliance with Data Protection Legislation.

The Data Safe Haven (DSH) provides a secure storage space and a secure analytic environment that is appropriate for all research projects working with different kinds of sensitive data. It has its own firewall and is isolated from the University network. It is located in a secure facility with controlled access. All traffic between the DSH and the user’s computer is encrypted and no internet access is available. Access to the DSH is restricted to authorised users via an assigned ‘Yubikey’ and the secure VMware Horizon Client, and is available only from managed desktops that are whitelisted for access to the DSH.

A range of analytic and supporting applications (e.g., SPSS, STATA, SAS, MATLAB, and R) is available. These are delivered dynamically and are assigned to the project. The applications available to users will depend on the arrangement made with the DSH technical team prior to project registration and on the licensing arrangements with the software provider.

The initial DSH security review (penetration test) was carried out by a CREST-accredited organisation in August 2018. The DSH exhibited a good overall security stance and demonstrated resilience against the various types of tests performed by the consultants. This initial review formed part of our ongoing drive towards ISO 27001 certification. We expect to complete this phase of the project and obtain the certificate by November 2019.

We successfully closed the pilot phase of the DSH, which involved five projects, in October 2018, and soft-launched the service at our “Dealing with Data” conference in November 2018. At present, the DSH Technical team is migrating Centre for Clinical Brain Sciences – National CJD Research and Surveillance project data from the walled garden into the DSH.

The DSH operates on a cost recovery basis and this cost should be included in grant applications. We welcome enquiries from researchers as early as possible in their project planning. Costing is based on bespoke project requirements (see DSH Overview for users at https://www.ed.ac.uk/is/data-safe-haven).

The DSH Operations team also provides:

  • advice and input for funding and permissions applications;
  • guidance on meeting Approved Researcher requirements;
  • advice about meeting data sharing requirements and archiving of data.

On request, we can set up a demo environment so that researchers can explore the use of the DSH for their projects. If you need further information, please contact the RDS Team via data-support@ed.ac.uk.

Cuna Ekmekcioglu, Data Safe Haven Manager, Research Data Service