University of Edinburgh Data Safe Haven: a new facility for sensitive data

Information Services has implemented a remote-access “Safe Haven” environment to protect data confidentiality, satisfy concerns about data loss and reassure Data Controllers about the University’s secure management and processing of their data in compliance with Data Protection Legislation.

The Data Safe Haven (DSH) provides a secure storage space and a secure analytic environment that is appropriate for all research projects working with different kinds of sensitive data. It has its own firewall and is isolated from the University network. It is located in a secure facility with controlled access. All traffic between the DSH and the user’s computer is encrypted and no internet access is available. Access is restricted to authorised users, who connect with an assigned ‘Yubikey’ and the secure VMware Horizon Client, and is only possible from managed desktops that have been whitelisted for access to the DSH.

A range of analytic and supporting applications (e.g., SPSS, STATA, SAS, MATLAB, and R) is available. These are delivered dynamically and are assigned to the project. The applications available to users will depend on the arrangement made with the DSH technical team prior to project registration and on the licensing arrangements with the software provider.

The initial security review (penetration test) of the DSH was carried out by a CREST-accredited organisation in August 2018. The DSH exhibited an overall good security stance and demonstrated resilience against the various types of tests performed by the consultants. This initial review formed part of our ongoing drive towards ISO 27001 certification. We expect to complete this phase of the project and obtain the certificate by November 2019.

We successfully closed the pilot phase of the DSH, which involved five projects, in October 2018, and soft-launched the service at our “Dealing with Data” conference in November 2018. At present, the DSH Technical team is migrating the Centre for Clinical Brain Sciences – National CJD Research and Surveillance project data from the walled garden into the DSH.

The DSH operates on a cost recovery basis and this cost should be included in grant applications. We welcome enquiries from researchers as early as possible in their project planning. Costing is based on bespoke project requirements (see DSH Overview for users at https://www.ed.ac.uk/is/data-safe-haven).

The DSH Operations team also provides:

  • advice and input for funding and permissions applications;
  • guidance on meeting Approved Researcher requirements;
  • advice about meeting data sharing requirements and archiving of data.

On request, we can set up a demo environment so that researchers can explore the use of the DSH for their projects. If you need further information, please contact the RDS Team via data-support@ed.ac.uk.

Cuna Ekmekcioglu, Data Safe Haven Manager, Research Data Service

Updates from the fourth meeting of the RDM Forum

Guest blog post by Ewa Lipinska

On 28th August members of the RDM Forum gathered in the stunning Old Library at the Department of Geography in the Old Infirmary building, to hear the latest updates from the Research Data Service team and discuss all things data. It’d been a good few months since the last time we met, so the event presented us with the perfect opportunity to catch up on new developments, network with colleagues working on RDM in different parts of the University, and prepare ourselves for the new academic year which will see the University take up a pivotal role in making Edinburgh the Data Capital of Europe.

We started off with an RDM update from Cuna Ekmekcioglu, who gave us an overview of developments to University research data services: the launch of the interim DataVault long-term retention service, the continuing development of the Data Safe Haven aimed at research projects dealing with sensitive data, and a new release of DataShare which will allow larger datasets. We also learned about RDM training courses planned for the new academic year, most of which can be booked via MyEd.

Next, Pauline Ward gave a presentation which went into a bit more detail about the DataVault service, which allows researchers to comply with their funders’ requirements to preserve data for the long term in cases where the datasets cannot be made public. The current interim service requires a mediated deposit, which can be arranged by contacting data-support[at]ed.ac.uk. Comprehensive guidance on how to prepare your data before storing it in DataVault can be found on the service website.

This was followed by a demonstration of the new Research Data Service promotional video, which outlines the range of tools and support offered by the team and can be a very good resource for new members of staff who would like to find out about the types of services available. Diarmuid McDonnell, who presented the video, also gave us a quick overview of a recent project called Scoping Statistical Analysis Support, which looked at the demand for statistical analysis training for current postgraduate students. The final project report is full of current information about statistical training around the University.

We then went on to discuss the potential impact of data sharing, which tied in nicely with a recent panel discussion at Repository Fringe 2017 that focused on how repositories and associated services can feature in supporting researchers to achieve and evidence impact in preparation for the next Research Excellence Framework exercise (live notes from the day are available). Pauline Ward presented examples of popular public datasets by Edinburgh University researchers, described ways to access information about their usage, and talked about how datasets can be shared more widely to engage external audiences, which may lead to potential impact. Even though on their own research data usage statistics are not enough to demonstrate significant impact beyond academia, they are a good (though perhaps still slightly overlooked) starting point for tracking how and by whom datasets are used, and how that benefits individuals and communities.

The meeting concluded with a presentation by Robin Rice, who shared with us the draft Research Data Service Roadmap. As the goals set out in the previous roadmap have now largely been achieved, the time has come to look to the future and identify new objectives for the next few years. It was interesting to hear about the team’s long-term plans which include unification of the service (aiming to ensure the best user experience and interoperability between systems), advocacy of data management planning, support around active data, enhanced data stewardship, improved communications and more training opportunities.

Overall, it was a very useful and informative meeting, and I’d very much encourage anyone interested in research data management and sharing to join us next time. In the meantime Cuna’s slides, together with lots of other useful resources and points for discussion, are available on the RDM Sharepoint (access on request).

Ewa Lipinska
Research Outcomes Co-Ordinator
College of Arts, Humanities and Social Sciences

Highlights from the RDM Programme Progress Report: February to April 2016

The membership of the Research Data Service Virtual Team, drawn from four divisions of IS, was confirmed. The team met for the first time on 11 February (replacing the former action group meetings), where it was agreed that meetings would be held approximately every six weeks for information and decision-making.

In February, the DataShare metadata was mapped to the PURE metadata, and staff in L&UC and Data Library trained each other to create dataset records in PURE and review submissions in DataShare. It was agreed that staff would create records in PURE for items deposited in DataShare until the company (Elsevier) provides a mechanism for automatically inputting records into PURE.

In March, Jisc announced that the University of Edinburgh was selected as a framework supplier for their new Research Data Management Shared Service.

A review of the existing ethics processes in each college is in progress with Jacqueline McMahon at the College of Arts, Humanities and Social Sciences (CAHSS) to create a University-wide ethics template. There is also engagement with the School ethics committees at the School of Health in Social Sciences (HiSS), Moray House School of Education (MHSE), Law and School of Social and Political Science (SPS) in CAHSS.

The Research Data Management and Sharing (RDMS) Coursera MOOC opened for enrolment on 1 March 2016. The course was completed in partnership with the University of North Carolina-Chapel Hill CRADLE project. Statistics from the Coursera Dashboard show that, as of 23 May 2016, there had been 5,429 visitors and 1,526 active learners, and 335 visitors had completed the course.

The large data sharing investigation was completed for DataShare and reported previously. Two new DataShare releases have been defined, one for upload and one for download. The upload release (2.1) is due to go live on 23 May 2016.

PURE dataset functionality is now included in standard PURE and Research Data Management (RDM) training. There are now 210 dataset records in PURE.

Four PhD interns were hired in mid-March to act as College representatives for the IS Innovation Fund Pioneering Research Data Exhibition. They will be employed until mid-December 2016.

A total of 363 staff and postgraduates attended RDM courses and workshops during this quarter.

There were 30 new DMPonline users and 55 new plans created during this quarter.

A total of 56 datasets were deposited in DataShare during this quarter.

The total number of DataStore users rose from 12,948 in the previous quarter to 13,239 in this quarter, an increase of 291 new users.

National and International Engagement Activities

In February

  • Stuart Lewis gave a DataVault presentation at the International Digital Curation Conference (IDCC) in Amsterdam.

In March

  • A University news item was released to mark the launch of the Research Data Management and Sharing (RDMS) MOOC on Coursera. http://www.ed.ac.uk/news/2016/dataskills-010316
  • Stuart MacDonald gave an RDM presentation to trainee physicians at the Royal College of Physicians Edinburgh Course: Critical appraisal and research for trainees, Edinburgh. http://www.slideshare.net/smacdon2/rdm-for-trainee-physicians
  • Three delegates from Göttingen University, who have shared interests in RDM, were hosted here. They visited to gain more insight into our RDM support and experiences.
  • Robin Rice gave an invited talk about the RDMS MOOC and the web-based Survey Documentation and Analysis (SDA) tool to the Learning, Teaching and Web and elearning@Ed Showcase and Network monthly gathering.

In April

As part of my responsibilities in covering Kerry Miller’s one-year maternity leave, I will be writing blog posts for this page until Kerry returns next summer.

Prior to this post, I worked for 12 years as the geospatial metadata co-ordinator at EDINA. My primary role was to promote and support research data management and sharing amongst UK researchers and students using spatial data and geographical information.

Tony Mathys
Research Data Management Service Co-ordinator


Highlights from the RDM Programme Progress Report: November 2015 – January 2016

Data Seal of Approval have awarded DataShare Trusted Repository status; their assessment of our service can be read at https://assessment.datasealofapproval.org/assessment_175/seal/html/. In addition, a major new release of DataShare was completed in November. It makes the code open on GitHub and brings general improvements to the look and feel of the website.

The ‘interim’ DataVault is now in final testing and will be rolled out on a request basis to those researchers who can demonstrate an urgent need to use the service now rather than waiting until the final version is ready later this year. The phase three funding for development of the DataVault has been received from Jisc. It runs from March to August, so the final version should be ready for launch sometime after that. The project was presented at the International Digital Curation Conference in February 2016.

Over the three-month period a total of 328 staff and postgraduate researchers attended a Research Data Management (RDM) course or workshop.

Work on the MANTRA MOOC (Massive Open Online Course) was expected to be finalised in February and launched on 1st March, at the following URL: https://www.coursera.org/learn/data-management.

The University of Edinburgh wrote the Working with Data section (one of the five weeks of the course) and, with the help of the Learning, Teaching and Web division of Information Services, completed two video interviews with researchers and a ‘vox pop’ video clip of clinical researchers at the EQUATOR conference in Edinburgh in autumn 2015. The content is open source and videos can be added to our YouTube channel to help with promotion. There will be some income from this, based on certificates of completion priced at $49 or £33, but our portion will be smaller than that of our partner, the University of North Carolina.

The need to create a dataset record in PURE for each dataset published, or referenced in a publication, is now being emphasised in all Research Data Service communications, formal and informal, and to staff at all levels. Uptake is understandably low at this point but we hope to see a steady increase as researchers and support staff begin to see the benefits of adding datasets to their research profile. In the case of DataShare records, a draft mapping of fields between DataShare and PURE has been produced as the start of a plan for migrating records from DataShare to PURE.

By the end of January 2016, 69 records had been created and published on Edinburgh Research Explorer.

Four interns have been employed using funding from Jisc as part of the UK Research Data Discovery Service (UKRDDS) project, which aims to create a national aggregate register of data sets. A trial site is available at: http://ckan.data.alpha.jisc.ac.uk/. The UKRDDS interns will help to create PURE records and upload open data into DataShare, and raise awareness of RDM generally within their schools. Three PhD interns are currently in place in LLC, SOS, and Roslin; two more, in LLC and DIPM, will start in February. The approach each intern takes will depend on the nature and structure of their school and will, in some cases, be mediated by research administrators.

An innovation fund grant has been received to fund the delivery of an exhibition, “Pioneering Research Data”. Each college will be represented by a PhD intern. Recruitment has already begun and the interns should be in post by the end of March. The exhibition is due to be delivered in November of this year.

National and International Engagement Activities

Robin Rice led a panel at the IPRES conference, Chapel Hill, North Carolina, on 3rd November called ‘“Good, better, best”? Examining the range and rationales of institutional data curation practices’.

Robin Rice had a proposal accepted for the forthcoming Force11 (2016) conference, on Overcoming Obstacles to Sharing Data about Human Subjects, building on the training course we are delivering, Working with Personal and Sensitive Data.

Kerry Miller
RDM Service Coordinator