Collaborating on data in a modern way

Between mid-September and mid-October, the Research Data Support team hosted an international visitor. Dr Tamar Israeli, a librarian from Western Galilee College in Israel, spent four weeks in Edinburgh to increase her experience and understanding around research data management. As part of this visit, Tamar conducted a study into our researchers’ collaborative requirements, and how well our existing tools and services meet their needs. Tamar’s PhD thesis was on the topic of file sharing, and she has recently published another study on information loss in Behaviour & Information Technology: “Losing information is like losing an arm: employee reactions to data loss” (2019). Tamar is also a representative of the Israeli colleges on the University Libraries’ Research Support Committee.

Tamar carried out a small-scale study in order to gain a better understanding of the tools that researchers use to collaborate around data, and to explore the barriers and difficulties that prevent researchers from using institutional tools and services. Six semi-structured interviews were conducted with researchers from the University of Edinburgh, representing different schools, all of whom collaborate with other researchers on a regular basis on either small- or large-scale projects. She found that participants use many different tools, both institutional and commercial, to collaborate, share, analyse and transfer documents and data files. Decisions about which tools to use are based on data types, data size, usability, network effect and whether their collaborators are in the same institution and country. Researchers tend to use institutional tools only if they are very simple and user friendly, if there is a special requirement for this from funders or principal investigators (PIs), or if it is directly beneficial for them from a data analysis perspective; sharing beyond the immediate collaboration is only a secondary concern. Researchers are generally well aware of the need to keep their data where it will be safe and backed-up, and are not concerned about the risk of data loss. A major issue was the need for tools that answer projects’ particular needs; customisability and scope for interlinking with other systems are therefore very important.

We’d like to thank Tamar for the great work she did, and for the beautiful olive oil and pistachios that she brought with her! Tamar’s findings will key into our ongoing plans for the next phase of the Research Data Service’s continual development, helping us assist researchers to share and work on their data collaboratively, within and beyond the University’s walls.

Martin Donnelly
Research Data Support Manager
Library and University Collections

Research Data Workshops: DataVault Summary

Having soft-launched the DataVault facility in early 2019, the Research Data Support team – with the support of the project board – held five workshops in different colleges and locations to find out what the user community thought about it. This post summarises what we learned from participants, who were made up roughly equally of researchers (mainly staff) and support professionals (mainly computing officers based in the Schools and Colleges).

Each workshop began with presentations and a demonstration by Research Data Service staff, explaining the rationale of the DataVault, what it should and should not be used for, how it works, how the University will handle long-term management of data assets deposited in the DataVault, and practicalities such as how to recover costs through grant proposals or get assistance to deposit.

After a networking lunch we held discussion groups, covering topics such as prioritisation of features and functionality, roles such as the university as data asset owner, and the nature of the costs (price).

The team was relieved to learn that the majority (albeit from a somewhat self-selecting sample) agreed that the service fulfilled a real need; some data does need to be kept securely for a named period to comply with research funders’ rules, and participants welcomed a centralised platform to do this. The levels of usability and functionality we have managed to reach so far were met with somewhat less approval: clearly the development team has more work to do, and we are glad to have won further funding from the Digital Research Services programme in 2019-2020 in order to do it.

Attitudes toward university ownership of data assets were also a mixed bag; some were sceptical and wondered if researchers would participate in such a scheme, but others found it a realistic option for dealing with staff turnover and the inevitability of data outlasting data owners. Attitudes toward cost were largely accepting (the DataVault provides a cheaper alternative to our baseline DataStore disk storage), but concerns about the safekeeping of legacy and unfunded research data were raised at each workshop.

A sample of points raised follows:

  • Utility? “Everyone I know has everything on OneDrive.”
  • Regarding prioritisation of features – security first; file integrity first; putting data from other sources than DataStore; facilitating larger deposit sizes; ease of use.
  • Quickness of deposit and retrieval? Quick deposit was deemed more important than quick retrieval.
  • University as data asset owner?
    • Under GDPR the data are already university assets (because the Uni is the data controller).
    • People who manage the data should be close to the research; IT people can manage users but shouldn’t be making decisions about data. Danger that because it’s related to IT it gets dumped on IT officers. The formal review process helps to ensure decisions will be made properly. Include flexibility into the review hierarchy to allow for variation in school infrastructure.
    • When I heard that I was – not shocked – but concerned. If I move to another university how do I get access? This might be a problem. Researchers might prefer to retain three copies themselves.
  • Is the cost recovery mechanism valid?
    • Vault costs are legitimate costs.
    • Ideally should come from grant overheads, until then need to charge.
    • Possible to charge for small/medium/large project at start rather than per TB?
  • Is the 100 GB threshold sufficient for unfunded research? How else could unfunded or legacy data be covered (who pays)?
    • Alumni sponsor a dataset scheme?
    • There will be people with a ‘whole bunch of data somewhere’ that would be more appropriately stored in DataVault.

The team is grateful to all of the workshop participants for their time and thoughts; the report will be considered further by the project board and the Research Data Service Steering Group members. The full set of workshop notes is colour-coded to show comments from different venues and is available to read on the RDM wiki, for anyone with a University log-in (EASE).


Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections

Dealing with Data 2019 – Registration now open!

Collaboration Across the Nations: Managing, sharing and securing research data across space and time

UPDATE – DwD 2019 Postponed

Due to the strike action scheduled for the 27th of November we have decided to postpone DwD2019. This was not a decision we took lightly, but we felt it was for the best as we did not wish to put anyone in the uncomfortable position of feeling they had to either cross the picket line or not attend DwD2019. DwD2019 has been provisionally rebooked for the 15th of January 2020; any resulting changes to the programme or other details will be added here as and when they are confirmed.

Dealing with Data 2019 will take place from 09:30 – 16:15 on the 15th January 2020 in the Informatics Forum. This year our theme is “Collaboration Across the Nations: Managing, sharing and securing research data across space and time” and we are now inviting all staff and post-graduate students at the University of Edinburgh to register for this event.

Collaboration is vitally important to academic and commercial research in all areas as it enables the pooling of resources to answer increasingly complex or interdisciplinary research questions.

The effective collection, processing, and sharing of research data is integral to successful collaborations, but it can also present many challenges. In particular the practicalities of co-ordinating data management across large multi-centre collaborations, sharing large data, or handling sensitive data, can present difficulties if not planned for appropriately.

Dealing with Data 2019 is your opportunity to hear from, and network with, other members of the UoE research community about how they have addressed these issues to build successful collaborations, or the lessons they have learned which will enable them to be more successful in the future.

In previous years DwD has attracted over 100 attendees from across the university to hear contributions by research staff and students at all stages of their careers and from diverse disciplines. You can view the presentations from 2017 & 2018 now on MediaHopper (https://media.ed.ac.uk/channel/Dealing+With+Data+2017+Conference/82256222)

Conference Programme – Dealing_with_Data_2019_Programme_V1.2

If you have any questions please get in touch using dealing-with-data-conference@mlist.is.ed.ac.uk

Dealing with Data is an annual event sponsored and organised by the Research Data Service to provide a forum for University of Edinburgh researchers to discuss how they are benefiting from, or experiencing struggles with, the fast-changing research data environment.

Kerry Miller

Research Data Support Officer

Research Data Workshops: Sensitive Data Challenges and Solutions

This workshop at the Bioquarter was attended by 27 research staff representing all three colleges, with a majority from Medicine and Veterinary Medicine. It began with an introductory presentation from Robin Rice covering the new Data Safe Haven facility of the Research Data Service and was followed by brief presentations from Lynne Forrest (Research Support Officer on Scottish Longitudinal Study); Fiona Strachan (Clinical Research Manager, Centre for Cardiovascular Science); and Jonathan Crook (Professor of Business Economics). Each speaker shared their experiences of both conducting research using sensitive data and supporting other researchers. Although they work with very different types of data it was easy to identify certain common requirements:

  • Easy access to secure data storage and analysis platforms;
  • Consistent & comprehensive training and guidance on working with sensitive data;
  • Support to meet the necessary requirements to gain access to the data they need.

In the discussion groups that followed, participants were asked about their experiences working with sensitive data, the requirements researchers needed services such as data safe havens to fulfil, and ramifications of the cost recovery model, with regard to including costs in grant proposals.

The major themes that emerged were concerns around training, data governance, and meeting the costs of protecting sensitive data. There was a strong feeling that more and better training was required for all those working with sensitive data. There was also confusion about the number, location, and criteria of different Data Safe Havens now available, and no single place to find clear information on these.

When talking specifically about the Data Safe Haven offered by IS for UoE researchers, the biggest concern was around cost. The standard price was considered high for the majority of grants, which are either small or need to be highly competitive. In some disciplines grant funding is not common and so it is unclear how the costs would be able to be met. The Research Data Service representatives encouraged people to get a bespoke quote and discuss requirements with the team as early as possible, as flexibility on both cost and build specifications (e.g. high performance computing) is built-in.

Some specific points arising from the discussions were:

  • One negative experience about working with sensitive data is the length of time needed to get data approvals (e.g. from NHS bodies). Participants wondered if the University could help to speed those up.
  • More training was desired in sensitive data management and better ways to structure training for students.
  • Learning outcomes need to focus on change of behaviour; with focus on local procedures.
  • One participant felt that schools need a researcher portfolio system, some way of keeping track of who has what data. A suggestion was made to have an asset manager in the university, similar to the one in NHS.
  • Less than optimal security practices can be observed, such as leaving a clinical notebook in a coffee room. More training is needed but this is not fully covered in either clinical practice courses or ethics.
  • There were concerns around data governance – how to set up gatekeepers for research projects using Data Safe Haven, how long to store things in the DataVault. ACCORD was pointed to for having good structure in data governance.
  • Long-running projects (e.g. ten years) would have trouble meeting the annual costs.
  • Projects are invested in locally run services and expertise; added-value centralised services need to be low-cost.

Overall researchers were in favour of having a Data Safe Haven available for projects that need it, but they would also like to have support to correctly anonymise and manage their data so that they could continue to use standard data storage and analysis platforms. This would mean that only those with the most sensitive of data would need to rely upon the UoE DSH to conduct their research.

Those with a University log-in may read the full set of notes on the RDM wiki.

Kerry Miller
Research Data Support Officer
Library & University Collections