Dealing with Data 2019 (January 2020): Collaboration Across the Nations

Picture the scene: A cold January day, the wind whipping the scarves of the passers-by outside the large windows of the Informatics Forum meeting room. The group inside listens, takes notes, tweets, and asks questions of the speakers, representing a range of disciplines across the University…

Dealing with Data is an annual event hosted by the Research Data Service. Its aim is to engage the University community of researchers and support professionals around a theme, to share success stories and challenges in the myriad, everyday issues involved with data-driven research. The theme this year reflected the difficulty of managing research data in large, collaborative projects. Due to industrial action, the original November event was postponed to January. Around a hundred researchers – staff and students – participated, along with support staff who gave lightning talks about research-focused services. Full presentations and videos are now available.

Benjamin Bach, our keynote speaker, inspired us with state-of-the-art data visualisation software and techniques for both exploration and presentation. But he also illustrated the difficulties of portraying every facet of a rich dataset, and the consequences for its interpretation of the choices that must be made.
The first session began with Tamar Israeli’s study of researchers’ use of collaborative and institutional tools, which showed the challenges of making local infrastructure user-friendly enough to attract new users familiar with slick cloud-based services. Then Mark Lawson demonstrated his ingenious ‘ethical hacking’ to piece together a set of APIs to create a research workflow for samples and images for histology research. Minhong Wang conveyed a higher-level view of data management focused not just on data-driven, but knowledge-driven phenotyping.

Next were the lively lightning talks, in which Mike Wallis of Research Services warned of a new Digital Dark Age, and David Creighton-Offord spoke of the dilemmas in Information Security user support where shiny doesn’t always equal safe. Lisa Otty spoke of innovative training and text mining projects bringing data science to the Humanities, and Rory MacNeil demonstrated how the RSpace electronic lab notebook can connect to a host of popular open science tools.

Following a lively lunch with chat between delegates and with hosts of the service exhibitions, Alex Hutchison showed a highly programmatic view of data management and ethics control from the UNICEF collaboration, in collecting and analysing real world data about children in need. Cailean Gallagher offered a case study of how food courier data could be used to empower workers. Sanja Badanjak shared the data integration problems she faced in working with peace agreements from around the world, conveying both innovative solutions and time-consuming workarounds.

In the final session Edward Wallace brought in the Edinburgh Carpentries to the rescue of poor data skills within Biological Sciences and the wider University – itself a great example of cross-community collaboration building a community of trainers. Gillian Raab showed us how any data problem, however intractable, can be solved by resourcefulness and determination, making use of DataSHIELD for multi-party computation when datasets are too sensitive to be shared. Johnny Hay and Tomasz Zieliński demo’d their Plasmo ‘boutique repository’ for plant-systems biology modelling, and Holly Tibble described tackling an international collaboration in linking administrative datasets via ‘ridiculously detailed’ statistical analysis plans. Representing the Research Data Service, I wrapped up proceedings with some of these very observations.
Both presentations and videos are available.

Welcome

  • Jeremy Upton, Director of Library and University Collections. [Presentation]

Keynote

  • Data Visualization for Exploration and Presentation, Prof. Benjamin Bach. Lecturer in Design Informatics and Visualization. [Presentation] [Slides]

Session 1 – Chair: Theo Andrew

  • “Data Something”: Assessing Tools, Services and Barriers for Research Data Collaboration at the University of Edinburgh – a small-scale study carried out by Dr Tamar Israeli with support from the Research Data Support team. Robin Rice – Data Librarian & Head of Research Data Support Services. [Presentation] [Slides]
  • Integrated secure web application to deliver centralised management of research samples, histology services and imaging data. Mark Lawson, Data & Project Manager, MRC Centre for Reproductive Health, QMRI. [Presentation] [Slides]
  • Building the Knowledge Graph for UK Health Data Science. Minhong Wang et al., Deanery of Molecular, Genetic and Population Health Sciences. [Presentation] [Slides]

Session 2 – Chair: Kerry Miller

  • The Data Opportunities & Challenges when Collaborating across Organisations
    Alex Hutchison, Delivery Director – Data for Children Collaborative with UNICEF. [Presentation] [Slides]
  • Restoring Gig Workers to Power: Personal Data Portability, Supply of Digital Content and Free Flow of Data in the European Data Economy. Cailean Gallagher, Scottish Trades Union Congress, & St Andrews University Institute of Intellectual History. [Presentation] [Slides]
  • Dealing with data in peace and conflict research. Sanja Badanjak, Postdoctoral Research Fellow, School of Law. [Presentation] [Slides]

Session 3 – Chair: Robin Rice

  • Bringing researchers to data: computing skills training with Edinburgh Carpentries.
    Edward Wallace, Sir Henry Dale Fellow, Institute of Cell Biology. [Presentation] [Slides]
  • Running an analysis of combined data when the individual records cannot be combined. Gillian M Raab and Chris Dibben, Scottish Centre for Administrative Data Research. [Presentation] [Slides]
  • The grant is dead, long live the data. Johnny Hay and Tomasz Zieliński, School of Biology, University of Edinburgh. [Presentation] [Slides]
  • International collaborations using linked administrative data: Lessons from the MARIC study. Holly Tibble, Usher Institute, University of Edinburgh. [Presentation] [Slides]

Robin Rice
Data Librarian and Head, Research Data Support
Library & University Collections

Training researchers for a software and data-intensive world with Edinburgh Carpentries

This is a guest post from Giacomo Peru and the EdCarp Committee (https://edcarp.github.io/committee/). Sections of this post were published previously on the EPCC blog.

The Edinburgh Carpentries (EdCarp) is a training initiative that offers the Carpentries computing and data skills curriculum in Edinburgh. The workshops train researchers in the fundamental skills needed for conducting efficient, open, and reproducible research. The EdCarp team comprises staff and student volunteers from across disciplines, academic units, and career stages.

Since 2018, EdCarp has organised 25 workshops across the University, training over 300 staff and students in data cleaning, manipulation, visualisation and version control methods using tools such as R, Python, Unix shell, Git, SQL and OpenRefine. Courses are free to participants and become oversubscribed very quickly. We are now rolling out our 2020 schedule and announcing workshops.

EdCarp are working to establish collaborations with other organisations, external and internal to the university: the Scottish Funding Council, the Institute for Academic Development and the Data Driven Innovation programme.

EdCarp can work with your academic unit or doctoral training program to help promote the fundamental data skills that your colleagues need.

A crucial aspect of EdCarp and their training model is the participation and voluntary commitment of the community, where trainees go on to become helpers, helpers become instructors, and so on. EdCarp are always looking for new people willing to help, in any capacity; please sign up here if you would like to be kept updated and/or get involved: https://eepurl.com/gl4MsX.

Dealing with Data 2019 – Registration now open!

Collaboration Across the Nations: Managing, sharing and securing research data across space and time

UPDATE – DwD 2019 Postponed

Due to the strike action scheduled for the 27th of November, we have decided to postpone DwD2019. This was not a decision we took lightly, but we felt it was for the best as we did not wish to put anyone in the uncomfortable position of feeling they had to either cross the picket line or not attend DwD2019. DwD2019 has been provisionally rebooked for the 15th of January 2020. Any resulting changes to the programme or other details will be added here as and when they are confirmed.

Dealing with Data 2019 will take place from 09:30 – 16:15 on the 15th January 2020 in the Informatics Forum. This year our theme is “Collaboration Across the Nations: Managing, sharing and securing research data across space and time” and we are now inviting all staff and post-graduate students at the University of Edinburgh to register for this event.

Collaboration is vitally important to academic and commercial research in all areas, as it enables the pooling of resources to answer increasingly complex or interdisciplinary research questions.

The effective collection, processing, and sharing of research data is integral to successful collaborations, but it can also present many challenges. In particular the practicalities of co-ordinating data management across large multi-centre collaborations, sharing large data, or handling sensitive data, can present difficulties if not planned for appropriately.

Dealing with Data 2019 is your opportunity to hear from, and network with, other members of the UoE research community about how they have addressed these issues to build successful collaborations, or the lessons they have learned which will enable them to be more successful in the future.

In previous years DwD has attracted over 100 attendees from across the university to hear contributions by research staff and students at all stages of their careers and from diverse disciplines. You can view the presentations from 2017 & 2018 now on MediaHopper (https://media.ed.ac.uk/channel/Dealing+With+Data+2017+Conference/82256222).

Conference Programme – Dealing_with_Data_2019_Programme_V1.2

If you have any questions, please get in touch using dealing-with-data-conference@mlist.is.ed.ac.uk

Dealing with Data is an annual event sponsored and organised by the Research Data Service to provide a forum for University of Edinburgh researchers to discuss how they are benefiting from, or experiencing struggles with, the fast-changing research data environment.

Kerry Miller

Research Data Support Officer

Research Data Workshops: Sensitive Data Challenges and Solutions

This workshop at the Bioquarter was attended by 27 research staff representing all three colleges, the majority from Medicine and Veterinary Medicine. It began with an introductory presentation from Robin Rice covering the new Data Safe Haven facility of the Research Data Service, and was followed by brief presentations from Lynne Forrest (Research Support Officer on Scottish Longitudinal Study); Fiona Strachan (Clinical Research Manager, Centre for Cardiovascular Science); and Jonathan Crook (Professor of Business Economics). Each speaker shared their experiences of both conducting research using sensitive data and supporting other researchers. Although they work with very different types of data, it was easy to identify certain common requirements:

  • Easy access to secure data storage and analysis platforms;
  • Consistent & comprehensive training and guidance on working with sensitive data;
  • Support to meet the necessary requirements to gain access to the data they need.

In the discussion groups that followed, participants were asked about their experiences working with sensitive data, the requirements researchers needed services such as data safe havens to fulfil, and ramifications of the cost recovery model, with regard to including costs in grant proposals.

The major themes that emerged were training, data governance, and meeting the costs of protecting sensitive data. There was a strong feeling that more and better training was required for all those working with sensitive data. There was also confusion about the number, location, and criteria of the different Data Safe Havens now available, with no single place to find clear information on these.

When talking specifically about the Data Safe Haven offered by IS for UoE researchers, the biggest concern was around cost. The standard price was considered high for the majority of grants, which are either small or need to be highly competitive. In some disciplines grant funding is not common, so it is unclear how the costs could be met. The Research Data Service representatives encouraged people to get a bespoke quote and discuss requirements with the team as early as possible, as flexibility on both cost and build specifications (e.g. high performance computing) is built in.

Some specific points arising from the discussions were:

  • One negative experience about working with sensitive data is the length of time needed to get data approvals (e.g. from NHS bodies). Participants wondered if the University could help to speed those up.
  • More training was desired in sensitive data management and better ways to structure training for students.
  • Learning outcomes need to focus on change of behaviour, with an emphasis on local procedures.
  • One participant felt that schools need a researcher portfolio system, some way of keeping track of who has what data. A suggestion was made to have an asset manager in the university, similar to the one in NHS.
  • Less than optimal security practices can be observed, such as leaving a clinical notebook in a coffee room. More training is needed, but this is not fully covered in either clinical practice courses or ethics courses.
  • There were concerns around data governance – how to set up gatekeepers for research projects using Data Safe Haven, how long to store things in the DataVault. ACCORD was pointed to for having good structure in data governance.
  • Long-running projects (e.g. ten years) would have trouble meeting the annual costs.
  • Projects are invested in locally run services and expertise; added-value centralised services need to be low-cost.

Overall researchers were in favour of having a Data Safe Haven available for projects that need it, but they would also like to have support to correctly anonymise and manage their data so that they could continue to use standard data storage and analysis platforms. This would mean that only those with the most sensitive of data would need to rely upon the UoE DSH to conduct their research.

Those with a University log-in may read the full set of notes on the RDM wiki.

Kerry Miller
Research Data Support Officer
Library & University Collections