Greater Expectations? Writing and supporting Data Management Plans

“A blueprint for what you’re going to do”

This series of videos was arranged before I joined the Research Data Service team, otherwise I’d no doubt have had plenty to say myself on a range of data-related topics! But the release today of this video – “How making a Data Management Plan can help you” – provides an opportunity to offer a few thoughts and reflections on the purpose and benefits of data management planning (DMP), along with the support that we offer here at Edinburgh.

“Win that funding”

We have started to hear anecdotal tales of projects being denied funding due – in part at least – to inadequate or inappropriate data management plans. While these stories remain relatively rare, the direction of travel is clear: we are moving towards greater expectations, more scrutiny, and ultimately the risk of incurring sanctions for failure to manage and share data in line with funder policies and community standards. As Niamh Moore puts it, various stakeholders are paying “much more attention to data management”. From the researcher’s point of view this ‘new normal’ is a significant change, requiring a transition that we should not underestimate. The Research Data Service exists to support researchers in normalising research data management (RDM) and embedding it as a core scholarly norm and competency, developing skills and awareness and building broader comfort zones, helping them adjust to these new expectations.

“Put the time in…”

My colleague Robin Rice mentions the various types of data management planning support available to Edinburgh’s research community, citing the online self-directed MANTRA training module, our tailored version of the DCC’s DMPonline tool, and bespoke support from experienced staff. Each of these requires an investment of time. MANTRA requires the researcher to take time to work through it, and took the team a considerable amount of time to produce in order to provide the researcher with a concise and yet wide-ranging grounding in the major constituent strands of RDM. DMPonline took hundreds and probably thousands of hours of developer time, and input from a broad range of stakeholders, to reach its current levels of stability, maturity and esteem. This investment has resulted in a tool that makes the process of creating a data management plan much more straightforward for researchers. PhD student Lis is quick to note the direct support that she was able to draw upon from the Research Data Service staff at the University, citing quick response times, fluent communication, and ongoing support as the plan evolves and responds to change. Each of these is an example of spending time to save time: not quite Dusty Springfield’s “taking time to make time”, but not a million miles away.

There is a cost to all of this, of course, and we should be under no illusions: we are fortunate at the University of Edinburgh to be in a position to provide and make use of this level of tailored service. We are working towards a goal of RDM-related costs being stably funded to the greatest degree possible, through a combination of project funding and sustained core budget.

“You may not have thought of everything”

Plans are not set in stone. They can, and indeed should, be kept updated in order to reflect reality, and the Horizon 2020 guidelines state that DMPs should be updated “as the implementation of the project progresses and when significant changes occur”, e.g. new data; changes in consortium policies (e.g. new innovation potential, decision to file for a patent); changes in consortium composition and external factors (such as new consortium members joining or old members leaving).

Essentially, data management planning provides a framework for thinking things through (Niamh uses the term “a series of prompts”, and Lis “a structure”). As Robin says, you won’t necessarily think of everything beforehand – a plan is a living document which will change over time – but the important thing is to document and explain the decisions that are taken in order for others (and your future self is among these others!) to understand your work. A good approach that I’ve seen first-hand while reviewing DMPs for the European Commission is to leave place markers to identify deferred decisions, so that these details are not forgotten about. (This is also a good reason for using a template – an empty heading means an issue that has not yet been addressed, whereas it’s deceptively easy to read free text DMPs and get the sense that everything is in good shape, only to find on more rigorous inspection that important information is missing, or that some responses are ambiguous.)

“Cutting and pasting”

It has often been said that plans are less important than the process of planning, and I’ve been historically resistant to sharing plans for “benchmarking”, which is often just another word for copying. However, Robin is right to point out that there are some circumstances where copying and pasting boilerplate text makes sense, for example when referring to standard processes or services, where it makes no sense – and indeed can in some cases be unnecessarily risky – to duplicate effort or reinvent the wheel. That said, I would still generally urge researchers to resist the temptation to do too much benchmarking. By all means use standards and cite norms, but also think things through for yourself (and in conjunction with your colleagues, project partners, support staff and other stakeholders) – and take time to communicate with your contemporaries and the future via your data management plan… or record?

“The structure and everything”

Because data management plans are increasingly seen as part of the broader scholarly record, it’s worth concluding with some thoughts on how all of this hangs together. Just as Open Science depends on a variety of Open Things, including publications, data and code, the documentation that enables us to understand it also has multiple strands. Robin talks about the relationship between data management and consent, and as a reviewer it is certainly reassuring to see sample consent agreement forms when assessing data management plans, but other plans and records are also relevant, such as Data Protection Impact Assessments, Software Management Plans and other outputs management processes and products. Ultimately the ideal (and perhaps idealistic) picture is of an interlinked, robust, holistic and transparent record documenting and evidencing all aspects of the research process, explaining rights and supporting re-use, all in the overall service of long-lasting, demonstrably rigorous, highest-quality scholarship.

Martin Donnelly
Research Data Support Manager
Library and University Collections
University of Edinburgh

Research Data Service use cases – videos and more

Earlier this year, the Research Data Service team set out to interview some of our users to learn about how they manage their data, the challenges they face, and what they’d like to see from our service. We engaged a PhD student, Clarissa, who successfully carried out this survey and compiled use cases from the responses. We also engaged the University of Edinburgh Communications team to film and edit some of the user interviews in order to produce educational and promotional videos. We are now delighted to launch the first of these videos here.

In this case study video, Dr Bert Remijsen speaks about his successful experience archiving and sharing his Linguistics research data through Edinburgh DataShare, and seeing people from all corners of the world making use of the data in “unforeseeable” ways.

Over the coming weeks we will release the written case studies for internal users, and we will also make the other videos available on Media Hopper and YouTube. These will address topics including data management planning, archiving and sharing data, and adapting practices around personal data for GDPR compliance and training in Research Data Management. Staff and users will talk about the guidance and solutions provided by the Research Data Service for openly sharing data – and conversely restricting access to sensitive data – as well as supporting researchers in producing meaningful and useful Data Management Plans.

The team is also continuing to analyse the valuable input from our participants, and we are working towards implementing some of the helpful ideas they have kindly contributed.

Interning with the Research Data Service

For almost four months, I have been interning with the Research Data Service (RDS) as a project assistant. I decided to apply for the internship simply because I had received RDS support when I was developing a Research Data Management (RDM) plan for my PhD project and I also wanted to gain experience that would help me develop my professional skills. I was beyond thrilled when I was accepted!

Photo of a sunflower on a window ledge

The RDS team entry to the office sunflower-growing competition

The project I was involved in was called the Dealing With Data Use Case Videos Project. Its aim was to gain insights into the research data management (RDM) practice of data service users in all three Colleges at the University. My main role was to interview academic staff and PhD students as well as support staff about their experiences of RDM and their views on the tools available at the University such as DataStore, DataShare, DataVault and so on. The insights gathered from the interviews are valuable for the RDS team in improving their services. For my personal development, interviewing the participants has helped me to gain confidence and hone skills which I can directly apply to my PhD research. I also enjoyed learning about different research projects beyond my field and felt inspired by the participants, particularly in how they share their data publicly to advance research on their topic. Another part of my internship (which I found most interesting) was to conduct video interviews. I had the chance to work directly with the Video Production Team of Communications and Marketing and visited their studio. This was my first experience of being involved in video filming and editing.

Photo of nameplate on the desk which says Clarissa

My nameplate on my desk

So, my internship has now come to an end, but I won’t forget this amazing experience. I was made very welcome as part of the team, had my own desk, joined some meetings and even went out for lunch and drinks! It’s been truly a pleasure to work with such a great team and I can’t thank the RDS team enough for the opportunity to learn so much about the RDS and to extend my knowledge about RDM.

Catherine Clarissa
Research Data Service Project Assistant
Postgraduate Research Student
Nursing Studies, University of Edinburgh

Photo of the office window view showing Edinburgh Castle obscured by a crane

The view from the office window

The Edinburgh DataShare Awards!

The Research Data Service team applauds those researchers at the University of Edinburgh who share their data. We therefore decided to show our appreciation by presenting awards to our most successful depositors, as part of the Dealing With Data conference. Unfortunately, the prizes themselves do not come with a cash research grant attached. However, the winners did receive a certificate bearing an image of our mascot for the day, Databot. We think you’ll agree the winning depositors and their data demonstrate the diversity of our collections, in terms of subject matter, formats and sheer size. We were particularly pleased with the reactions from both the recipients and the attendees, in person, by email and on Twitter (#UoEData was the Dealing with Data hashtag). Who doesn’t love the drama of an awards ceremony? A video is available.

Photograph of Pauline Ward announcing the award winners

Photo: CC-BY Lorna M. Campbell

The winners in full…

MOST DATASHARING SCHOOL: Edinburgh Medical School

– the School which boasts the greatest number of Edinburgh DataShare Collections currently. Thirty-three eligible Collections (already containing at least one dataset) such as “Connectomic analysis of motor units in the mouse fourth deep lumbrical muscle”, the Edinburgh Imaging “Image Library” and “Generation Scotland”.

MOST PROLIFIC DATASHARER: Professor Richard Baldock
– the most prolific depositor into Edinburgh DataShare for the academic year 2016-17, and over the lifetime of the repository, having shared a grand total of 1,105 data items with full metadata. These are grouped together into numerous Collections under the heading of “e-Mouse Atlas”. The majority of these detailed images show microscope slides of stained tissue; others are 3D models. They accompany a book and website published by Professor Baldock, building on the seminal work of Professor Matt Kaufman in developmental biology. The metadata for each of the slides links to a lower definition version within the e-Mouse Atlas website, where the data may be viewed and navigated in context. The original slides themselves are held by the University’s Centre for Research Collections.

detail of histological slide showing stained cells

Detail from Elizabeth Graham; Julie Moss; Nick Burton; Yogmatee Roochun; Chris Armit; Lorna Richardson; Richard Baldock. (2015). eHistology Kaufman Atlas Plate 21a image d, [image]. University of Edinburgh. College of Medicine and Veterinary Medicine. http://dx.doi.org/10.7488/ds/735.

MOST PROLIFIC DATASHARER (CSE): Professor Euan Brechin
– the depositor of the greatest number of Edinburgh DataShare items from the College of Science and Engineering in academic year 2016-2017. Euan deposits his coordination chemistry research data so frequently that we set up a Collection template for the Brechin Research Group, which automatically pre-populates some of the metadata fields for him, saving Euan time. If only we could find a way to mention metallosupramolecular cubes here.

The certificate awarded to Professor Euan Brechin

MOST PROLIFIC DATASHARER (CAHSS): Dr Andrea Martin
– the depositor of the greatest number of Edinburgh DataShare items from the College of Arts, Humanities and Social Sciences in academic year 2016-2017. Some of these “Language Cognition and Communication” data items are still under temporary embargo. Users may nonetheless see all the metadata.

MOST POPULAR SHARED DATA: Professor Peter Sandercock
– the depositor of the Edinburgh DataShare item which has attracted the greatest number of page views over the lifetime of the repository: “International Stroke Trial database (version 2)” (aka IST-1). These data from the International Stroke Trial provide a great example of how clinical trial data may be anonymised to allow them to be shared. For more information, you may want to watch Prof Sandercock’s very accessible and detailed public lecture. Admittedly, one other item is higher up DataShare’s table of page views than IST. However, we believe the traffic drawn by “RCrO3-xNx ChemComm 2016” to be artifactual, arising from the appearance of the word ‘doping’ in its abstract, and the fact that the deposit was made at a time when doping in sport was very prominent in the news media. Additionally, the earlier, superseded version of the IST-1 dataset also appears in the all-time top ten, and if we combine the views of both versions, IST-1 takes the No. 1 spot outright 🙂

MOST POPULAR DATA 2016-17: Dr. Junichi Yamagishi
– the depositor of the Edinburgh DataShare item which has attracted the greatest number of page views (1,720 to be precise, as counted by Google Analytics) over the academic year 2016-17: “Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) Database”. Here’s the suggested citation, which DataShare compiles automatically, and displays prominently, to encourage users to cite the data:

Wu, Zhizheng; Kinnunen, Tomi; Evans, Nicholas; Yamagishi, Junichi. (2015). Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) Database, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). http://dx.doi.org/10.7488/ds/298.
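For readers curious how a citation like this can be put together from a deposit's metadata, here is a minimal sketch in Python. It is not DataShare's actual code: the function name and its parameters are hypothetical, and the field order is simply inferred from the citation shown above.

```python
def compile_citation(creators, year, title, resource_type, publisher, unit, doi_url):
    """Join metadata fields into a suggested citation string.

    Hypothetical helper: the field order mirrors the ASVspoof 2015
    citation quoted above, not any documented DataShare routine.
    """
    authors = "; ".join(creators)  # creators are separated by semicolons
    return (
        f"{authors}. ({year}). {title}, [{resource_type}]. "
        f"{publisher}. {unit}. {doi_url}."
    )


citation = compile_citation(
    creators=["Wu, Zhizheng", "Kinnunen, Tomi", "Evans, Nicholas", "Yamagishi, Junichi"],
    year=2015,
    title=(
        "Automatic Speaker Verification Spoofing and Countermeasures "
        "Challenge (ASVspoof 2015) Database"
    ),
    resource_type="dataset",
    publisher="University of Edinburgh",
    unit="The Centre for Speech Technology Research (CSTR)",
    doi_url="http://dx.doi.org/10.7488/ds/298",
)
print(citation)
```

Building the citation from structured fields means every item is cited in a consistent form, and the DOI always appears at the end where a reader can follow it.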

MOST POPULAR DATA 2016-17 (CAHSS): Professor Miles Glendinning

– the depositor of the Edinburgh DataShare item from the College of Arts, Humanities and Social Sciences which has attracted the greatest number of page views (1,374 to be precise, as counted by Google Analytics), over the academic year 2016-17: “Hong Kong Public Housing Archive”. The Research Data Service is working closely with Miles, Personal Chair of Architectural Conservation, on a series of batch imports to put his fabulous array of photographs of public housing tower blocks from all around the world on DataShare over the coming months – keep an eye on DOCOMOMO International Mass Housing Archive.

Sunny image of the façade of several tower blocks; a tree is visible in the foreground.

Image cropped from “HKI_H_Yue_Fai_Ct.jpg” from Glendinning, Miles; Forsyth, Louise; Maxwell, Gavin; Wood, Michael. (2015). Hong Kong Public Housing Database, 2006-2015 [image]. University of Edinburgh. Edinburgh College of Art. http://dx.doi.org/10.7488/ds/322.

MOST POPULAR DATA 2016-17 (MVM): Dr. Tom Pennycott
– the depositor of the Edinburgh DataShare Collection page from the College of Medicine and Veterinary Medicine which has attracted the greatest number of page views over the academic year 2016-17: “Diseases of Wild Birds”. Hundreds of grotesquely beautiful photographs of dead wild birds, bodies ravaged by viruses, bacteria and protists, found at locations all around the United Kingdom; these images support the PhD thesis of Dr Tom Pennycott from our Veterinary School.

You can see usage statistics for any DataShare Item or Collection simply by clicking on the “View usage statistics” button on the right-hand side of the page.

Pauline Ward, Research Data Service Assistant
EDINA and Data Library