Reflections on IDCC 2015: How the 80/20 rule applies to Research Data Management tools

RDM Tools need to be usable by a defined and significant set of researchers

A key question for universities considering which RDM tools to adopt is how broadly useful each tool is and how widely it will be taken up by researchers across the institution. I was reminded of this when Robin Rice posed the question to Ian MacArdle and Torsten Reimer of Imperial College after listening to their presentation, ‘Green Shoots:  RDM Pilot at Imperial College London’.

This is equally an issue for builders of tools.  Some tools are designed to be used primarily by the people who designed them, to add value to their own research.  Most of the six projects described in the Green Shoots presentation fall into that category.  Other tools are designed to be used by a broader but still limited community, e.g. tools for researchers in a particular discipline.  RDM tools, in contrast, need to be applicable to an even broader range of researchers, certainly across more than one discipline.  I think the 80/20 rule is a good benchmark.  For an RDM tool to be of interest at the institutional level it needs to be relevant to the needs of at least 80% of a defined and significant set of researchers at the institution.

Electronic lab notebook example

To illustrate this point let’s take the example of electronic lab notebooks.  The target user group here includes biologists, chemists, biomedical researchers, and researchers in associated disciplines such as veterinary medicine, plant science, and food science.  In a word, disciplines where the paper lab notebook is widely used.  This is clearly a defined and significant set of researchers at major research institutions, so the target user group meets the minimum viable community definition.

What does an electronic lab notebook need to have and to do in order to meet the requirements of 80% of the users in this defined target group?  One way of approaching this issue is to look at the reactions of ‘outliers’, i.e. researchers whose focus is so specific that a ‘generic’ offering, even one tailored to this particular set of researchers, may not be attractive.  Chemists are one large set of potential outliers.  If an ELN has no support for chemistry, chemists are likely to find it unsuitable for their needs.  So one challenge is to build in sufficient support for chemists without having the ELN become so chemistry-focussed that other segments of the target user group find it unworkable.  When this balance is struck, in most cases the majority of chemists go from being potential outliers to proponents of adoption.

A second set of outliers are bioinformaticians, who overlap with another set of outliers, researchers who build their own software as part of their research.  Both of these groups tend to prefer ‘home-built’ solutions, and for this reason may not be inclined to adopt ELNs that in fact work well for ‘the 80%’.  One way of increasing the likelihood that some members of these groups will make use of the ELN, and hence become proponents, is to build a good API.  This enables these groups to integrate the ELN with the software and solutions they are developing, in which case they are more likely to (a) appreciate the ELN’s benefits, and (b) find that it adds value to their workflow.

A third set of outliers are those who are reluctant to adopt an ELN because, for whatever reason, they do not see it fitting into their workflow.  In many cases this is a temporary barrier which can be overcome as the doubters come to understand the benefits of using an ELN.  In other cases there is a specific practical obstacle to giving up the paper notebook.  For example, some researchers work in labs where chemical spillage is likely.  They are reluctant to bring their own tablet computer into the controlled area because they are afraid it will be damaged, so they will only adopt the ELN if the lab provides a dedicated tablet for them to use there.

Applying the 80/20 rule

Sometimes the doubts raised by outliers like the ones identified above take on a higher profile than the views of the silent majority of researchers who may be willing or even keen to try out and eventually adopt an ELN, but lack effective or well-connected advocates.  In this case the 80/20 rule may come in handy for those involved in RDM – take some soundings from the ‘average’ labs which don’t fall into any of the three categories of outliers noted above.  If there is significant interest among this group, that could be an indication that a product trial is justified, and that, notwithstanding doubts expressed from some quarters, the 80/20 rule is in fact in operation.
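
To make the idea of taking soundings concrete, here is a minimal Python sketch of how the responses might be tallied.  Everything in it is an illustrative assumption rather than an established method: the group labels, the `LabSounding` record, the function names, and the default threshold of 0.8, which simply mirrors the 80% benchmark discussed above.

```python
from dataclasses import dataclass

# Hypothetical labels for the three outlier categories discussed above.
OUTLIER_GROUPS = frozenset({"chemistry", "bioinformatics", "workflow_doubter"})


@dataclass
class LabSounding:
    lab: str          # name of the lab that was asked
    group: str        # "average" or one of OUTLIER_GROUPS
    interested: bool  # did the lab express interest in trying an ELN?


def interest_share(soundings, exclude_groups=OUTLIER_GROUPS):
    """Share of non-outlier labs that expressed interest (0.0 if none were asked)."""
    sampled = [s for s in soundings if s.group not in exclude_groups]
    if not sampled:
        return 0.0
    return sum(s.interested for s in sampled) / len(sampled)


def trial_looks_justified(soundings, threshold=0.8):
    """True when interest among 'average' labs reaches the assumed threshold."""
    return interest_share(soundings) >= threshold


# Made-up example data: five 'average' labs and two outlier labs.
example = [
    LabSounding("Lab A", "average", True),
    LabSounding("Lab B", "average", True),
    LabSounding("Lab C", "average", True),
    LabSounding("Lab D", "average", True),
    LabSounding("Lab E", "average", False),
    LabSounding("Lab F", "chemistry", False),
    LabSounding("Lab G", "bioinformatics", False),
]
```

With this made-up data, four of the five ‘average’ labs are interested, so `trial_looks_justified(example)` returns `True` even though both outlier labs said no.  In practice the threshold and the definition of an ‘average’ lab would be a matter of local judgement.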

Making the right RDM tools available is a joint responsibility

Reflecting on the above, it’s clear that neither developers of RDM tools nor those involved in RDM can, working in isolation, provide the RDM tools that researchers need.  It’s up to tool providers, first, to design tools for the 80%, and then to be sensitive to those with specialist needs and attempt to build in capabilities that will make the tool useful to as many of them as possible, too.  And it’s up to those involved in RDM to be attentive to the needs of the ‘silent majority’ of researchers and to work with them, and with interested tool providers, to offer researchers tools that meet their needs, even when those needs may not initially be systematically or even very coherently expressed.


Using an electronic lab notebook to deposit data into Edinburgh DataShare

This is a heads-up about a ‘coming attraction’.  For the past several months a group at Research Space has been working with the DataShare team, including Robin Rice and George Hamilton, to make it possible to deposit research data from our new RSpace electronic notebook into DataShare.

I gave the first public preview of this integration last month in a presentation called ‘Electronic lab notebooks and data repositories: Complementary responses to the scientific data problem’, given to a session on Research Data and Electronic Lab Notebooks at the American Chemical Society conference in Dallas.

When the RSpace ELN becomes available to researchers at Edinburgh later this spring, users of RSpace will be able to make deposits to DataShare directly from RSpace using a simple interface we have built into RSpace.  The whole process only takes a few clicks, and starts with selecting records to be deposited into DataShare and clicking on the DataShare button as illustrated in the following screenshot:

[Screenshot: RSpace workspace with the DataShare button highlighted]

You are then asked to enter some information about the deposit:

[Screenshot: DataShare deposit dialog, filled in]

After confirming a few details about the deposit, the data is deposited directly into DataShare, and information about the deposit appears in DataShare.

[Screenshot: the deposit viewed in DataShare]

We will provide details about how to sign up for an RSpace account in a future post later in the spring.  In the meantime, I’d like to thank Robin and George for working with us at RSpace on this exciting project.  As far as we know this is the first time an electronic lab notebook has ever been integrated with an institutional data repository, so this is a pioneering and very exciting experiment!  We hope to use it as a model for similar integrations with other institutional and domain-specific repositories.

Rory MacNeil
Chief Executive, Research Space