About Robin Rice

Data Librarian and Head, Research Data Support Library & University Collections

Data and ethics

As an academic support person, I was surprised to find myself invited onto a roundtable about ‘The Ethics of Data-Intensive Research’. Although as a data librarian I’m certainly qualified to talk about data, I was less sure of myself on the ethics front – after all, I’m not the one who has to get my research past an Ethics Review Board or a research funder.

The event was held last Friday at the University of Edinburgh as part of the project Archives Now: Scotland’s National Collections and the Digital Humanities, a knowledge exchange project funded by the Royal Society of Edinburgh. It attracted attendees from across Scotland and had as its focus “Working With Data”.

I figured I couldn’t go wrong with a joke about fellow ‘data people’, using an image from Flickr that we include in our online training course, MANTRA.

‘Binary’ by Xerones on Flickr (CC-BY-NC)

Appropriately, about half the people in the room chuckled.

So after introducing myself and my relevant hats, I revisited the quotations I had supplied on request for the organiser, Lisa Otty, who had put together a discussion paper for the roundtable.

“Publishing articles without making the data available is scientific malpractice.”

This quote is attributed to Geoffrey Boulton, Chair of the Royal Society task force which published Science as an Open Enterprise in 2012. I have heard him say it, if only to say it isn’t his quote. The report itself makes a couple of references to similar things that have been said, but they are just not as pithy for a quote. But the point is: how relevant is this assertion for scholarship that is outside of the sciences, such as the Humanities? Is data sharing an ethical necessity when the result of research is an expressive work that does not require reproducibility to be valid?

I gave Research Data MANTRA’s definition of research data, in order to reflect on how well it applies to the Humanities:

Research data are collected, observed, or created, for the purposes of analysis to produce and validate original research results.

When we invented this definition, it seemed quite apt for separating ‘stuff’ that is generated in the course of research from stuff that is the object of research; an operational definition, if you will. For example, a set of email messages may just be a set of correspondences; or it may be the basis of a research project if studied. It all depends on the context.

But recently we have become uneasy with this definition when engaging with certain communities, such as the Edinburgh College of Art. They have a lot of digital ‘stuff’ – inputs and outputs of research, but they don’t like to call it data, which has a clinical feel to it, and doesn’t seem to recognise creative endeavour. Is the same true for the Humanities, I wondered? Alas, the audience declined to pursue it in the Q&A, so I still wonder.

“The coolest thing to do with your data will be thought of by someone else.” – Rufus Pollock, Cambridge University and Open Knowledge Foundation, 2008

My second quote attempted to illustrate the unease felt by academics about the pressure to share their data, and why the altruistic argument about open data doesn’t tend to win people over, in my experience. I asked people to consider how it made them feel, but perhaps I should have tried it with a show of hands to find out their answers.

Information Wants to Be Free

Quote by John Perry Barlow, image by Robin Rice

I swiftly moved on to talk about open data licensing, the choices we’ve made for Edinburgh DataShare, and whether offering different ‘flavours’ of open licence is important when many people still don’t understand what open licences are about. Again I used an image from MANTRA (above) to point out that the main consideration for depositors should be whether or not to make their data openly available on the internet – regardless of licence.

By putting their outputs ‘in the wild’, academics are necessarily giving up control over how they are used; some users will be ‘unethical’, in that they will not understand or comply with the terms of use. And we as repository administrators are not in a position to police misuse for our depositors. Nevertheless, since academic users tend to understand and comply with scholarly norms about citing and giving attribution, those new to data sharing should not be unduly alarmed about the statement illustrated above. (And DataShare provides a ‘suggested citation’ for every data item that helps the user comply with the attribution requirements.)

Since no overview of data and ethics would be complete without consideration given to confidentiality obligations of researchers towards their human subjects, I included a very short video clip from MANTRA, of Professor John MacInnes speaking about caring for data that contain personally identifying information or personal attributes.

For me the most challenging aspect of the roundtable, and indeed the day, was the contribution by Dr Anouk Lang about working with data from social media. As an ethical researcher one cannot assume that consent is unnecessary when working with data streams (such as Twitter) that are open to public viewing. For one thing, people may not expect views of their posts outside of their own circles – they treat it as a personal communication medium. For another, they may assume that what they say is ephemeral and will soon be forgotten and unavailable. A show of hands indicated that only some of the audience had heard of the Twitter Developers site and API, or of Storify, which can capture tweets and other objects in a more permanent web page, illustrating her point.

While this whole area may be more common for social researchers – witness the Economic and Social Research Council’s funding of a Big Data Network over several years which includes social media data – Anouk’s work on digital culture proves Humanities researchers cannot escape “the plethora of ethics, privacy and risk issues surrounding the use (and reuse) of social media data.” (Communication on ESRC Big Data Network Phase 3.)

Robin Rice
Data Librarian

New faces at the Data Library

We are pleased to introduce two new staff members who have joined the Data Library team.

Laine Ruus has taken up a six-month post as Assistant Data Librarian, helping out during Stuart Macdonald’s productive secondment at CISER, Cornell University. Laine has worked in data management and services since 1974, at the University of British Columbia, Svensk Nationell Datatjänst, and the University of Toronto. Laine was Secretary of IASSIST for eighteen years. She received the IASSIST Achievement award upon her retirement from the University of Toronto in 2010 and the ICPSR Flanigan Award in 2011.

She is perhaps best known for “ABSM: a selected bibliography concerning the ‘Abominable Snowman’, the Yeti, the Sasquatch, and related hominidae”, pp. 316-334 in Manlike monsters on trial: early records and modern evidence, edited by Marjorie M. Halpin and Michael M. Ames. Vancouver: University of British Columbia Press, 1980.

Pauline Ward, Data Library Assistant, will be contributing to the Data Library and Edinburgh DataShare services for University of Edinburgh students and staff, and helping to deliver new research data management services and training as part of the wider RDM programme. Pauline has a bioinformatics background, and has worked in a variety of roles from curation of the EMBL database at the European Bioinformatics Institute in Hinxton to database development (with Oracle, MySQL, Perl and Java) and sequence analysis at the Wellcome Trust Centre for Molecular Parasitology in Glasgow. She also worked more recently as a Policy Assistant at Universities Scotland.

Pauline said: “It’s great to be back in academia. I am really chuffed to be working to help researchers share their data and make the best use of others’ data. I’m really enjoying it.”

You can follow Pauline on Twitter at @PaulineDataWard or check out her previous publications.

Pauline at her desk in the EDINA offices, Edinburgh

by Robin Rice and Pauline Ward
Data Library

New Data Curation Profile: Interdisciplinary Social Sciences in Health

Rowena Stewart, Academic Support Librarian, Information Services, has contributed a new data curation profile to the DIY RDM Training Kit for Librarians on the MANTRA website. Rowena was one of eight librarians at the University of Edinburgh to take part in local data management training.

Rowena has profiled data-related work by Nick Jenkins, Chancellor’s Fellow, Interdisciplinary Social Sciences in Health, School of Health in Social Science. In the interview Nick discusses ethical issues involved in sharing qualitative data, among other things.

Come work with us – Data Library Assistant post

Data Library Assistant

EDINA and Data Library, Information Services

£25,759 - £29,837 per year
Full Time, Fixed Term: 36 months
Ref: 022330

The Data Library is working with others in Information Services to enhance and develop services to deliver the University’s Research Data Management programme. To this end the Data Library requires a member of the team to help us offer online and direct support for research data management planning and data curation, and to help raise awareness and provide training to staff and student researchers.

The Data Library hosts Edinburgh DataShare, a research data repository for members of the University, along with a data catalogue and a suite of research data support web pages within the University website. This is an excellent opportunity for a graduate to apply their research skills to a growing service area.

You will be a university graduate or have suitable relevant experience. You will be enthusiastic about new forms of scholarly communication such as open access publishing and open data, and working with open source software. You will be able to engage with peers in your discipline and help them to understand how good data management and sharing practices can improve their research and impact.

You will have research experience and data analysis skills as well as knowledge of publishing in an academic environment. You will have an understanding of university structures and norms.

Excellent written and verbal communication skills and up-to-date computer/Internet literacy are essential.

There are many advantages to working at the University. Benefits include flexible working, an excellent pension, career prospects and generous holiday provision.

Further details (please enter vacancy code 024399)

Closing Date: 29 January 2014

Contact Person: Ingrid Earp
Contact Number: +44 (0)131 651 1240
Contact Email: i.earp@ed.ac.uk