Evelyn Williams, new Research Data Support Assistant

Hello, readers of Edinburgh Research Data Blog!

Last month I joined the University’s Research Data Service team as a Research Data Support Assistant, and I’m excited to be back at the University after three long years working as a data scientist at tech start-ups.

A photo of the author in Barcelona.

Me with a bag of churros in Montjuïc, Barcelona, where I spent a few months in the winter of 2022 – Photo credit: Evelyn Williams

This career pivot from tech into collections management feels natural to me as a lifelong collector and cataloguer. An early memory is winning a Stanley plastic small parts organiser at a village tombola, the kind you’d use to store picture hooks and screws. I’d never seen a more magical object in my life. I began hunting for groups of items tiny enough to fit in the compartments like it was my life’s work. Elastic bands, our Labrador’s fur during moulting season, glittery hair beads (it was the early 2000s), woodlice. My favourite present from my last birthday was a Dymo label maker. When I first read the description for this role, working to archive the University’s research data sounded like a dream come true. It’s especially exciting to be dipping my toe into data management at a university where RDM is already so well established, thanks to the work of Robin Rice and the many others involved in developing the department and the University’s data management policy.

I’ve been curious about archives and collections for a long time. I loved interning as a Collections Assistant in Special Collections at the Sir Duncan Rice Library in 2017 while I was an undergraduate Linguistics student at the University of Aberdeen. I helped run the reading room, assisted with manuscript conservation and digitising, and carried out archive research for the Library’s exhibition. Exploring the stacks of manuscripts and ephemera, I felt like the luckiest girl in the world. The highlight of my job was getting to see a volume of Audubon’s Birds of America (1827-1838). It was an incredibly special experience for lots of reasons – the sheer size of the book (it’s a metre tall!), the beauty of the illustrations, and the depictions of bird species that are now extinct. An example of an illustration of owls is shown below.

Barn owl illustration from Audubon's Birds of America.

Audubon, J. J. (1840) Barn Owl. The birds of America, plate CLXXI. New York, J.J. Audubon; Philadelphia, J.B. Chevalier. Photo credit: The John James Audubon Center at Mill Grove, Montgomery County Audubon Collection, and Zebra Publishing.

The photo below was included in the exhibition I worked on about medical innovation in wartime. So dramatic!

A photo of a nurse tying Sir Henry Gray’s surgical mask

A nurse tying Sir Henry Gray’s surgical mask. Photo credit: George Washington Wilson & Co. (1853 – 1908). DR GRAY ROYAL INFIRMARY ABERDEEN. [Photograph]. Aberdeen: The University of Aberdeen. GB 0231 MS 3792/D0500, George Washington Wilson & Co. photographic collection.

I’m thrilled to be back at the University and working with researchers again. The last time I worked here was three years ago as a Research Assistant while doing my master’s in Speech and Language Processing, helping researchers in the Centre for Speech Technology Research to evaluate audio processing models like computer-generated voices. I learned so much by being involved in lots of different research projects, and I’m looking forward to the huge scope of people and projects I’ll support in my new role.

That role was also where I first saw the potential of open data sharing. The University’s most accessed DataShare dataset was developed and shared by colleagues at CSTR, and has since been used and cited by research teams around the world, including at Google, DeepMind, and Meta, as well as at countless universities. Making this speech data publicly available has contributed to big improvements in, for example, the speech devices used by many people with Motor Neurone Disease, and in the algorithms hearing aids use to make speech clearer.

Sharing your research data may sometimes seem like an afterthought to a project, but it can have a far-reaching impact and accelerate scientific progress. My hope is that in my new role I can help to further open research in a small way.

This photo from the TORGO project captures the process of recording facial movement during speech using an electromagnetic articulograph machine

Photo credit: The University of Toronto. (2012). Subject in AG500. The TORGO Database: Acoustic and Articulatory Speech From Speakers With Dysarthria. https://www.cs.toronto.edu/~complingweb/data/TORGO/torgo.html

I used the TORGO dataset during my master’s research, and I was grateful the researchers had published their data for academic use.

After I finished my master’s I worked as a data scientist at a couple of tech start-ups, building artificial intelligence models. While I enjoy writing code and working on complex engineering projects, I didn’t like the restricted field of vision you have when you’re working to solve a narrow commercial problem. I’m happy to be in a more social role where I can support lots of different people and projects.

Photograph of a mug made by the author.

Some mugs I made for our most recent Open Studios event at Abbeymount Studios.

So far, the Research Data Service team has been really welcoming, and I feel lucky to be working with such knowledgeable and friendly people. I’ll be working 3.5 days a week with the RDS team, and on my other days I’ll likely be at the pottery studio (see the photo above) or reading. My collection of graphic novels is getting out of control, and I love fiction where nothing much happens but everything is just a bit unsettling. At the moment I’m trying to read everything by and about Shirley Jackson, as well as novels about disgruntled tech workers. Everyone I know is sick of me trying to get them to download the Libby app (“It’s like Audible. But it’s FREE!”).

Evelyn Williams,

Research Data Support Assistant

Keith Munro, new Research Data Support Assistant

Hello, my name is Keith Munro, and on March 4th 2024 I began my new role as a Research Data Support Assistant. Immediately prior to joining the Research Data Service (RDS), I studied for a PhD in Computer and Information Science at the University of Strathclyde. My thesis studied the information behaviour of hikers on the West Highland Way (see below for a photo of me during data gathering), with a particular focus on the embodied information that walkers encountered, the classification of information behaviour in situ, and the well-being benefits resulting from the activity. I was lucky to present at the Information Seeking In Context conference in Berlin in 2022, and I am still working on getting a number of the findings from my thesis published in the months ahead. I passed my viva on February 2nd, so the timing of starting this job has been excellent.

Before my PhD, I studied for an MSc in Information and Library Studies, also at the University of Strathclyde, so there was always a plan to work in the library and information sector. But as my Master’s degree was finishing during the outbreak of the Covid-19 pandemic in spring/summer 2020, I decided to take an interesting diversion, the scenic route, if you will, with the PhD! My Master’s thesis was on the information behaviour of DJs, motivated by my own experience as a DJ, which I was lucky to have even if it was not exactly high-profile. From this, I was very fortunate to win the International Association of Music Librarians (UK & Ireland branch) E.T. Bryant Memorial Prize, awarded for a significant contribution to the literature in the field of music information. Findings from this thesis have since been published in the Journal of Documentation and Brio.

Since starting my new role I have been greatly impressed by the team I have joined, who all bring a wealth of experience from across the academic spectrum and have also been very warm in welcoming me and in sharing knowledge. I hope I can bring my study and research experience to complement what the RDS team is doing, and I am excited to be learning more about research data management. The size of the University of Edinburgh can be daunting, and learning all the acronyms will take some time, I suspect. But the range of research I have already encountered in reviewing submissions to DataShare, including Martian rock impacts and horse knees, has been fascinating, and I’m sure that will continue to be the case!

DataShare awarded CoreTrustSeal trustworthy repository status

CoreTrustSeal has recognised Edinburgh DataShare as a trustworthy repository.

What does this mean for our depositors? It means you can rest assured that we look after your data very carefully, in line with stringent internationally recognised standards. We have significant resources in place to ensure your dataset remains available to the academic community and the general public at all times. We also have digital preservation expertise and well-planned processes in place to protect your data from long-term threats. The integrity and reusability of your data are a priority for the Research Data Service.

Book to attend our practical “Archiving your Research Data” course

The certification involves an in-depth evaluation of the resilience of the repository, looking at procedures, infrastructure, staffing, discoverability, digital preservation, metadata standards and disaster recovery. This rigorous process took the team over a year to complete, and prompted a good deal of reflection on the robustness of our repository. We compiled responses to sixteen requirements, a task which I co-ordinated. The finished application contained over ten thousand words, and included important contributions from colleagues in the Digital Library team and from the university Digital Archivist Sara Thomson.

Our CoreTrustSeal application in full   

The CTS is a prestigious accreditation, held by many national organisations such as the National Library of Scotland, the UK’s Centre for Environmental Data Analysis and UniProt. Ours is the first institutional research data repository in the UK to receive the CoreTrustSeal (the Cambridge Crystallographic Data Centre has the CTS but, in contrast to DataShare, is a disciplinary repository which archives data from the international research community).

DataShare is a trustworthy repository, where you as a researcher (staff or student) at the University of Edinburgh can archive your research data free of charge. Bring us your dataset – up to 100 GB(!) – and we will look after it well, to maximise its discoverability and its potential for reuse, both in the immediate term and long beyond the lifetime of your research project.

Edinburgh DataShare

All CTS certified repositories

circular logo bearing a tick mark and the words 'Core Trust Seal'

The Research Data Support team has earned the right to display this CTS logo on the DataShare homepage

Pauline Ward
Research Data Support Assistant
Library & University Collections

Quicker, easier interface for DataVault

We are delighted to report that Edinburgh DataVault now has a quicker and easier process for users. The team have been working hard to overhaul the form for creating a new ‘vault’ to improve the user experience. The changes allow us to gather all the information we need from the user directly through a DataVault web page. Users no longer need to first log in to Pure and create a separate dataset record there. Instead, that step will be automated for them.

I have explained the new process and walked users through the new form in an updated how-to video, which now combines the getting-started information with a demo of how to create a vault:

Get started and create your vault! (8 mins)

The new streamlined process for users is represented and compared to our open research data repository DataShare in this workflow diagram. DataVault is designed for restricted access, but can also handle far larger datasets than DataShare.

The diagram shows the steps users go through in DataShare and DataVault. Common steps are deposit and approval.

The DataVault process includes the gathering of funding information, and review and deletion, none of which are present in the DataShare workflow since they would not be relevant to that open research data repository.

The arrow showing DataVault metadata going to the internet represents the copying of selected metadata fields into Pure, where they are accessible as dataset records in the university’s Edinburgh Research Explorer online portal.

Our new course “Archiving Your Research Data”, featuring Sara Thomson, Digital Archivist, provides an introduction to digital preservation for researchers, combined with practical support on how to put digital preservation into practice using the support and systems available here at the University of Edinburgh, such as the DataVault. For future dates and registration information please see our Workshops page.

A recording of an earlier workshop (before the new interface was released) is also available: Archiving Your Research Data Part 1: Long-term Preservation.

If you are a University of Edinburgh principal investigator, academic, or support professional interested in using the Edinburgh DataVault, please get in touch by emailing data-support@ed.ac.uk.

Pauline Ward
Research Data Support Assistant
Library and University Collections
University of Edinburgh