Presenting the Data Vault

Blog post by University of Manchester project developer Tom Higgins:

Yesterday I gave a short presentation on the Data Vault project at an event in Lancaster:

https://www.eventbrite.co.uk/e/research-data-management-solutions-for-your-needs-tickets-17100593335

I based this on the original pitch with a few updates reflecting the work we’ve done over the last couple of months.

Here’s some of the feedback and questions from the event – I think a lot of these are more relevant for “phase 2 and beyond” than the current prototyping:

  • How does the Data Vault differ from iRODS? Perhaps the policy model from iRODS could be useful or iRODS could serve as a back-end. There was a comment that iRODS may be more useful where the researcher’s workflow is known and can be encoded into the system (e.g. it’s deeply involved in the day-to-day active data).
  • Archivematica (being explored by a project in York) can handle many preservation activities but has a specialist user interface which is not suitable for researchers to use directly. Perhaps a Data Vault could be used to ingest data and hand it over to Archivematica for preservation.
  • How would a Data Vault handle sensitive data? Would it need to be certified? What if the “back-end” was using a certified storage system – would that ease the burden at all? I mentioned that perhaps both a “general” and a locked-down “sensitive” instance of the software could be run in parallel.
  • How could a Data Vault handle a dataset that is changing over time? Perhaps snapshots could be captured periodically – would this use a lot of storage space?
  • Could data be ingested from instruments automatically? I think this is an interesting one because the researcher will presumably want to access the data on active storage too (e.g. just ingesting into the vault isn’t particularly useful since you’d then need to pull it back out to actually work with the data, but you may want to have a frozen copy of the raw data too).
  • How could a Data Vault handle complex data e.g. from a database or an object store? In the simple case a user could export their data (e.g. in a backup format) and store that data (similar to how they might back up a database to a USB drive). Does it make sense for a vault to try to understand complex data?

Here are some examples of “Active” and “Archive” systems which might be useful targets for integration:

  • Box
  • Hitachi Content Platform
  • DuraCloud
  • iRODS
  • Archivematica

Data Vault hackathon

The development model we chose for the Data Vault was to get us all in a room (Robin, Tom, Claire, Mary, Stuart) and collaboratively develop the proof of concept system over a few days. We were kindly hosted by the University of Manchester IT services in their Sackville Street building.

We started by looking at the skeleton framework that Tom and Robin had worked on, and then assigned areas of code to each person to write. For example, work was required on the user interface, the broker in the middle that manages the system, and the backend workers that perform the archiving.

All of the code is stored openly on GitHub and is open source under an MIT license.

Work is continuing after the hackathon to complete the few remaining areas of code before the next Jisc Data Spring programme meeting, where we can share the system with others.

Fostering open science in social science

On 10th June, the Data Library team ran two workshops in association with the EU Horizon 2020 project, FOSTER (Facilitate Open Science Training for European Research), and the Scottish Graduate School of Social Science.

The aim of the morning workshop, “Good practice in data management & data sharing with social research,” was to provide new entrants into the Scottish Graduate School of Social Science with a grounding in research data management using our online interactive training resource MANTRA, which covers good practice in data management and issues associated with data sharing.

The morning started with a brief presentation by Robin Rice on ‘open science’ and its meaning for the social sciences. Pauline Ward then demonstrated the importance of data management plans to ensure work is safeguarded and that data sharing is made possible. I introduced MANTRA briefly, and then Laine Ruus assigned different MANTRA units to participants and asked them to go through the units briefly, extract one or two key messages and report back to the rest of the group. After the coffee break we had another presentation on ethics, informed consent and the barriers to sharing, and we finished the morning session with a ‘Do’s and Don’ts’ exercise in which we asked participants to write on post-it notes the things they remembered and were taking with them from the workshop: green for things they should DO, and pink for those they should NOT. Here are some of the points the learners posted:

DO
– consider your usernames & passwords
– read the Data Protection Act
– check funder/institution regulations/policies
– obtain informed consent
– design a clear consent form
– give participants info about the research
– inform participants of how we will manage data
– confidentiality
– label your data with enough info to retrieve it in future
– develop a data management plan
– follow the certain policies when you re-use dataset[s] created by others
– have a clear data storage plan
– think about how & how long you will store your data
– store data in at least 3 places, in at least 2 separate locations
– backup!
– consider how/where you back up your data
– delete or archive old versions
– data preservation
– keep your data safe and secure with the help of facilities of fund bodies or university
– think about sharing
– consider sharing at all stages. Think about who will use my data next
– share data (responsibly)

DON’T
– unclear informed consent
– a sense of forcing participants to be part of research
– do not store sensitive information unless necessary
– don’t staple consent forms to de-identified data records/store them together
– take information security for granted
– assume all software will be able to handle your data
– don’t assume you will remember stuff. Document your data
– assume people understand
– disclose participants’ identity
– leave computer on
– share confidential data
– leave your laptop on the bus!
– leave your laptop on the train!
– leave your files on a train!
– don’t forget it is not just my data, it is public data
– forget to future proof

Robin Rice presenting at FOSTERing Open Science workshop

Our message was that open science will thrive when researchers:

  • organise and version their data files effectively,
  • provide comprehensive and sufficient documentation for others to understand and replicate results and thus cite the source properly,
  • know how to store and transport their data safely and securely (ensuring backup and encryption),
  • understand legal and ethical requirements for managing data about human subjects,
  • recognise the importance of good research data management practice in their own context.

The afternoon workshop on “Overcoming obstacles to sharing data about human subjects” built on one of the main themes introduced in the morning, with a large overlap of attendees. The ethical and regulatory issues in this area can appear daunting. However, data created from research with human subjects are valuable, and therefore are worth sharing for all the same reasons as other research data (impact, transparency, validation etc). So it was heartening to find ourselves working with a group of mostly new PhD students, keen to find ways to anonymise, aggregate, or otherwise transform their data appropriately to allow sharing.

Robin Rice introduced the Data Protection Act, as it relates to research with human subjects, and ethical considerations. Naturally, we directed our participants to MANTRA, which has detailed information on the ethical and practical issues, with specific modules on “Data protection, rights & access” and “Sharing, preservation & licensing”. Of course not all data are suitable for sharing, and there are risks to be considered.

In many cases, data can be anonymised effectively to allow sharing. Richard Welpton from the UK Data Archive shared practical information on anonymisation approaches and tools for ‘statistical disclosure control’, recommending sdcMicroGUI (an R package that provides a graphical interface for carrying out anonymisation techniques, and which should require no knowledge of the R language).

Finally, Dr Niamh Moore from the University of Edinburgh shared her experiences of sharing qualitative data. She spoke about the need to respect the wishes of subjects, her research gathering oral history, and the enthusiasm of many of her human subjects to be named in her research outputs, in a sense to own their own story, their own words.

Rocio von Jungenfeld & Pauline Ward
EDINA and Data Library

ArchivesSpace at the University of Edinburgh – the techie side

Introducing ArchivesSpace for researchers and public users, as well as the administrative side for our Archives Team within the Centre for Research Collections, has been an ongoing project for the last 18 months. It has taken us a while to get the service live for a number of reasons and we have learnt lots along the way.

ArchivesSpace is free open source software and is easy to set up using Jetty and MySQL. However, some of our requirements have meant getting to grips with the underlying set-up and APIs of the system. We have also joined ArchivesSpace as paid members, as this enables us to get additional support through documentation and mailing lists.

Import of authority controls
We had an existing MySQL database containing thousands of authority terms collected by the Archives Team. It was very important for us to keep these and import them into our ArchivesSpace instance. We imported the subjects using the ArchivesSpace API. Learning how to use the API was made easier by the Hudson Molonglo YouTube videos. We wrote simple PHP scripts to connect to the ArchivesSpace backend and import the subjects and agents from MySQL database exports of our existing authority terms. After some trial and error we have imported 9275 subjects and 13703 agents into ArchivesSpace.
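
A minimal sketch of this kind of import, written in Python rather than the PHP mentioned above, might look like the following. It is an illustration only, not the scripts used for the migration. The backend address, the `local` source, the `topical` term type and the `/vocabularies/1` reference are all assumptions, and the field names follow a general understanding of the ArchivesSpace subject record, so check them against the API documentation for your version before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Assumed address of the ArchivesSpace backend; adjust for your installation.
BASE_URL = "http://localhost:8089"


def build_subject(term, term_type="topical", source="local",
                  vocabulary="/vocabularies/1"):
    """Build a subject record holding a single term (field values are assumptions)."""
    return {
        "jsonmodel_type": "subject",
        "source": source,
        "vocabulary": vocabulary,
        "terms": [
            {
                "jsonmodel_type": "term",
                "term": term.strip(),  # tidy stray whitespace from the export
                "term_type": term_type,
                "vocabulary": vocabulary,
            }
        ],
    }


def login(username, password, base_url=BASE_URL):
    """Log in to the backend and return the session token."""
    url = f"{base_url}/users/{urllib.parse.quote(username)}/login"
    data = urllib.parse.urlencode({"password": password}).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as resp:
        return json.load(resp)["session"]


def post_subject(session, subject, base_url=BASE_URL):
    """POST one subject record to the backend and return the parsed response."""
    request = urllib.request.Request(
        f"{base_url}/subjects",
        data=json.dumps(subject).encode(),
        headers={
            "X-ArchivesSpace-Session": session,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

In use, each row exported from the legacy database would be passed through `build_subject` and then `post_subject`, with failures logged so that the trial-and-error stage is easier to follow.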

For a while the authorities were not linking with the resources migrated into ArchivesSpace by the Archives Team via the EAD importer. To enable the authorities to link, we had to make modifications to the EAD importer in the plugins. The changes are available to view on our GitHub code repository. We also made changes to the importer to give us a greater understanding of why EAD imports were failing. The reasons why EAD failed to import have changed as new versions of ArchivesSpace were released, and the EAD importer is quite strict. The Archives Team migrated 16836 resources (including components) for launch on 9th June.

API for other things
We have also used the API to run through all resources imported from EAD and publish them. By default they were not all published and a lot of the notes and details of the resources were hidden from the public interface. Therefore being able to script the publishing was a great time saver.
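
As a rough illustration of what scripting the publishing can look like, here is a short Python sketch. It is not the script used here. It assumes a session token has already been obtained, that the backend lists resource ids at `/repositories/:repo_id/resources?all_ids=true` and publishes a resource with a POST to `/repositories/:repo_id/resources/:id/publish`, and that the backend is reachable at the address shown. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed address of the ArchivesSpace backend; adjust for your installation.
BASE_URL = "http://localhost:8089"


def http_request(session, method, path, base_url=BASE_URL):
    """Send an authenticated request to the backend and return the parsed JSON."""
    req = urllib.request.Request(
        f"{base_url}{path}",
        data=b"" if method == "POST" else None,  # empty body for POST
        method=method,
        headers={"X-ArchivesSpace-Session": session},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def publish_all_resources(session, repo_id, request=http_request):
    """Publish every resource in a repository and return the ids that were processed."""
    ids = request(session, "GET", f"/repositories/{repo_id}/resources?all_ids=true")
    published = []
    for resource_id in ids:
        request(session, "POST",
                f"/repositories/{repo_id}/resources/{resource_id}/publish")
        published.append(resource_id)
    return published
```

The `request` parameter is there so the loop can be tried against a stand-in function before it is pointed at a live system.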

Tomcat set-up
We decided to run ArchivesSpace under Tomcat as it is a web server that we have a lot of experience with. However, ArchivesSpace runs easily under Jetty, and running it under Tomcat has caused us some headaches due to URL issues and the fact that the Tomcat installation script adds a lot of files to Tomcat and not just the web apps.

Customisation
We have customised the user interface for the administrative and public front ends of ArchivesSpace. These changes were made within the local plugin. The look and feel has been made to fit in with our other services, such as collections.ed, and the colour scheme of the University. This was relatively straightforward as the ArchivesSpace UI is based on Twitter Bootstrap. Unfortunately the public UI images were displaying when running in Jetty but not in Tomcat. After some copying of files the images appeared.

The Public ArchivesSpace Portal: http://archives.collections.ed.ac.uk

Early Adopters
It has taken longer than we had initially hoped to launch ArchivesSpace for a number of reasons. Primarily, as early adopters of the software, we ran into issues that we did not foresee when the initial version was made available. The ArchivesSpace members mailing list is very active: as it is a new system, there are lots of shared questions from those getting to grips with it and working through their implementation. The ArchivesSpace team, particularly Chris Fitzpatrick, have helped steer us in the right direction and shared code. The migration of EAD has been a huge task, undertaken by the Deputy Archives Manager, Grant Buttars. It has been great to work with him and to get a greater understanding of the format of EAD when resolving issues with failing imports.

We still have lots to do with the system to leverage its full functionality and fully showcase our amazing archives collection through links to http://collections.ed.ac.uk and our image repository. So watch this space.

This post follows on from Grant’s post https://libraryblogs.is.ed.ac.uk/edinburghuniversityarchives/2015/06/22/implementing-archivesspace/

Claire Knowles
Library Digital Development Team

Instrumental Challenges

Last week saw the start of a new project: photographing many of the University’s Musical Instruments while they are in storage at the Library during the re-development of St. Cecilia’s Music Hall. These images are planned for use in the new museum space, in printed materials, for social media and in interactive apps. The only guidance we have been given is ‘coffee-table book’, which gives the DIU team huge scope for interpretation and creativity. As the project progresses we hope to bring 3D photography into the mix, but for starters, this week the musical instruments team brought me 3 items for some studio shots.

The first was a Triple-fretted clavichord, possibly Flemish and c1620 (ref. 4486). Although this piece was quite simple and unadorned, it did have a bright red ribbon woven through the strings and the keys made a beautiful pattern, so I decided on a detail shot to highlight the mechanism.

The second item was a Rahab from Western Malaysia, c1977 (ref. 2101). This was a far more ornate and colourful piece. In fact, I was torn: both the front and back of the instrument presented interesting features to photograph, but how to get both sides at once? While at the Rijksmuseum conference Malcolm and I were impressed by their use of a black reflective surface in the photography of fashion accessories (see https://www.rijksmuseum.nl/formats/accessoires/index.jsp?lang=en). Malcolm suggested that we might be able to get a similar effect using a piece of black velvet and some glass, so I set up the studio to try it out. In the end I chose an angle looking down on the instrument that allowed details of both the strings and the red woollen back to be seen, while the reflection adds further interest to the shot.

The final piece presented quite a different challenge. It is very rare that an object comes to us that leaves me scratching my head, but the ‘Jingling Johnny’ or Chapeau (ref. 6110) certainly did. It is a large, top-heavy, shiny brass instrument covered with dangling bells and fragile metalwork set atop a stick, so how to keep it upright and perfectly still? The many shiny surfaces mean that we will need to build a light tent to minimise reflections. This was clearly going to require some thought and planning, so we reluctantly decided to return this one to the store and reconvene another day!

In the coming months we will keep you posted on the project’s progress.

Susan Pettigrew, Photographer

Implementing ArchivesSpace

Meeting our Needs

Since the early 2000s we have been looking for suitable software to manage our archives in a holistic manner. We began to deliver online catalogues at this time via various project initiatives, with metadata encoded as EAD/xml, but this only dealt with resource discovery and was quite cumbersome. Moreover, along with other digital developments, the work inhabited one of a number of parallel silos.

As time moved on, we got better at developing systems to move different elements of work from the analogue to the digital but were still some way off developing or finding a comprehensive, robust and sustainable way to join things up in a meaningful way. This changed when we began to investigate Archivists’ Toolkit in 2011. Although we had looked at it in one of its earlier versions, we were surprised to see how much subsequent developments had brought it quite close to ticking everything on our wish list. It was lacking a resource discovery layer but a successor product, ArchivesSpace, was already planned and would include this.

From Archivists’ Toolkit to ArchivesSpace

We therefore began looking at Archivists’ Toolkit in more detail, assessing issues such as functionality and usability but also those of sustainability and interoperability. It scored very highly, high enough for us to be able to make the business case to commit to ArchivesSpace and obtain the internal funding to sign up as Members.

The involvement of the profession in the development of ArchivesSpace has been and continues to be crucial. What has been developed is not just other people’s idea of what the product needs to be but what we as archivists actually require. Although heavily influenced by the predominant US partners and the specifics of US practice, it has been developed in a way that is equally intelligible to others and easily customisable to reflect local needs and terminology.

Priorities and Impact

We originally focused on moving our behind-the-scenes work over but then switched to frontloading our resource discovery, migrating existing EAD XML files and also retro-converting a wide range of old spreadsheets, databases and similar. In terms of impact, this not only provides evidence that our business case was sound but, most importantly, meets growing user expectations of what an online catalogue should deliver.

Phase one saw the delivery of nearly 17,000 catalogue records along with over 22,000 authority terms. We still have more to add, along with a whole range of management metadata about accessioning, locations etc. This will feature in Phase 2.

Because the source metadata has been drawn from a variety of legacy sources, there are issues of consistency and quality to be addressed. These are outstanding issues which could never be solved just by getting the metadata into ArchivesSpace. However, with all the metadata now in one place, we can look to quantify and rectify them. Experience told us that users would often rather have partial metadata than no metadata at all, so we chose to go for a warts-and-all approach, only correcting what was obviously erroneous at this stage.

Community and Participation

We are proud to have signed up as the first European partner and value the support we have had from a growing community of ArchivesSpace users and developers. This discussion is also two-way, with us feeding ideas back for future development.

Locally we are also more fully integrated into developing solutions that deliver all our collections online, through a suite of applications and interfaces that work together, improving user experience and improving how we manage the collections themselves.

Next Steps

We still have lots to do to leverage the full functionality of the system and fully showcase our amazing archives collection. So watch this space.

View the online catalogue.

Read about this from a technical perspective

Longforgan Free Church Ministers Library now catalogued online

I’m pleased to report that the Longforgan Free Church Ministers Library has now been catalogued online. A big thank you to the two project cataloguers who have tackled this collection, Finlay West and Patrick Murray!

Standard documents connected with the Free Church of Scotland ... / issued by the authority of the Publication Committee of the General Assembly. Edinburgh : John D. Lowe, 69, George Street, 1847. New College Library LON 864

Let me introduce myself!

I am Thais, a first-year conservation student at Northumbria University, and I am now part way through a four-week placement at the conservation studio in the CRC (Centre for Research Collections) at the Main Library of Edinburgh University.

I have a bachelor’s degree in art, during which I developed an enormous interest in conservation. I had the opportunity to undertake various placements, which helped me to decide that I wished to pursue a career in paper conservation. This led me to enrol in a course in the conservation of documents and graphic material in Brazil. I worked primarily in preventive conservation, where I was always looking for further opportunities to improve my knowledge and professional skills. This has subsequently led me to the UK to study for my Masters in the Conservation of Fine Art on Paper at Northumbria University, and to my current placement at the CRC.

Thais carrying out treatment in the Conservation Studio

Apart from all the people I’ve met at the CRC and their amazing passion for showing and talking about their work, I was presented with the Thomson-Walker print collection: a group of 2500 prints collected by the surgeon Sir John William Thomson-Walker that were hinged or partially fixed on to poor quality backing boards.

A brief condition report for each print was made before any conservation treatment was undertaken. Once the appropriate documentation had been completed, the hinges could be cut using a scalpel, separating the prints from the boards. The prints were then cleaned using a chemical sponge, carefully removing any debris and dirt. Surface cleaning is an important conservation procedure not only for aesthetic reasons but also to remove material that may cause abrasion or acidity or attract insects (e.g. food or mould residue).

Once the prints were surface cleaned, the paper and adhesive used on the hinges could then be removed. Samantha Cawson, during her internship at the beginning of this year, started this project and tried different approaches to remove the adhesive from the prints. She observed that a carboxymethyl cellulose poultice, interlayered with tissue paper, would be most effective on the range of adhesives present in this collection. The poultice technique consists of applying a small amount of moisture to a specific area, in this case to soften the water-based adhesive, thus allowing the hinges to be removed with a metal spatula or a scalpel.

For those prints that were glued directly onto a backing board, I was able to reduce the board to a fine layer using a scalpel. This allowed the poultice to be placed under a light weight on the area where there was adhesive, as before, but for a few more minutes.

These have been my first two weeks at the conservation studio, and I’m glad to say that the prints treated look much better.

I’m very happy to have the opportunity to practise and improve my skills at the CRC and to observe the distinct responses to treatment of various media and supports. I’m excited about the upcoming projects that I will work on during my stay, as well as the chance to see some of the amazing work developing in the conservation studio.

Post by Thais Biazioli, Conservation Student Placement

Trial access to History of Science and Medicine Library E-Books

We have trial access to the History of Science and Medicine Library e-book collection from Brill until 9th July.

The History of Science and Medicine Library is a peer-reviewed book series devoted to the history of science and medicine, covering both the history of scientific theory and the history of the role of science in society and culture from early modern times to the present. The medical studies include medical theory and practice as well as medicine and society.

A list of the titles on trial is available here.

 

Feedback and further info

We are interested to know what you think of this e-book collection, as your comments influence purchase decisions, so please fill out our feedback form.

A list of all trials currently available to University of Edinburgh staff and students can be found on our trials webpage.
