DataVault – larger deposits and new review process notifications

New deposit size limit: 10 TB

Great news for DataVault users: you can now deposit up to a whopping ten terabytes in a single deposit in the Edinburgh DataVault! That’s five times greater than the previous deposit limit, saving you time that might have been wasted splitting your data artificially and making multiple deposits.

It’s still a good idea to divide up your data into deposits that correspond well to whatever subsets of the dataset you and your colleagues are likely to want to retrieve at any one time. That’s because you can only retrieve a single deposit in its entirety; you cannot select individual files in the deposit to retrieve. Smaller deposits are quicker to retrieve. And remember you’ll need enough space for the retrieved data to arrive in.
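Before you deposit, it can help to check how big a candidate subset actually is. Here is a minimal sketch, not an official DataVault tool: the function names are made up for illustration, and it assumes "10 TB" means 10 × 10^12 bytes, which this post does not specify.

```python
import os
import shutil

# Assumption: "10 TB" is read as 10 * 10**12 bytes. The announcement does not
# say whether the limit is counted in decimal or binary units.
DEPOSIT_LIMIT_BYTES = 10 * 10**12


def folder_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so that linked files are not counted.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_one_deposit(size_bytes, limit_bytes=DEPOSIT_LIMIT_BYTES):
    """Return True if data of this size is within the assumed deposit limit."""
    return size_bytes <= limit_bytes


def has_room_for_retrieval(size_bytes, target_dir):
    """Return True if target_dir's filesystem has enough free space."""
    return shutil.disk_usage(target_dir).free >= size_bytes
```

For example, `fits_in_one_deposit(folder_size_bytes("my_subset"))` tells you whether the (hypothetical) folder `my_subset` could go in as one deposit under that assumption, and `has_room_for_retrieval` does the matching check on the area you plan to retrieve into.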

We’ve made some performance improvements thanks to our brilliant technical team, so depositing now goes significantly faster. Nonetheless, please bear in mind that any deposit of multiple terabytes will probably take several days to complete (depending on how many deposits are queueing and some characteristics of the fileset), because the DataVault needs time to encrypt the data and store it on the tape archives and in the cloud. Remember not to delete your original copy from your working area on DataStore until you receive our email confirming that the deposit has completed!

And you can archive as many deposits as you like into a vault, as long as you have the resources to pay the bill when we send you the eIT!

A reminder on how to structure your data:
https://www.ed.ac.uk/information-services/research-support/research-data-service/after/datavault/prepare-datavault/structure

Ensuring good stewardship of your data through the review process

Another great feature that’s now up and running is the review process notification system, and the accompanying dashboard which allows the curators to implement decisions about retaining or deleting data.

As a vault owner, you should receive an email when the chosen review date is six months away, seeking your involvement in the review process. The email will provide you with the information you need about when the funder’s minimum retention period (if there is one) expires, and how to access the vault. Don’t worry if you think you might have moved on by then; the system is designed to allow the University to implement good stewardship of all the data vaults, even when the Principal Investigator (PI) is no longer contactable. Our curators use a review dashboard to see all vaults whose review dates are approaching, and who the Nominated Data Managers (NDMs) are. In the absence of the Owner, the system notifies the NDMs instead. We will consult with the NDMs or the School about the vault, to ensure all deposits that should be deleted are deleted in good time, and all deposits that should be kept longer are kept safe and sound and still accessible to all authorised users.
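The timing and fallback described above can be sketched roughly as follows. This is an illustration only, not the DataVault's actual code: the function names are hypothetical, and it assumes "six months" means six calendar months, with the day clamped to the end of shorter months.

```python
import calendar
from datetime import date


def six_months_before(review_date):
    """Return the date six calendar months before review_date.

    Assumption: if the target month is shorter, the day is clamped to that
    month's last day (e.g. 31 August maps to 28 or 29 February).
    """
    month_index = review_date.year * 12 + (review_date.month - 1) - 6
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(review_date.day, last_day))


def notification_recipients(owner_email, ndm_emails):
    """Return the Owner if contactable, otherwise the NDMs.

    owner_email is None when the Owner is no longer contactable.
    """
    if owner_email is not None:
        return [owner_email]
    return list(ndm_emails)
```

Under these assumptions, a vault with a review date of 15 March 2025 would trigger its email on 15 September 2024, and the message would go to the NDMs only if no Owner address were available.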

DataVault Review Process:
https://www.ed.ac.uk/information-services/research-support/research-data-service/after/datavault/review-process 

The new max. deposit size of 10 TB is equivalent to over five million images of around 2 MB each – that’s one selfie for every person in Scotland. Image: A selfie on the cliffs at Bell Hill, St Abbs
cc-by-sa/2.0 – © Walter Baxter – geograph.org.uk/p/5967905
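
If you would like to check the caption's arithmetic, here is a quick sketch. The helper name is ours, and because the post does not say whether decimal or binary units are meant, both readings are shown.

```python
def images_per_deposit(limit_bytes, image_bytes):
    """Return how many whole images of image_bytes fit in limit_bytes."""
    return limit_bytes // image_bytes


# Decimal reading: 10 TB = 10 * 10**12 bytes, 2 MB = 2 * 10**6 bytes.
decimal_count = images_per_deposit(10 * 10**12, 2 * 10**6)

# Binary reading: 10 TiB = 10 * 2**40 bytes, 2 MiB = 2 * 2**20 bytes.
binary_count = images_per_deposit(10 * 2**40, 2 * 2**20)

print(decimal_count)  # 5000000
print(binary_count)   # 5242880
```

So the figure is exactly five million on the decimal reading and a little over five million on the binary one.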

Pauline Ward
Research Data Support Assistant
Library & University Collections

DataVault user roles let you share access to archived data

The Edinburgh DataVault is a secure long-term retention solution for research data.

Thanks to the hard work of our software developers in the Digital Library and EDINA, the Edinburgh DataVault now supports five different user roles. This means busy PIs can delegate the work of depositing and retrieving data to members of their team or other collaborators within the University. It also allows PIs to nominate support staff to deposit and retrieve data on their behalf, or grant access to new members of their team.

Diagram representing a PI and two postdocs using the roles of Owner and Nominated Data Manager to share access to data in the DataVault

There are five user roles:

  • Data Owner
    Usually the Principal Investigator. Can add/remove other users to their vault(s).
  • Nominated Data Manager (of a given vault)
    Can view and edit metadata fields, deposit data and retrieve any deposit in the vault. May add/remove Depositors to the vault.
  • Depositor (of a given vault)
    Can view the vault contents, deposit data and retrieve any deposit in the vault.
  • School Support Officer
    Acting on behalf of the Head of School, may view all vaults and associated deposits belonging to the School.
  • School Data Manager
    Assigned only with the express permission of the Head of School, may view, deposit into and retrieve data from any vault belonging to the School.

Full details of the permissions associated with each role:
Roles and permissions
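
As a rough illustration, the role summaries above can be written down as a simple lookup table. This sketch records only what the list in this post states; the permission names and the `summary_lists` helper are our own invention, not the DataVault's data model, and the full "Roles and permissions" page remains the authoritative source (for instance, the list above mentions only user management for the Data Owner).

```python
from enum import Enum, auto


class Permission(Enum):
    VIEW = auto()
    EDIT_METADATA = auto()
    DEPOSIT = auto()
    RETRIEVE = auto()
    MANAGE_DEPOSITORS = auto()
    MANAGE_USERS = auto()


# Each role maps to (scope, permissions named in the summary list).
# Scope is "vault" for roles tied to a given vault and "school" for roles
# covering every vault belonging to a School.
ROLE_SUMMARY = {
    "Data Owner": ("vault", {Permission.MANAGE_USERS}),
    "Nominated Data Manager": ("vault", {
        Permission.VIEW,
        Permission.EDIT_METADATA,
        Permission.DEPOSIT,
        Permission.RETRIEVE,
        Permission.MANAGE_DEPOSITORS,
    }),
    "Depositor": ("vault", {
        Permission.VIEW,
        Permission.DEPOSIT,
        Permission.RETRIEVE,
    }),
    "School Support Officer": ("school", {Permission.VIEW}),
    "School Data Manager": ("school", {
        Permission.VIEW,
        Permission.DEPOSIT,
        Permission.RETRIEVE,
    }),
}


def summary_lists(role, permission):
    """Return True if the summary list names this permission for the role."""
    _scope, permissions = ROLE_SUMMARY[role]
    return permission in permissions
```

For example, `summary_lists("Depositor", Permission.RETRIEVE)` is true, while `summary_lists("School Support Officer", Permission.DEPOSIT)` is false, matching the view-only description of that role.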

Support staff who need to view reporting data for their School, or admin access to their School’s vaults, should attend our training – Edinburgh DataVault: supporting users archiving their research data.

Further information on why and how to use the DataVault is available on the Research Data Service website:
DataVault long-term retention

If you have any questions about using DataVault, please don’t hesitate to contact the Research Data Support team at data-support@ed.ac.uk.

Pauline Ward, Research Data Support Assistant
Library and University Collections
@PaulineData

New research data management tool on one-year trial: protocols.io

Information Services aims to offer a research data service that meets most of the data lifecycle needs of the majority of UoE researchers without interfering with their freedom to choose tools and technologies which suit their work. In some cases cloud tools that are free to individual users are offered commercially as enterprise versions, allowing groups of researchers (such as lab groups) to work together efficiently.

The service’s steering group has agreed a set of criteria to apply when a tool is put forward by a research group for adoption. The criteria were developed after our two-year trial of the electronic lab notebook software, RSpace, and have been most recently applied to protocols.io. The protocols.io trial begins this month and will run for one year. An evaluation will determine whether to continue the enterprise subscription and how to fund it.

protocols.io is an online platform for the creation, management, and sharing of research protocols or methods. Users can create new protocols within the system, or upload existing methods and digitise them. Those with access to a protocol can then update, annotate, or fork it so that it can be continually improved and developed. There is interoperability with GitHub and RSpace, and long-term preservation of protocols through CLOCKSS.

Users can publish their protocol(s), making them freely available for others to use and cite, or, with the enterprise version, keep them private. The tool supports the Open Science / Open Research agenda by helping to ensure that methods used to produce data and publications are made available, assisting with reproducibility.

Subscribing to the University plan will allow research groups to organise their methods and ensure that knowledge is not lost as trainees graduate and postdoctoral researchers move on. There are currently over 70 University of Edinburgh researchers registered to use protocols.io. You may follow these instructions to move your current protocols.io account to the premium university version. For more information contact data-support@ed.ac.uk.

Kerry Miller and Robin Rice
Research Data Support team

Research Data Service achieves ISO 27001 accreditation for Data Safe Haven facility

Following a five-day on-site audit by Lloyd’s Register, the Information Security Management System (ISMS) which forms the basis for the Data Safe Haven facility for University of Edinburgh researchers has been officially certified to the ISO/IEC 27001:2013 standard. In a few weeks we will receive a certificate from UKAS (United Kingdom Accreditation Service).

The Data Safe Haven (DSH) team, comprising members of Research Data Support in L&UC and Research Services in ITI, and with input from the Information Security team and external consultants, has been working toward certification since 2016. The system, designed by ITI’s Stephen Giles, has been extensively and successfully ‘white box penetration tested’ by external experts, one of the many forms of proof provided to the auditor. (White box means the testers were given access to certain layers of the system, as opposed to a black box test where they are not.)

The steel cage surrounding Data Safe Haven equipment in one of the University data centres.

In addition to infrastructure, a proper ISMS is made up of people who perform roles and manage procedures, based on organisational policies. The Research Data Support team work with research project staff to ensure their practices comply with our standard operating procedures. The ISMS is made up of all the controls needed to ensure that it is sensibly protecting the confidentiality, availability, and integrity of assets from threats and vulnerabilities. Over 150 managed and versioned documents covering every aspect of the ISMS were written, discussed, practised, reviewed and signed off before being examined and questioned by the auditor.

The auditor stated in the final report, “The objectives of the assessment were achieved and with consideration to any noted issues or raised findings, the sampled areas of the management system demonstrated a good level of conformance and effectiveness. The management system remains supportive of the organisation and its business and service management objectives.” On a slightly more upbeat note, Gavin McLachlan, Vice-Principal and Chief Information Officer, and Librarian to the University, said by email, “Congratulations to you and the whole team on the ISO 27001 certification. That is a great achievement.”

The Digital Research Services programme has invested in the Data Safe Haven to allow University researchers to conduct cutting-edge research, access sensitive data from external providers and facilitate new research partnerships and innovation. Researchers are expected to include Data Safe Haven costs in funded grant proposals to achieve some cost recovery for the University. To find out if your project is a candidate for use of the Data Safe Haven, contact data-support@ed.ac.uk or the IS Helpline.

Robin Rice
Data Librarian and Head, Research Data Support
L&UC