Edinburgh DataShare receives ‘Data Seal of Approval’

Earlier this week DataShare received the Data Seal of Approval – a peer review certification for trusted digital repository (TDR) status. The award is reviewed every two years.

Edinburgh DataShare self-assessment statements for each of the 16 metrics (which express roles and responsibilities of data producer, data repository and data consumer) can be viewed on the DSA website at: https://assessment.datasealofapproval.org/assessment_175/seal/pdf/ (note: liberal use of white space). We aim to publish the actual seal on the home page of DataShare as part of the upcoming major release (2.0).

For more information about DSA see our web page, http://www.ed.ac.uk/information-services/research-support/data-library/data-repository/trustworthiness

Note: a paper will be published in the forthcoming IASSIST Quarterly showcasing institutional implementations of DSA. This follows on from a successful panel session at the IASSIST Conference at the University of Minnesota, Minneapolis, in June (see: http://iassist2015.pop.umn.edu/program/block6#a4).

DSA are also currently in discussion with ICSU World Data System to produce a harmonised discipline-agnostic self-assessment TDR certification scheme. This should be in place some time in 2016.

Stuart Macdonald
Associate Data Librarian

Jisc Data Vault update

Posted on behalf of Claire Knowles

Research data are being generated at an ever-increasing rate. This brings challenges in how to store, analyse, and care for the data. Part of this problem is the long-term stewardship of researchers’ private data and associated files, which need a safe and secure home for the medium to long term.

The Data Vault project, funded by the Jisc #DataSpring programme, seeks to define and develop a Data Vault software platform that will allow data creators to describe and store their data safely in one of the growing number of options for archival storage. This may include cloud solutions, shared storage systems, or local infrastructure.

Future users of the Data Vault are invited to Edinburgh on 5th November, to help shape the development work through discussions on: use cases, example data, retention policies, and metadata with the project team.

Book your place at: https://www.eventbrite.co.uk/e/data-vault-community-event-edinburgh-tickets-18900011443

The aims of the second phase of the project are to deliver a first complete version of the platform by the end of November, including:

  • Authentication and authorisation
  • Integration with more storage options
  • Management / monitoring interface
  • Example interface to CRIS (PURE)
  • Development of retention and review policy
  • Scalability testing

Working towards these goals, the project team have had monthly face-to-face meetings, with regular Skype calls in between. The development work is progressing steadily, as you can see via the GitHub repository: https://github.com/DataVault, where there have now been over 300 commits. Progress is also tracked on the open Project Plan, where anyone can add comments.

So remember, remember the 5th November and book your ticket.

Claire Knowles, Library & University Collections, on behalf of the JISC Data Vault Project Team

Research Data Alliance – report from the 6th Plenary

The Research Data Alliance or RDA is growing about as fast as the data all around us. It got off the ground in 2012 with the support of major research funders in Europe, the US and Australia and has since grown to over 3,000 members. The latest plenary in Paris set a new registration record of ~700 ‘data folk’ including data scientists, data managers, librarians and policy-makers. The theme was Enterprise Engagement with a focus on Research Data for Climate Change.

Not an ordinary conference

What sets RDA apart from other data-related organisations is not just the size of its gatherings, but its emphasis on making change. Parallel sessions are filled not with individual presentations of research papers, but with collaborative activities that lead to outputs that can be used in the real world. Working groups are approved by governance structures and coalesce around actual problems that cannot be solved by individual organisations but require new top-level approaches. They are required to produce their deliverables and close shop after an 18-month period. Interest groups are allowed to exist longer, but are encouraged to spin off working groups to address changes as they are identified through group discussion.

Hard-working groups

Since 2012, these working groups have produced some impressive deliverables and pilots that if implemented across the Web and across organisations and countries could speed up research and improve reproducibility. They are governed by an elected group of experts, worldwide. Some current active projects are:

  • Data Foundation and Terminology WG: defining harmonised terminology for diverse communities used to their own data ‘language’
  • Data Type Registries WG: building software to implement a DTR that can automatically match up unknown dataset ‘types’ with relevant services or applications (such as a viewer)
  • PID Information Types WG: Creating a single common API for delivering checksums from multiple persistent identifier service providers (DataCite and others)
  • Practical policy WG: building on a previous WG that collected various machine-actionable policies practiced by different data centres and repositories, this group will register the policies to encourage repository managers to move towards a harmonised set.
  • Scalable Dynamic Data Citation WG: to solve the difficulty of properly citing dynamic data sources, the recommended solution allows users to re-execute a query with the original time stamp and retrieve the original data or to obtain the current version of the data.
  • Data Description Registry Interoperability WG: to solve the problem of datasets scattered across repositories and data registries, the group is building the Research Data Switchboard, which links datasets across platforms.
  • Metadata Standards Directory WG: By guiding researchers towards the metadata standards and tools relevant to their discipline, the directory drives up adoption of those standards, improving the chances of future researchers finding and using the data.
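
To make the dynamic data citation idea above a little more concrete, here is a minimal Python sketch. It is not the working group's implementation: the VersionedStore class, its put and query methods, and the logical clock are all hypothetical names chosen for illustration. The only assumption is that every write is kept with a timestamp, so a citation can be stored as a query plus the time at which it was run.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class RecordVersion:
    key: str
    value: float
    valid_from: int                 # logical time at which this version was written
    valid_to: Optional[int] = None  # None means this version is still current


class VersionedStore:
    """Hypothetical store that never overwrites: each update adds a new version."""

    def __init__(self) -> None:
        self._versions: List[RecordVersion] = []
        self._clock = 0  # simple logical clock standing in for a real timestamp

    def put(self, key: str, value: float) -> int:
        """Write a value and return the timestamp of the write."""
        self._clock += 1
        # Close the previously current version of this key, if there is one.
        for i, v in enumerate(self._versions):
            if v.key == key and v.valid_to is None:
                self._versions[i] = RecordVersion(v.key, v.value, v.valid_from, self._clock)
        self._versions.append(RecordVersion(key, value, self._clock))
        return self._clock

    def query(
        self,
        predicate: Callable[[RecordVersion], bool],
        as_of: Optional[int] = None,
    ) -> List[Tuple[str, float]]:
        """Run the query against the data as it stood at `as_of` (default: now)."""
        ts = self._clock if as_of is None else as_of
        return sorted(
            (v.key, v.value)
            for v in self._versions
            if v.valid_from <= ts
            and (v.valid_to is None or ts < v.valid_to)
            and predicate(v)
        )


def positive(v: RecordVersion) -> bool:
    """Example query: keep records with a positive value."""
    return v.value > 0


store = VersionedStore()
store.put("site-a", 1.0)
cited_at = store.put("site-b", 2.0)  # the "citation" is (positive, cited_at)
store.put("site-a", 5.0)             # later correction to site-a
store.put("site-c", 7.0)             # later addition

original = store.query(positive, as_of=cited_at)  # data exactly as cited
current = store.query(positive)                   # same query, latest data
print(original)
print(current)
```

Re-running the stored query with its original timestamp returns the data as it was cited, while omitting the timestamp returns the current version, which are the two options described in the bullet above.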

Members of the RDM team have been involved in library and repository-related interest groups and Birds of a Feather groups, where surveys of current practice have circulated.

Not all men at RDA! Dame Wendy Hall from the Web Science Institute leads a Women’s Networking Breakfast – photo courtesy of @RDA_Europe

RDA and climate change

Climate science was prominent in the 6th RDA plenary. This was due not only to the imminent Paris-based United Nations COP talks, but also to the critical importance of these issues for the world today. For some years, driven by the climate model inter-comparison work underpinning Intergovernmental Panel on Climate Change (IPCC) reports and the massive datasets from Earth observation, climate science has been located at an intersection of high performance computing, big data management, and services to support and stimulate research, commerce, and governmental initiatives.

Assessment of the risks posed by climate change, and of strategies for adaptation and mitigation, sharpens the need not only to solve the technical problems of bringing together diverse data (social, soil, climate, land-use, commercial,…) but also to address the policy challenges, given the diverse organisations needing to cooperate. This is a domain that builds on services that give access to data and allow computation close to the data, enabled by e-infrastructure (such as EGI), and one that requires ever stronger approaches to brokering these resources and services, to permit their orchestration and integration.

Among initiatives presented in the climate-related sessions were:

  • GEOSS – The GEOSS Common Infrastructure allows the user of Earth observations to access, search and use the data, information, tools and services available through the Global Earth Observation System of Systems
  • Global Agricultural Monitoring (GEOGLAM) initiative in response to the growing calls for improved agricultural information.
  • An RDA group focused on wheat – the volatility in prices, in part driven by climate unpredictability, has become a major concern.
  • The IPSL Mesocentre
  • IS-ENES, developing services especially for climate modelling
  • Copernicus, seeking to “support policymakers, business, and citizens with improved environmental information. Copernicus integrates satellite and in-situ data with modeling to provide user-focused information services”
  • CLIPC will provide access to climate datasets, and software and information to assess indicators for climate impact.

Dr. Mike Mineter, School of GeoSciences and Robin Rice, EDINA and Data Library