February 11, 2025
In my last blog post, I looked at the four quadrants of research data curation systems. That post categorised systems that manage or describe research data assets by whether their primary role is to store metadata or data, and whether the information is for private or public use. Four systems were then placed into these quadrants.
The University of Edinburgh already has two active services from this diagram: PURE, our Current Research Information System, and DataShare, our open data repository.
This blog post will start to unpack some of the requirements for a Data Asset Register.
The first aspect to cover is its name. What should it be called? Traditionally systems like this, which only hold metadata records that either just describe, or describe and point to other resources, are known as registers, catalogues, directories, indexes, or inventories.
The University already has a ‘Data Catalogue’, maintained by the Data Library. However, this list has a different purpose: to hold details of external data. Oxford University, rather than opting for a name such as this, have chosen to name their service after the verb ‘find’: DataFinder. Whilst some brand or service name may eventually be applied to the system we create at the University of Edinburgh, for now its working title is ‘Data Asset Register’, as one of its main functions will be to allow data creators to ‘register’ their data assets by describing them and, if the data is published online, linking to the data.
But what should the Data Asset Register provide? The following diagram shows some early thoughts:
The diagram splits this up into three broad areas:
The core purpose of the system is to describe data. This is split into two categories: being able to describe single items or data assets, and describing collections of data assets. Many data assets are created on their own, for example a population health longitudinal study. As such, this should be described on its own. In contrast, some data are created in large sets, where it isn’t necessarily useful to describe every part of that set on its own. In this case, the collection as a whole can be described. A good example of this is the Research Data Australia service from the Australian National Data Service.
We’ll need to decide how to describe the data. A likely initial candidate is the DataCite Metadata Schema, but we may find this needs to be extended to cover extra elements relevant to the University or to the discipline of the data asset being described. There will also be requirements coming from a possible UK research data registry, development of which is being led by the Digital Curation Centre.
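To make the idea of describing single assets and collections a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a design: the class names `DataAsset` and `DataCollection` are hypothetical, the field names are loosely modelled on the core DataCite properties (title, creator, publisher, publication year, resource type, identifier), and the `extra` dictionary stands in for whatever University or discipline-specific extensions might turn out to be needed. The creator and URL in the example are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataAsset:
    """A metadata-only record describing one data asset (hypothetical model)."""
    title: str
    creators: List[str]
    publisher: str
    publication_year: int
    resource_type: str = "Dataset"
    identifier: Optional[str] = None   # e.g. a DOI, if one has been assigned
    url: Optional[str] = None          # link to the data, if published online
    extra: Dict[str, str] = field(default_factory=dict)  # local or discipline extensions

    def is_published(self) -> bool:
        # The register only holds metadata; the data itself lives elsewhere, if anywhere.
        return self.url is not None


@dataclass
class DataCollection:
    """A record describing a large set of data as a whole (hypothetical model)."""
    title: str
    description: str
    members: List[DataAsset] = field(default_factory=list)


# Illustrative values only.
study = DataAsset(
    title="Population health longitudinal study",
    creators=["Example, A."],
    publisher="University of Edinburgh",
    publication_year=2014,
)
published = DataAsset(
    title="Example published dataset",
    creators=["Example, B."],
    publisher="University of Edinburgh",
    publication_year=2014,
    url="https://example.org/dataset",
)
collection = DataCollection(
    title="Example collection",
    description="A large set described as a whole rather than item by item.",
    members=[published],
)
```

An unpublished asset such as `study` can still be registered; it simply has no link, which matches the idea that the register describes data and points to it only where it is available online.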
In order to enable data asset description, a register will need certain functions. So far three have been identified:
“Research organisations will ensure that EPSRC-funded research data is securely preserved for a minimum of 10-years from the date that any researcher ‘privileged access’ period expires or, if others have accessed the data, from last date on which access to the data was requested by a third party;”
It may also be that the Data Asset Register can be a front-end for our Data Vault too – more about that in another blog post!
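The EPSRC expectation quoted above boils down to a small date calculation, which a register could perform for each record. The sketch below is one possible reading of it, not an official interpretation: it assumes the ten-year clock starts from the later of the date the privileged access period expires and the most recent third-party access request. The function names and the handling of 29 February are my own choices.

```python
from datetime import date
from typing import Iterable

RETENTION_YEARS = 10  # minimum period stated in the EPSRC expectation


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later, falling back to 28 Feb for 29 Feb."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def retain_until(privileged_access_expires: date,
                 third_party_requests: Iterable[date] = ()) -> date:
    """Earliest date on which the minimum retention period has elapsed.

    Assumption: the clock restarts from the most recent third-party access
    request, if that is later than the privileged access expiry date.
    """
    start = max([privileged_access_expires, *third_party_requests])
    return add_years(start, RETENTION_YEARS)
```

For example, `retain_until(date(2015, 1, 1), [date(2018, 6, 30)])` gives 30 June 2028, because the later access request resets the ten-year period.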
Extra value-added services are required in order to make the Data Asset Register useful to people. Our initial thoughts about these services include the following:
It is very early days in our thinking about what features a Data Asset Register should offer, and like many components of a modern research data management infrastructure, there are very few existing examples to look at. Our thoughts will be refined over the coming months so that we can start looking at implementation options. Is there an existing system that can do all of this for us, or is it better to build something new, either alone or with collaborators?
Images available from http://dx.doi.org/10.6084/m9.figshare.873617