Data Visualisation with D3 workshop

Last week I attended the 4th HSS Digital Day of Ideas 2015. Amongst networking and some interesting presentations on the use of digital technologies in humanities research (the two presentations I attended focused on analysis and visualisation of historical records), I attended the hands-on ‘Data Visualisation with D3’ workshop run by Uta Hinrichs, which I thoroughly enjoyed.

The workshop was a crash course in visualising data by combining the d3.js and leaflet.js libraries with HTML, SVG, and CSS. For this, we needed to have installed a text editor (e.g. Notepad++, TextWrangler) and a server environment for local development (e.g. WAMP, MAMP). With the software installed beforehand, I was ready to script as soon as I got there. It was recommended that we use Chrome (or Safari), as it seems to work well with JavaScript and offers good developer tools.

First, we started with the basics of how the d3.js library and other JavaScript libraries, such as jQuery or Leaflet, are incorporated into basic HTML pages. D3 is an open source library developed by Mike Bostock. All the ‘visualisation magic’ happens in the browser, which loads the HTML file and runs the scripts; any messages they produce are shown in the developer console. The data used in the visualisation is downloaded to the browser along with the page, so you cannot hide it from the user.

For this visualisation (D3 Visual Elements), the browser uses the HTML file to load the d3.js library and the data. In this example, the HTML contains a bit of CSS and an SVG (Scalable Vector Graphics) element, along with a d3.js script that pulls data from a CSV file with two columns: author and number of books. The visualisation displays the authors’ names and bars representing the number of books each author has written. The bars change colour and display the number of books when you hover over them.

Visualising CSV data with D3 JavaScript library
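To give a flavour of what the bar chart script does under the hood, here is a minimal sketch in plain JavaScript. It is not the workshop code and does not use d3.js itself: the sample data, the column names `author` and `books`, and the helper functions `parseCsv`, `linearScale` and `barChartSvg` are all assumptions made for illustration. D3 provides its own helpers for loading CSV files and scaling values, so treat this only as an outline of the idea.

```javascript
// Hypothetical sample data; the workshop's actual CSV file is not reproduced here.
const csvText = `author,books
Author A,12
Author B,5
Author C,8`;

// Parse simple comma-separated text (no quoted fields) into row objects.
function parseCsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",").map((h) => h.trim());
  return lines.map((line) => {
    const cells = line.split(",").map((c) => c.trim());
    const row = {};
    headers.forEach((h, i) => { row[h] = cells[i]; });
    return row;
  });
}

// Map a value in [0, domainMax] to a pixel length in [0, rangeMax].
function linearScale(domainMax, rangeMax) {
  return (value) => (value * rangeMax) / domainMax;
}

// Build an SVG string with one text label and one bar per author.
// Names are inserted as-is, so this assumes they contain no markup characters.
function barChartSvg(rows, width = 400, barHeight = 20, labelWidth = 100) {
  const counts = rows.map((r) => Number(r.books));
  const scale = linearScale(Math.max(...counts), width - labelWidth);
  const bars = rows.map((r, i) => {
    const y = i * (barHeight + 5);
    const w = scale(Number(r.books));
    // The <title> child gives a native tooltip when hovering over the bar.
    return `<text x="0" y="${y + 15}">${r.author}</text>` +
      `<rect x="${labelWidth}" y="${y}" width="${w}" height="${barHeight}">` +
      `<title>${r.books} books</title></rect>`;
  });
  const height = rows.length * (barHeight + 5);
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" ` +
    `height="${height}">${bars.join("")}</svg>`;
}

const rows = parseCsv(csvText);
const svg = barChartSvg(rows);
console.log(svg);
```

Saved to a file and opened in a browser, the resulting SVG shows three labelled bars. The colour change on hover could then be handled with a CSS rule such as `rect:hover { fill: orange; }`.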

The second visualisation we worked on combined geo-referenced data with the leaflet.js library. Here, we used the d3.js and leaflet.js libraries together to display geographic data from a CSV file. First we made sure the OpenStreetMap base map loaded, then pulled in the CSV data, and finally customised the map using a different set of map tiles. We also added data points and pop-up tags to the map.

Visualising CSV data using leaflet JavaScript library
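The step of turning geo-referenced CSV rows into map points can be sketched roughly as follows. Again this is not the workshop code: the sample rows, the column names `name`, `lat` and `lon`, and the function `csvToGeoJson` are assumptions for illustration. The sketch converts the rows into GeoJSON, a format that Leaflet can display (for example through its `L.geoJSON` layer); note that GeoJSON stores coordinates as longitude first, then latitude.

```javascript
// Hypothetical geo-referenced sample; not the workshop's data file.
const placesCsv = `name,lat,lon
Edinburgh,55.9533,-3.1883
Glasgow,55.8642,-4.2518
Broken row,not-a-number,0`;

// Convert simple CSV text (no quoted fields) into a GeoJSON FeatureCollection.
function csvToGeoJson(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",").map((h) => h.trim());
  const features = [];
  for (const line of lines) {
    const cells = line.split(",").map((c) => c.trim());
    const row = Object.fromEntries(headers.map((h, i) => [h, cells[i]]));
    const lat = Number(row.lat);
    const lon = Number(row.lon);
    // Skip rows whose coordinates are non-numeric or out of range.
    if (!Number.isFinite(lat) || !Number.isFinite(lon)) continue;
    if (Math.abs(lat) > 90 || Math.abs(lon) > 180) continue;
    features.push({
      type: "Feature",
      // GeoJSON coordinate order is [longitude, latitude].
      geometry: { type: "Point", coordinates: [lon, lat] },
      // Text a map layer could show in a pop-up for this point.
      properties: { popup: row.name },
    });
  }
  return { type: "FeatureCollection", features };
}

const geoJson = csvToGeoJson(placesCsv);
console.log(JSON.stringify(geoJson, null, 2));
```

Dropping rows with unusable coordinates before they reach the map is a design choice in this sketch; a real script might prefer to log them so the source data can be corrected.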

In this 2-hour workshop, Uta Hinrichs managed to give a flavour of the possibilities that JavaScript libraries offer and how ‘relatively easy’ it is to visualise data online.

Rocio von Jungenfeld
EDINA and Data Library

Dealing with Data 2015 – Call For Papers

Date: Monday 31 August 2015, 9:30 – 16:00 (lunch provided)

Location: Informatics Forum, University of Edinburgh

Themes:

Data creation, including non-traditional data types
Data analysis
Data visualisation
Data security
Working with sensitive data
Archiving and sharing data, including preservation, re-use, and licensing
Infrastructure and tools, for example Electronic Lab Notebooks
Research software development and preservation
Linked open data for research, working with government data
Big Data and data mining
Meeting funder requirements for research data management

Format:

Presentations will be 15 minutes long, with 5 minutes for questions. Depending on numbers, thematic parallel strands may be used.  Presentations will be aimed at an academic audience, but from a wide range of disciplines. Opening and closing keynote presentations will be given.

Call for proposals:

Leading-edge research is reliant upon data that are produced or collected during the research process, or on existing data that are being analysed and re-used to address new research questions. It is important to manage research data effectively throughout the lifecycle, from data management planning through to archiving and sharing. Requirements for managing data are increasingly being adopted by institutions and funders in order to foster good research data management practices.

Following on from the successful ‘Dealing with Data 2014’ half-day conference, Information Services are pleased to announce that we will be hosting a one-day conference covering a broad range of research matters from all disciplines on the subject of ‘Dealing with Data’.

The aim of the conference is for researchers of all levels at the University of Edinburgh to share good practice, emerging techniques and technologies, and practical examples in working with data across the research lifecycle.

We welcome proposals for presentations on any aspect of the challenges and advances in working with data, particularly research with novel methods of creating, using, storing, visualising or sharing data.  A list of themes is given above, although proposals that cover any aspect of working with research data will be considered.

Please send abstracts (maximum 500 words) to dealing-with-data-conference@mlist.is.ed.ac.uk before Monday 6th July 2015. Proposals will be reviewed and the programme compiled by Friday 31st July 2015.

Cuna Ekmekcioglu
Library and University Collections

Edinburgh DataShare – new features for users and depositors

I was asked recently on Twitter if our data library was still happily using DSpace for data, the topic of a 2009 presentation I gave at a DSpace User Group meeting. In responding (answer: yes!) I recalled that I’d intended to blog about some of the rich new features we’ve either adopted from the open source community or developed ourselves to give our data users and depositors a better service and fulfil deliverables in the University’s Research Data Management Roadmap.

Edinburgh DataShare was built as an output of the DISC-UK DataShare project, which explored pathways for academics to share their research data over the Internet at the Universities of Edinburgh, Oxford and Southampton (2007-2009). The repository is based on DSpace software, the most popular open source repository system in use globally. Managed by the Data Library team within Information Services, it is now a key component in the UoE’s Research Data Programme, endorsed by its academic-led steering group.

An open access, institutional data repository, Edinburgh DataShare currently holds 246 datasets across collections in 17 out of 22 communities (schools) of the University and is listed in the Re3data Registry of Research Data Repositories and indexed by Thomson-Reuters’ Data Citation Index.

Last autumn, the university joined DataCite, an international standards body that assigns persistent identifiers in the form of Digital Object Identifiers (DOIs) to datasets. DOIs are now assigned to every item in the repository, and are included in the citation that appears on each landing page. This helps to ensure that even after the DataShare system no longer exists, as long as the data have a home, the DOI will be able to direct the user to the new location. Just as importantly, it helps data creators gain credit for their published data through proper data citation in textual publications, including their own journal articles that explain the results of their data analyses.
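The indirection that makes a DOI persistent can be illustrated with a toy example. This is only a sketch of the idea, not how DataCite or the DOI system is implemented: the `registry` map, the functions `register`, `resolve` and `citationLink`, the DOI itself and both URLs are hypothetical. The one real detail is that a DOI is conventionally turned into a link by prefixing it with `https://doi.org/`.

```javascript
// Toy stand-in for a DOI resolver: maps a persistent identifier to its current URL.
const registry = new Map();

// Record (or update) where the dataset behind a DOI currently lives.
function register(doi, url) {
  registry.set(doi, url);
}

// Look up the current location for a DOI.
function resolve(doi) {
  if (!registry.has(doi)) throw new Error(`Unknown DOI: ${doi}`);
  return registry.get(doi);
}

// The link that goes into a citation is built from the DOI alone.
function citationLink(doi) {
  return `https://doi.org/${doi}`;
}

const doi = "10.1234/example-dataset"; // hypothetical DOI
register(doi, "https://old-repository.example/item/42");
const linkBefore = citationLink(doi);

// The data moves to a new home; only the registry entry is updated.
register(doi, "https://new-home.example/datasets/42");
const linkAfter = citationLink(doi);

console.log(linkBefore === linkAfter); // true: the published citation is unchanged
console.log(resolve(doi)); // https://new-home.example/datasets/42
```

Because the citation contains only the identifier, nothing already published needs to change when the data move.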

The autumn release also streamlined our batch ingest process to assist depositors with large data files by getting around the web upload front-end. Currently we are able to accept files up to 10 GB in size but we are being challenged to allow ever greater file sizes.

Making the most of metadata

Discover panel screenshot

Example from Geosciences community

Every landing page (home, community, collection) now has a ‘Discover’ panel giving top hits for each metadata field (such as subject classification, keyword, funder, data type, spatial coverage). The panel acts as a filter when drilling down to different levels, allowing the most common values to be ‘discovered’ within each section.

The usage statistics at each level are now publicly viewable as well, so depositors and others can see how often an item is viewed or downloaded. This is useful for many reasons. Users can see what is most useful in the repository; depositors can see if their datasets are being used; stakeholders can compare the success of different communities. By being completely open and transparent, this is a step towards ‘alt-metrics’, or alternative ways of measuring scholarly or scientific impact. The repository is now also part of IRUS-UK (Institutional Repository Usage Statistics UK), which uses the COUNTER standard to make repository usage statistics nationally comparable.

What’s coming?

Stay tuned for future improvements around a new look and feel, preview and display by data type, streaming support, BitTorrent downloading, and Linked Open Data.

Robin Rice
EDINA and Data Library

Highlights from the RDM Programme Progress Report: Jan – Feb 2015

The Library and University Collections (L&UC) in association with project partner Manchester University received funding from the Jisc “Research Data Spring” programme to define and develop an open source Data Vault application which will allow data creators to describe and store data safely in one of the growing number of archival storage options. Phase 1 of the project started in March 2015.

The University of Edinburgh (UoE) was invited to contribute to a series of EPSRC (Engineering and Physical Sciences Research Council) Compliance Case Studies. Stuart Macdonald, RDM Service Coordinator, was interviewed by Jisc and the DCC in relation to the RDM programme and institutional compliance with forthcoming EPSRC research data expectations. The case study will be published on the Jisc website in May 2015.

RDM Service Coordinator Stuart Macdonald co-presented with Rory Macneil (RSpace) their practice paper “Service Integration to Enhance RDM: RSpace electronic laboratory notebook (ELN) case study” at the International Digital Curation Conference (IDCC) in London (Feb 2015). The paper has been published open access in the International Journal of Digital Curation (http://www.ijdc.net/index.php/ijdc/article/view/10.1.163).

The RDM Service Coordinator also presented on ‘RDM Training Initiatives @ Edinburgh’ at the “Comparing Notes: Training Librarians for Research Data Management and Open Science Support” workshop at IDCC.

An EPSRC Expectations Awareness Survey was sent out to 98 EPSRC grant holders, of whom 38 responded. 9** grant holders agreed to participate in a follow-up interview. The findings of the interviews will follow shortly. Dr Evamaria Krause (Marburg University, Germany) completed a 6-week internship with L&UC, where she assisted with the EPSRC Expectations Awareness Survey and EPSRC grant holder interview exercises.

All Schools in the College of Humanities and Social Science (CHSS) have now added links to the RDM Programme website and other RDM pages via their intranets. RDM Project Plan deadlines and deliverables which underpin the RDM Roadmap have been updated.* For more details visit the RDM Programme wiki (some content only available to UoE staff).

Four tailored Data Management Plans sessions have been organised with research groups in the College of Medicine and Veterinary Medicine and CHSS, and two workshops for the European Association for Health Information and Libraries (EAHIL) conference in Edinburgh are scheduled to run in June 2015.

Edinburgh DataShare release 1.71 has been announced with new features including faceted browsing, Solr usage statistics, and an increase in the size limit on assisted deposit of items from 5 GB to 10 GB.

DataSync (a Dropbox-like service in development) was themed and made available for beta testing to Information Services colleagues.

* IT Infrastructure input pending
** 1 PhD student who was forwarded the survey agreed to be interviewed

Stuart Macdonald
RDM Service Coordinator