Training researchers for a software and data-intensive world with Edinburgh Carpentries

This is a guest post from Giacomo Peru and the EdCarp Committee (https://edcarp.github.io/committee/). Sections of this post were published previously on the EPCC blog.

EdCarpLogo

The Edinburgh Carpentries (EdCarp) is a training initiative that offers the Carpentries computing and data skills curriculum in Edinburgh. The workshops train researchers in the fundamental skills needed for conducting efficient, open, and reproducible research. The EdCarp team comprises staff and student volunteers from across disciplines, academic units, and career stages.

Since 2018, EdCarp has organised 25 workshops across the academic institution, training over 300 staff and students in data cleaning, manipulation, visualisation and version control methods using tools such as R, Python, the Unix shell, Git, SQL and OpenRefine. Courses are free to participants and are oversubscribed very quickly. We are now rolling out our 2020 schedule and announcing workshops.

EdCarp are working to establish collaborations with other organisations, external and internal to the university: the Scottish Funding Council, the Institute for Academic Development and the Data Driven Innovation programme.

EdCarp can work with your academic unit or doctoral training program to help promote the fundamental data skills that your colleagues need.

A crucial aspect of EdCarp and their training model is the participation and voluntary commitment of the community, where trainees go on to become helpers, helpers go on to become instructors, and so on. EdCarp are always looking for new people willing to help, in any capacity; please sign up here if you would like to be kept updated and/or get involved: https://eepurl.com/gl4MsX.

 

Updated MANTRA content: Research data in context

The Research Data Support team is pleased to announce the launch of the first in a series of updates to MANTRA, the free and open online research data management training course.

The first updated module ‘Research data in context’ (previously ‘Research data explained’) is now live on the MANTRA site and provides an introduction to research data, alongside detail on the contexts in which data are generated, and the challenges presented by big data in society.

MANTRA is designed to give post-graduate students, early career researchers, and information professionals the knowledge and skills needed to work effectively with research data.

Since launching in 2011, MANTRA has been through a number of significant rewrites to keep up with current trends, and over 10,000 different learners have visited MANTRA in the last academic year.

The ‘Research data in context’ module has been substantially revised in order to:

  • remove dated and obsolete content;
  • simplify and improve the readability of existing material;
  • add information on data literacy and data science.

The changes in this module include:

  • Revised pages: Introduction; Why is research data management important?; What are data?; What are research data?; Data as research output; Module Summary; Next & further reading.
  • New pages: Data in society; Data Science; Video: machine learning; Data literacy and skills.

A change log detailing all changes in this release is available on request from the Research Data Support team (data-support@ed.ac.uk).

We hope you find this update interesting and useful and welcome any feedback you may have.

Further MANTRA updates are forthcoming, focusing on FAIR data and newer data protection legislation, and we will announce these in future blog posts.

Bob Sanders
Research Data Support

Data Carpentry & Software Carpentry workshops

The Research Data Service hosted back-to-back 2-day workshops in the Main Library this week, run by the Software Sustainability Institute (SSI), to train University of Edinburgh researchers in basic data science and research computing skills.

Learners at Data Carpentry workshop

Software Carpentry (SC) is a popular global initiative originating in the US, aimed at training researchers in good practice in writing, storing and sharing code. Both SC and its newer offshoot, Data Carpentry, teach methods and tools that help researchers make their science reproducible. The SSI, based at Edinburgh Parallel Computing Centre (EPCC), organises workshops for both throughout the UK.

Martin Callaghan, University of Leeds, introduces goals of Data Carpentry workshop.

Each workshop is taught by instructors trained by the SC organisation, using proven methods of delivery, to learners using their own laptops, with plenty of support from knowledgeable helpers. Instructors at our workshops were from Leeds and EPCC. Comments from the learners (staff and postgraduate students from a range of schools) included, ‘Variety of needs and academic activities/disciplines catered for. Useful exercises and explanations,’ and ‘Very powerful tools.’

Lessons can vary between different workshops, depending on the level of the learners and their requirements, as determined by a pre-workshop survey. The Data Carpentry workshop on Monday and Tuesday included:

  • Using spreadsheets effectively
  • OpenRefine
  • Introduction to R
  • R and visualisation
  • Databases and SQL
  • Using R with SQLite
  • Managing Research & Data Management Plans

The Software Carpentry workshop was aimed at researchers who write their own code, and covered the following topics:

  • Introduction to the Shell
  • Version Control
  • Introduction to Python
  • Using the Shell (scripts)
  • Version Control (with GitHub)
  • Open Science and Open Research

Software Carpentry learners

Clearly the workshops were valued by learners and very worthwhile. The team will consider how it can offer similar workshops in the future at a similarly low cost; your ideas are welcome!

Robin Rice
EDINA and Data Library

Analytics platform trial

Information Services is evaluating a new collaborative platform for data science and analytics as part of its expanding portfolio of services for researchers. We are looking for researchers with suitable problems who expect to achieve results within the one-year trial. We will be able to work closely with a small number of projects to help them get the most out of the platform, and training will be available. In addition, we encourage further researchers to use the platform with less formal support.

The Aridhia AnalytiXagility Platform

AnalytiXagility is a purpose-built, user-friendly, collaborative platform for data science and analytics. It allows your team to easily create, discuss, modify and share analyses in a single, secure system accessed conveniently through a web browser.
The platform handles routine data management tasks such as confidentiality, availability, integrity and audit, reducing time to insight and discovery. In particular, it is ideally suited for:

  • Exploring, comparing and linking structured datasets including data quality profiling
  • Supporting data management, accountability and provenance
  • Processing large datasets that do not fit in memory

Bring your team

Project members collaborate through a private workspace configured with compute, storage and analytical tools. Embedded social media tools allow teams to post and share questions, updates, comments and insights, building an active record of the research undertaken.

Bring your data

Users import their datasets using the secure and reliable file transfer mechanism, SFTP. Working files (documents, images, analysis scripts) can be uploaded directly through the web interface, and tagged for easy management and retrieval by the team.

Bring your analysis

AnalytiXagility provides an analysis platform, based on R, which can be accessed through a web browser. Combining R with an SQL database and an associated access library allows researchers to analyse their data in a faster and more scalable way than with R alone.
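
The general idea of letting a database do the heavy lifting can be illustrated with a minimal sketch. This is not the platform's own R access library, which is not described here; it uses Python's standard sqlite3 module as a stand-in, and the table name, column names and values are hypothetical. The grouping and averaging run inside the database, so only the small summary result is returned to the analysis session rather than every row.

```python
import sqlite3

# Hypothetical in-memory database standing in for a project's SQL store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("A", 1.0), ("A", 3.0), ("B", 10.0), ("B", 20.0), ("B", 30.0)],
)


def site_means(connection):
    """Return {site: mean value}, with the grouping done by the database."""
    query = (
        "SELECT site, AVG(value) FROM measurements "
        "GROUP BY site ORDER BY site"
    )
    # Only one row per site crosses into Python, not the full table.
    return {site: mean for site, mean in connection.execute(query)}


print(site_means(conn))
```

With a large table the same query would still return one row per site, which is why this pattern tends to scale better than loading all the data into memory first.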

Generate your output

The platform supports generation of PDF reports for communication and publication using LaTeX templates, such as those provided by many leading journals, in which users can embed active analytical scripts to auto-generate images and tabular data within the report at runtime.

More information

If you are interested in participating in the trial, please email IS.Helpline@ed.ac.uk with the subject “XAP Trial”.

Further information can be found at:

Steve Thorn
Research Services
IT Infrastructure