Tag Archives: Digital

CSV Mind-Blow(ve)n!

May 9, 2016 | Uncategorized | Conferences, crowdsourcing, csv, Development, Digital, images, Metadata Games, open source, spreadsheets, technology | Scott Renton

PSV Eindhoven - Eredivisie champions — Never waste a football-related pun. It’s been a good week for both PSV and CSV.

I was lucky enough to have a paper accepted to Csv,conf,2* in Berlin on the 3rd-4th of May, which was great to do, but it also got me through the door to see loads of great things going on in data and its surrounding technology. Yes, there was heavy mention made of CSV and spreadsheets; in fact, at times it was akin to an AA meeting, with people guiltily admitting their love of Excel. This left me feeling- quite worryingly- vindicated in a lot of the things I do.

As is always the case with any conference review blogpost, it’s not viable to list every link or ruminate on the message of every talk, so I’ll just home in on a few highlights. The talks (available in slide or video form) are appearing over at Lanyrd.com, and they’ll give a lot more depth to what was spoken about.

My own reason for being there- as far as my talk was concerned- was to look at better ways of processing workflow, enriching data, and improving engagement with our collections. Afterwards, I had some interesting conversations: I was alerted to the tool NeuralTalk2 by Maciej Gryka of rainforestqa.com, a company that specialises in cleaning data using mechanical turks and crowdsourced test-cases. NeuralTalk2, though, is a captioning tool, which will attempt to visually recognise what your image is “of”. I’m sure it fails as much as it succeeds, but, as I pointed out, we’ve not really used this kind of tech to enhance our metadata, so there’d be no harm in running some of our images through and seeing what it comes up with. Another chat, with a lady from UC Santa Cruz, made it clear that we are quite liberal with our approach to crowdsourced data. Where we have generally decided it’s fine to surface such data as long as it is properly marked as crowdsourced, they are proceeding rather slowly, due to a particularly strict metadata librarian.

The keynotes were deliberately intended to cover a range of disciplines that might be new to most people at this highly eclectic conference. As a result, there were interesting talks on technology and activism (including visualisations of the Ebola crisis and police brutality); ethics in technology and workflows to give your consent without clicking on unreadable terms and conditions (do you know what SmartBins are taking from you as you pass?); dealing with messy spreadsheets (the Enron crisis showed that this company managed them terribly); and open data with neuroscience (a lot of mouse brains in action).

Some other tools that we could be looking at:

Zegami – great for exploring large banks of images and spotting patterns across them. Can it work with IIIF, I wonder?

OpenRefine – a tool that we perhaps should have been using for some time to rationalise spreadsheet data, which could certainly save lots of time. Our former colleague, Richard Jones of cottagelabs, is a great advocate of these kinds of tools, as his talk made clear.

DataBaker – created in collaboration between the Office for National Statistics and ScraperWiki. This Python application can convert any ‘pretty’ spreadsheet into usable source data in CSV.

CSV Rinse and Repeat – built in Paris by Mathieu Jacomy of the Paris MediaLab, this is a JavaScript tool which intends to cut down the distance between data, coding and visualisation. Basically you take your data, spot patterns, recode to surface interesting things, and generate a visualisation out the end, in an iterative process.

Wikipedia Googlesheets – I am not sure if we would have a use for this specifically, but it was fascinating to see a plugin which serves up spreadsheet formulae coded in Google Apps JavaScript, which can then be used to interrogate any Wikipedia pages. Particularly of note is the ability to combine category pages and pageviews, to see if real-time events are influenced by Wikipedia and vice-versa. Developed by Thomas Steiner at Google.

Finally, here are three interesting observations, which certainly struck a chord with me:

It is now deemed quite acceptable, as a symptom of rapid development, perhaps, that the CSV is used as the master dataset; perhaps the file-based database’s day is not over. I certainly found out about some interesting applications built in this way.
I heard no-one but myself talk about Excel macros- at a spreadsheet conference, no less! It is far more fashionable these days to read your data as CSV, and code against it using JavaScript, R, or Python. I clearly need to get out of the 1990s.
EVERYONE suffers from problems with diacritics, glyphs, badly formatted data and what happens when you import a CSV into a spreadsheet tool that tries to be too clever. It is not just me.
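As a rough illustration of the last two observations, here is a minimal Python sketch of reading a CSV in code rather than through a spreadsheet tool. It is an assumption-laden example, not anything shown at the conference: the function name `rows_from_csv_bytes` and the sample data are made up, and it assumes the file is UTF-8 (possibly with the byte-order mark that Excel adds). Every value is kept as a string, so nothing gets "helpfully" reinterpreted on the way in.

```python
import csv
import io


def rows_from_csv_bytes(raw, encoding="utf-8-sig"):
    """Decode raw CSV bytes and return each row as a dict of strings.

    "utf-8-sig" strips a leading byte-order mark if one is present and
    otherwise behaves like plain UTF-8 (the encoding is an assumption
    about the file, not something the csv module can detect).
    """
    text = raw.decode(encoding)
    # newline="" leaves line endings alone so the csv module can handle
    # quoted fields that contain line breaks.
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical sample: a BOM-prefixed file with diacritics and a leading zero.
sample = "id,title\r\n007,Café Müller\r\n".encode("utf-8-sig")
rows = rows_from_csv_bytes(sample)
print(rows[0]["id"], rows[0]["title"])
```

Because the csv module never guesses at types, the identifier stays as the text "007" and the accented characters survive, which is exactly where an over-clever spreadsheet import tends to go wrong.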

All in all, an excellent couple of days, which have filled me with ideas for improvements for existing workflows. Hopefully the likes of the DIU will reap some benefits!

Berlin Dom und Fernsehturm — Insert standard caption regarding “old content meets new technology to surface it”!

Scott Renton, Digital Developer

*The commas are intentional, by the way!

IIIF – International Image Interoperability Framework

December 18, 2015 | Uncategorized | Conferences, Development, Digital, IIIF, images, LUNA | cknowles

The next big thing — Inspirational quote on the side of a University of Ghent building, St. Pietersnieuwstraat 33.

The adoption of IIIF (International Image Interoperability Framework) has been gaining momentum over the past few years for digitised images. Adoption of IIIF for serving images allows users to rotate, zoom, crop, and compare images from different institutions side by side. Scott and I attended the IIIF conference in Ghent earlier this month to learn more about IIIF, so we can decide how we can move forward at the University of Edinburgh to adopt IIIF for our images.

On the Monday we attended a technical meeting at the University of Ghent Library. This session really helped us to understand the architecture of the two IIIF APIs (image and presentation) and to speak to others who have implemented IIIF at their institutions.
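To give a flavour of the Image API, the rotate, zoom and crop operations mentioned above are all expressed in the request URL, which follows the pattern `{identifier}/{region}/{size}/{rotation}/{quality}.{format}` after the server's base address. The Python sketch below simply assembles such a URL. The helper name `iiif_image_url`, the server `images.example.org` and the identifier `ms-0001` are hypothetical, and the defaults assume version 2 of the Image API (version 3 uses `max` rather than `full` for the size).

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Assemble an IIIF Image API request URL from its path segments.

    The defaults ask for the whole image, unscaled and unrotated,
    using Image API 2.x keywords (an assumption about the server).
    """
    # Identifiers are percent-encoded so that a "/" inside one is not
    # mistaken for a path separator.
    encoded_id = quote(identifier, safe="")
    return (f"{base.rstrip('/')}/{encoded_id}/"
            f"{region}/{size}/{rotation}/{quality}.{fmt}")


# Hypothetical request: crop the central half of the image, scale the
# result to 600 pixels wide, and rotate it by 90 degrees.
url = iiif_image_url(
    "https://images.example.org/iiif/",
    "ms-0001",
    region="pct:25,25,50,50",
    size="600,",
    rotation="90",
)
print(url)
```

Because the whole request lives in the URL, any IIIF-aware viewer can ask any compliant server for the same crop, which is what makes comparing images from different institutions side by side possible.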

The main event was on Tuesday at the beautiful Ghent Opera House, where there were lots of short presentations about different use-cases for IIIF adoption and the different applications that have been developed. If you are interested in adopting IIIF at your institution I recommend looking at Glen Robson’s slides on how the National Library of Wales has implemented IIIF. I can see myself coming back to these slides again and again, along with those on the two APIs.

Whilst we were in Ghent there was a timely update from LUNA Imaging, whose application we use as an imaging repository, on their plans to support IIIF.

Thanks to everyone we met in Ghent who was willing to share with us their experiences of implementing IIIF and to the organisers for a great event in a beautiful city (and our stickers).

If you want to keep up to date with IIIF development please join the Google Group iiif-discuss@googlegroups.com

Claire Knowles and Scott Renton

Library Digital Development Team

Bridging Gaps at the British Museum

October 27, 2015 | Uncategorized | Collections, Conferences, Development, Digital, library | Scott Renton

The overwhelming setting of the British Museum played host to this year’s Museums Computer Group “Museums and the Web” Conference, and as usual, there was a big turnout from museums all over the UK, bursting with ideas and enthusiasm. The theme (“Bridging Gaps and Making Connections”) was intended to encourage thought about identifying creative spaces between physical museums collections and digital developments, where such spaces are perhaps too big, and how they can be exploited. As usual, there was far too much interesting content to cover fully in a blogpost- everything was thought-provoking, but I’ve picked out a few highlights.

Two projects highlighted collaboration between museums, which can be creatively explosive, and immediately improve engagement. Russell Dornan at The Wellcome Institute showed us #MuseumInstaSwap, where museums paired off and filled their social media feeds with the other museum’s content. Raphael Chanay at MuseoMix, meanwhile, arguably took this a step further by getting multiple institutions to bring their objects to a neutral location (Iron Bridge in Shropshire, Derby Silk Mill), and forming teams to build creative prototypes out of them across the digital and physical spaces. Could our museums collections be exploited in similar ways? Who could we partner up with?

I like to think that our “digital and physical” teams in L&UC collaborate very effectively. Keynote speaker John Coburn from TWAM (Tyne and Wear Archives and Museums) spoke of the importance of this intra-institution collaboration. You will (almost) never find a project that is run entirely from within the digital or physical sphere (Fiona Talbott from the HLF confirmed this- 510 of 512 recent bids had digital outputs relating to physical content), and the ability of the digital area and the content providers to communicate and work together is key. One very good example of this was the Tributaries app, built with sound artists, the history team, archives and so on, to put together an immersive audio experience of lost Tyneside voices from World War I. He also spoke of their TNT (Try New Things) initiative (also creatively explosive!) where staff sign up to do innovation with the collections, effectively in their spare time. With the Innovation Fund encouraging creativity, how do we work this into our daily lives? Can we? If not, how do we incentivise people to do it outwith their spare time? One of the gloomier observations of the day was that, with austerity, there is less and less money in the sector, which is likely to get worse after next month’s spending review. This austerity can breed creativity, though, and it’s good for digital, because people need to ‘work smarter’.

Another really interesting project is going on at the Tate, where they are combining their content with the Khan Academy learning platform. Rebecca Sinker and colleagues showed us how content can be levered and resurrected through a series of video tutorials around the content (be they archival, technical, biographical etc). Pushing the collaborative textual content from the comments area on the tutorials through to social media allows further engagement and new perspectives on the museum objects. Speaking personally, I have had little exposure to our VLE, but I’m quite sure that developing an interface between it and our collections sites could be highly beneficial.

That’s all the tip of the iceberg, though, so take a look at the programme link at the top to find out about lots of other interesting projects.

Outside of the lecture theatre, I had some really interesting conversations with people who have exactly the same problems as ourselves: building image management workflows, incorporating technological enhancements to content-driven websites, and thinking about beacon technology (the sponsors, Beacontent, deserve top marks for the name at least). Additionally, a tour of The Samsung Digital Discovery Centre- where state of the art technology meets British Museum content to improve the experience for children, teenagers, and families- was highly informative.

Scott Renton, Digital Developer

Library Labs Blog

University of Edinburgh Library Labs Blog
