Mass digitisation at the Library Annexe

On Wednesday 22 June, Hannah Mateer and I welcomed the KEW attendees to the Library Annexe at South Gyle. Hannah gave a tour of the Annexe space and services while I discussed the different digitisation services offered by the Library, in particular the PhD thesis mass digitisation project, which aims to have the University’s entire thesis collection digitised and online within three years. Here is a summary of some of the project’s key points:

  • We will digitise around 15,000 PhD theses over the next two years; the library holds around 25,000 theses in total, of which 10,000 are already online.
  • The collection dates from the early 1600s and contains original Edinburgh research which is not available anywhere else in the world.
  • Statistics have shown that digital theses are accessed, on average, 30 times per month each. There is, therefore, considerable demand to put the collection online.
  • We chose to digitise in-house for several reasons. While it may have been slightly cheaper to outsource, we wanted closer oversight of fragile collections and workflows, and we wanted to develop expertise in mass digitisation.
  • This approach will provide us with scanning equipment and software for future digitisation projects.
  • The thesis collection is made up of unique items, of which only one copy exists, and duplicates, of which two or more copies exist.
  • Duplicate theses have their spines removed using a guillotine and are then fed through a Kodak i4250 document scanner; unique theses are scanned on a Copibook Cobalt scanner. All theses are then batch processed using LIMB processing software.
  • The digitised, OCR-ed theses are then made available through ERA, the library’s institutional repository.
  • There is also a conservation and cataloguing element to the project – a large proportion of the collection has no digital catalogue record and several thousand require conservation treatment.
  • By the end of the project we will have one physical copy of every thesis as well as a digital copy in the online repository. All physical and digital theses will have digital catalogue records, and conservation treatment will have been performed on those theses which require it.

Following my presentation, we had a very interesting discussion about how different institutions approach digitisation. I was particularly interested to learn that several of the attendees’ organisations had already undertaken similar projects, and I am very keen to learn more from their experiences as this project progresses. If you’d like to find out more about what we’re doing with mass digitisation at the University of Edinburgh, please see our blog at http://libraryblogs.is.ed.ac.uk/phddigitisation/ and feel free to get in touch if you have any questions.

Gavin Willshaw, Digital Curator

Gavin.Willshaw@ed.ac.uk
