At present I am working on a pilot project, digitising the Scottish Court of Session Papers. The collection is held across three institutions; The Advocate’s Library, The Signet Library and the University of Edinburgh’s Library and University Collections. The collection itself consists of circa 6500 volumes, comprising court cases which span the 18th and 19th century.
The aim of the pilot is to determine the most effect digitisation methods for these materials with a view to a potential mass digitisation project covering the entire collection. The digitisation tests and experiments I have been undertaking have raised the many challenges that such a large project would present, namely around the issue of recording metadata and which digitisation practices to employ in relation to the condition and size of any particular volume.
This variety in the size and condition means that, from a digitisation point of view, a one-size-fits-all approach will not be possible. The images below do well to explain this. Some volumes are over 18cm deep, some are loose papers, many volumes contain foldouts, many pages are stuck together and in a deteriorated condition, and volumes often contain text that sits very deep towards the spine of the book. I have been in regular contact with CRC conservator, Emily Hick, as various conservation issues seem to pop up as I move from volume to volume. With the more fragile volumes one must concede that further damage will likely be done during the digitisation process. Emily has been surveying the entire collection and evaluating which volumes may need conservation treatment prior to digitisation. In light of this, those volumes that are in a particularly bad condition have been excluded from the digitisation trials.
I have carried out tests on book scanners both here in the DIU and alongside the Thesis Digitisation team at the Library Annexe. Where spines have been more fragile I have been forced to use our Nikon D800 camera to capture page by page, recto/verso. I was also kindly given the opportunity to trial the high-speed Atiz BookDrive (pictured below – http://atiz.com/) at the Royal Botanic Gardens Edinburgh which, generally speaking, has been the most suitable piece of equipment for these materials so far. Futhermore, towards the end of last year Head Photographer, Susan Pettigrew, and I were lucky enough to get an insightful tour of the National Library of Scotland’s mass digitisation facilities. There we learned about their watertight digitisation workflow, many aspects of which we would seek to integrate into the Session Papers project.
Where there are foldouts, and similar, it appears that a more bespoke photography approach will be needed to digitise them. In order to accommodate some of the larger foldouts for example, resetting the studio lights and camera position is required. This obviously takes time and unfortunately goes against the grain of mass digitisation, which by its very nature seems to call out for speed and efficiency with maximum output. These items would need to be flagged up and effectively incorporated into the workflow.
The vast majority of the Session Papers sit uncatalogued at present. Where cataloguing has been done it is rather minimal, pertaining to the volumes themselves and not the individual cases. The result of this is that there is virtually no pre-existing metadata to link with the images created through digitisation. Each volume is made up of a number of individual cases, with pages numbered exclusively for each case. Creating new metadata to signify each case and page is therefore very time consuming – more cases means more metadata. In order to streamline the metadata workflow, we are currently investigating ways of auto-generating metadata through OCR techniques and we are also looking at potential metadata crowdsourcing avenues.
Recently, I have been gathering quotes for other equipment on the market that is designed specifically for the mass digitisation of books. Despite there being a large part of the collection that is in very poor condition, there is, however, a significant number of volumes that would work very well with these high-end bookscanning systems.
The next stage is to test how best to output the digitised images. We will shortly be uploading batches to via Luna Imaging and looking to make the most of its recent integration with International Image Interoperability Framework (IIIF) software. IIIF works as an online platform where institutions worldwide can share their collections and potentially bring disparate collections together in one accessible domain. As a project, the Session Papers could really feel the benefit of the IIIF functions by the fact that the collection exists over several institutions.