Mmm, let me tell you a bit more about what I’m getting up to. I’ve often waxed lyrical about The Old Bailey Proceedings Online (and see the OBP Blog Symposium). So it’s rather delightfully serendipitous that my new job is as project manager for two new, related London history projects, based in the Humanities Research Institute at the University of Sheffield.
The first and relatively simple task is to complete the OBP job by adding the final run of proceedings, from 1834 to 1913 (under the title of Central Criminal Court proceedings), and integrating them into the existing site. In total, this will create a fully searchable major digital primary source for London history, and particularly for the history of non-elite Londoners, running right through from the late 17th century into the early 20th century.
The second project, Plebeian Lives and the Making of Modern London 1690-1800, covers the 18th century and is much more difficult and complex. Like many other early modern and 19th-century digital primary sources, the OB/CCC proceedings are printed texts – relatively easy to read and transcribe, and to mark up for digitisation. But the majority of the Plebeian Lives sources will be archival manuscript materials. They will cover a wide range, including legal records such as coroners’ inquests; parish records (e.g. pauper letters, vestry minute books); the records of Bridewell and Bethlem hospital; and apprenticeship records. There’ll also be printed texts, such as Ordinary’s Accounts.
Like the Old Bailey/Central Criminal Court databases, they’ll all end up online: thousands of documents, full text, fully searchable, freely available to all internet users without any subscription barriers. What’s more, we hope to construct a search engine that will make it possible to simultaneously search a number of related online primary source resources alongside ours, including the OBP, and others at different sites such as British History Online.
This is the goal, at least. (I am terrified, whenever I stop being insanely excited.) Right now, all I have for this is a humungous (1 terabyte) hard drive filled with the first batch of scanned document images (very large, high quality .tif files, which is why they take up so much drive space).
The practical difficulties are not minor. Every phase of the process is lengthy and much of it (to be honest) fairly tedious, for both projects. All those documents and printed texts must first of all be microfilmed, scanned, and ‘rekeyed’ (transcribed): that part of it is outsourced, although we have to produce various documentation to guide the rekeyers (and generally nag and cajole the contractors to give us what we want when we want it). Some of the documents will be much better preserved and/or easier to decipher than others.
Then we have to mark up the transcripts in XML, another dull and painstaking task, which will be undertaken in two ways over the next 2 years or so. Right now and with my, um, ‘help’, the HRI programmers are writing fearfully complicated programs that will do substantial sections of the CCC transcripts automatically; the rest will be done manually by several part-time, home-based workers (some of them are postgrad students) who will start this autumn.
Once that markup is done, the CCC project will be quite straightforward to finish off, since it will be essentially a matter of adding it to the existing OBP database and giving it a few tweaks. But for our 18th-century plebeians, our job will barely have begun.
First, the HRI people have to create a powerful search engine that anyone can use fairly easily and, of course, we have to create a web site to present it. We hope that many people with 18th-century interests, from genealogists to academics, will find their own ways of using the resource. What we want to do with it is to analyse the data in order to “reconstruct how ‘ordinary’ Londoners interacted with various government and charitable institutions in the course of their daily lives”. On the one hand, we’ll be doing large scale quantitative analysis and record linkage (to find out, for example, patterns of relationships between claiming poor relief and ending up as a victim or perpetrator of crime). The technique of nominal record linkage has tended to be applied to small rural populations: the computer made record linkage practical in the first place, and now the internet is making it possible to extend its methods to the teeming metropolis. On the other hand, we want to do qualitative analysis: where we can find rich enough information about individuals, we’ll trace their individual experiences and uses of the institutions available to them.
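For readers who haven’t met nominal record linkage before, here is a very rough sketch of the basic idea in Python: pair up records from two different sources when the names are nearly the same and the dates are close together. Everything in it is an assumption made up for illustration (the `Record` fields, the sample names, the similarity threshold and the five-year window), and it is not the method the project will actually use.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Record:
    source: str  # hypothetical label for where the record came from
    name: str    # name as transcribed, spelling variations and all
    year: int    # year of the event recorded


def normalise(name: str) -> str:
    """Lower-case the name, treat full stops as spaces, collapse whitespace."""
    return " ".join(name.lower().replace(".", " ").split())


def name_similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def link_records(left, right, min_similarity=0.85, max_year_gap=5):
    """Pair records whose names are similar and whose years are close.

    Both thresholds are arbitrary illustrative values, not tuned ones.
    """
    links = []
    for l in left:
        for r in right:
            if abs(l.year - r.year) > max_year_gap:
                continue  # too far apart in time to be a plausible match
            score = name_similarity(l.name, r.name)
            if score >= min_similarity:
                links.append((l, r, round(score, 2)))
    return links


# Invented sample data: none of these are real people or real records.
poor_relief = [
    Record("poor relief", "Mary Smith", 1750),
    Record("poor relief", "John Brown", 1760),
]
trials = [
    Record("trial", "Mary Smyth", 1752),
    Record("trial", "Mary Smith", 1790),
]

links = link_records(poor_relief, trials)
for relief, trial, score in links:
    print(relief.name, relief.year, "<->", trial.name, trial.year, score)
```

Comparing every record with every other record like this would be far too slow for thousands of documents, and a similar name with a nearby date is only a candidate link: common names in a big city will throw up plenty of false matches that need other evidence (parish, age, occupation) to sort out.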
I (eventually) get the fun job of writing biographies to put on the website. My bosses have to sit down and write the serious monograph.
I think I have one of the coolest jobs in the universe right now.
. . .
[Parts of this post have been revised and x-posted at my other new bloghome, The Long Eighteenth Century.]