Defendants’ voices and silences in the Old Bailey courtroom, 1781-1880

This is a version of the paper I gave at the Digital Panopticon launch conference at Liverpool in September 2017.

In the interests of fostering reproducible research in the humanities, I’ve put all the data and R code underlying this paper online on Github – details of where to find them are at the end.

Defendant speech and verdicts in the Old Bailey

Defendants’ voices are at the heart of the Digital Panopticon Voices of Authority research theme I’ve been working on with Tim Hitchcock. We know that defendants were speaking less in Old Bailey Online trials as the 19th century went on; we’ve tended to put this in the context of growing bureaucratisation and the rise of plea bargaining.

I want to think about it slightly differently in this paper though. The graph above compares conviction/acquittal for defendants who spoke and those who remained silent, in trials containing direct speech between 1781 and 1880. It suggests that for defendants themselves, their voices were a liability. This won’t surprise those who’ve read historians’ depiction of the plight in which defendants found themselves in 18th-century courtrooms without defence lawyers, in the “Accused Speaks” model of the criminal trial (eg Langbein, Beattie).

But this isn’t a story of bureaucrats silencing defendants (or lawyers riding in to the rescue). I want to suggest that, once defendants had alternatives to speaking for themselves (ie, representation by lawyers and/or plea bargaining), they made the choice to fall silent because it was often in their best interests.

About the “Old Bailey Voices” Data

  • Brings together Old Bailey Online and Old Bailey Corpus (with some additional tagging, explained in more detail in the documentation on Github)
  • Combines linguistic tagging (direct speech, speaker roles) and structured trials tagging (including verdicts and sentences)
  • Single defendant trials only, 1781-1880
  • 20700 trials in 227 OBO sessions
  • 15850 of the trials contain first-person speech tagged by OBC

The Old Bailey Corpus, created by Magnus Huber, enhanced a large sample of the OBP 1720-1913 for linguistic analysis, including tagging of direct speech and tagging about speakers. [In total: 407 Proceedings, ca. 14 million spoken words, ca. 750,000 spoken words/decade.]

Trials with multiple defendants have been excluded from the dataset because of the added complexity of matching the right speaker to utterances (and they aren’t always individually named in any case). [But of course this begs the question of whether the dynamics and outcomes of multi-defendant trials might be different…]

Trial outcomes have also been simplified; if there are multiple verdicts or sentences only the most “serious” is retained. Also, for this paper I include only trials ending in guilty/not guilty verdicts, omitting a handful of ‘special verdicts’ etc.
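The simplification rule can be sketched in a few lines. This is only an illustration of the idea, not the code used for the paper (which is in R): the severity ranking and the category labels below are assumptions made for the example.

```python
# Hypothetical severity ranking (higher = more serious); the labels and
# their order are assumptions for illustration, not the dataset's coding.
SENTENCE_SEVERITY = {
    "none": 0,
    "fine": 1,
    "imprisonment": 2,
    "transportation": 3,
    "death": 4,
}


def most_serious(outcomes, severity=SENTENCE_SEVERITY):
    """Return the single most serious outcome in a list, or None if it is empty."""
    if not outcomes:
        return None
    return max(outcomes, key=lambda outcome: severity[outcome])


# A trial recorded with two sentences keeps only the more serious one.
trial_sentences = ["imprisonment", "transportation"]
print(most_serious(trial_sentences))
```

The same function would work for verdicts, given a second ranking dictionary.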


The working assumption is that nearly all silent defendants had a lawyer and that the majority of defendants who spoke did not.

Sometimes, especially in early decades, defendants had a lawyer and also spoke. Unfortunately, the OBC tagging doesn’t distinguish between prosecution and defence lawyers, and not all lawyer speech was actually reported.

But, more seriously, is it safe to assume that ‘silent’ defendants were really silent? Occasionally defendant speech was actually censored in the Proceedings (in trials where other speech was reported), eg a man on trial for seditious libel in 1822 whose defence “was of such a nature as to shock the ears of every person present, and is of course unfit for publication”. But that was a very unusual, political, case. (See t18220522-82 and Google Books, Trial of Humphrey Boyle)

[However, it was suggested in questions after the presentation that maybe the issue isn’t so much total censorship as in the case above, but that the words of convicted defendants might be more likely to be partially censored, which would problematise analyses that centre on extent and content of their words. This could be a particular problem in 1780s and 1790s; maybe less so later on.]

So work to be done here – eg, look at trials with alternative reports specifically to consider defendants’ words.

Distribution of trials by decade 1781-1880

Start with some broad context.

The number of cases peaked during the 1840s and dramatically fell in the 1850s. (Following the Criminal Justice Act 1855, many simple larceny cases were transferred to magistrates’ courts.)

Percentage of trials containing speech, annually

The percentage climbs from the 1780s (in 1778 the Proceedings became a near-official record of the court) and peaks in the early 19th century. Then, after the major criminal justice reforms of the late 1820s swept away most of the Bloody Code (shown by the red line), there is a substantial fall in the proportion of trials containing speech.

This was primarily due to increase in guilty pleas, which were previously rare. After the reforms, 2/3 of trials without speech are guilty pleas.

Conviction rates annually, including guilty pleas

(Ignore the spike around 1792, due to censorship of acquittals.) Gradual increase in conviction rates which declines again after mid 19th c.

But if we exclude guilty pleas and look only at jury trials, the pattern is rather different.

Conviction rates annually, excluding guilty pleas

Conviction rates in jury trials after the 1820s rapidly decrease – not much over 60% by end of 1870s. That’s much closer to 18th-century conviction rates (when nearly all defendants pleaded not guilty), in spite of all the transformations inside and outside the courtroom in between.

Percentage of trials in which the defendant speaks, annually

Here the green line is the Prisoners’ Counsel Act of 1836, which afforded all prisoners the right to full legal representation. But the smoothed trend line indicates that it had no significant impact on defendant speech. Defendants had, at the judge’s discretion, been permitted defence counsel to examine and cross-examine witnesses since the 1730s. Legal historians emphasise the transformative effect of the Act; but from defendants’ point of view it seems less important; for them it was already a done deal and the Bloody Code reforms were much more significant.

Defendant speech/silence and verdicts, by decade

This breaks down the first graph by decade – shows that the general pattern is consistent throughout period, though exact % and proportions do vary.
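For readers who want to reproduce this kind of breakdown, here is a minimal sketch of the tabulation, assuming one record per trial. The field names (`year`, `defendant_words`, `verdict`) and the four toy rows are hypothetical, invented for the example; they are not the column names or contents of the published dataset.

```python
from collections import defaultdict


def decade_label(year):
    """Map a year to a decade bin that starts on a '1' year, e.g. 1790 -> '1781-1790'."""
    start = ((year - 1) // 10) * 10 + 1
    return f"{start}-{start + 9}"


def conviction_rates(trials):
    """Percentage convicted for each (decade, speaks/silent) group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [convicted, total]
    for trial in trials:
        spoke = "speaks" if trial["defendant_words"] > 0 else "silent"
        group = (decade_label(trial["year"]), spoke)
        counts[group][1] += 1
        if trial["verdict"] == "guilty":
            counts[group][0] += 1
    return {group: round(100 * c / n, 1) for group, (c, n) in counts.items()}


# Toy rows, invented purely to show the shape of the calculation.
trials = [
    {"year": 1785, "defendant_words": 12, "verdict": "guilty"},
    {"year": 1785, "defendant_words": 0, "verdict": "notguilty"},
    {"year": 1790, "defendant_words": 40, "verdict": "notguilty"},
    {"year": 1791, "defendant_words": 0, "verdict": "guilty"},
]

for group, rate in sorted(conviction_rates(trials).items()):
    print(group, rate)
```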

Defendant speech/silence/guilty pleas and sentences

Moreover, harsher outcomes for defendants who speak continue into sentencing. Defendants who pleaded guilty (though bear in mind this only really applies to c.1830-1880, whereas the silent/speaks bars are for the whole period) were most likely to be imprisoned and much less likely to receive a transportation sentence (and hardly ever death). Defendants who speak are the most likely to face tougher sentences – death or transportation, more so than the silent.

(Don’t yet have actual punishments – the next big job is getting the linked Digital Panopticon life archives…)

Defendant word counts (all words spoken in a trial)

How much did defendants say? Not a lot. The largest single group of defendants is the silent (ie, 0 words). But even those who spoke usually didn’t say very much. [average overall was 55 words] Eloquent, articulate defendants few and far between!

Defendant word counts and verdicts

So if you did speak, it was better to say plenty!? Or in other words, more articulate defendants had a better chance of acquittal (though they were still slightly worse off than the silent).

Defences: average word counts and verdicts

Finish with focus on defendants’ defence statements – made by nearly all defendants who spoke and for the majority the only thing they did say (a minority questioned witnesses or made statements at other points in the trial).

Overall word counts of defence statements:

  • guilty (n=7696): average wc 44.97
  • notguilty (n=1414): average wc 65.15

On average, defence statements by the acquitted were longer. Again highlights that more articulate defendants do better.

Also, there is more variety (less repetition) in the statements of acquitted defendants. 97.17% (1374) of their 1414 defence statements are unique (crudely measured, as text strings). Whereas 93.17% (7170) of statements by convicted defendants are unique.
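A crude string-based uniqueness measure of this kind might look like the sketch below. Two things are assumptions on my part: the light normalisation (lower-casing and collapsing whitespace) and the decision to report both possible readings of "unique" (distinct strings, and strings that occur exactly once), since the text does not say which was used. The example statements are invented.

```python
from collections import Counter


def uniqueness(statements):
    """Return (distinct strings, strings occurring exactly once, total statements)."""
    # Assumed normalisation: lower-case and collapse runs of whitespace.
    normalised = [" ".join(s.lower().split()) for s in statements]
    counts = Counter(normalised)
    distinct = len(counts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return distinct, singletons, len(normalised)


# Invented examples: the first two differ only in case and spacing.
statements = [
    "I beg for mercy",
    "i beg for  mercy",
    "I found it",
    "I was at home all night",
]
distinct, singletons, total = uniqueness(statements)
print(f"{100 * distinct / total:.2f}% distinct, {100 * singletons / total:.2f}% said only once")
```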

Start to look more closely at what they say? Not possible yet to investigate in depth, but use some simple linguistic measures.

Defences: Words least associated with acquittal


In linguistics, keywords are “items of unusual frequency in comparison with a reference corpus”. I compared the larger set of defence statements by defendants who were convicted with defence statements by defendants who were acquitted.

Table above is the words least likely to be associated with acquittal – ie, the least successful defence statements…
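To make the keyword idea concrete, here is a small sketch of one common keyness statistic, log-likelihood (G2), applied to two sets of statements. Which statistic was actually used for the table is not stated here, so treat log-likelihood as an assumption; the tokeniser, the `min_count` threshold and the toy statements are also invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Very simple tokeniser: lower-case alphabetic words (an assumption)."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, c, d):
    """G2 for a word seen a times in corpus 1 (size c) and b times in corpus 2 (size d)."""
    e1 = c * (a + b) / (c + d)  # expected count in corpus 1
    e2 = d * (a + b) / (c + d)  # expected count in corpus 2
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e1)
    if b > 0:
        g2 += b * math.log(b / e2)
    return 2 * g2


def keywords(target_texts, reference_texts, min_count=2):
    """Words relatively more frequent in the target texts, ranked by G2."""
    target = Counter(w for t in target_texts for w in tokenize(t))
    reference = Counter(w for t in reference_texts for w in tokenize(t))
    c, d = sum(target.values()), sum(reference.values())
    rows = []
    for word, a in target.items():
        b = reference.get(word, 0)
        if a >= min_count and a / c > b / d:
            rows.append((word, round(log_likelihood(a, b, c, d), 2)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Invented toy statements, only to show the mechanics.
convicted = ["I beg for mercy", "I picked it up", "I beg for mercy of the court"]
acquitted = [
    "The prosecutor gave me the goods to carry",
    "I was at home when the goods were taken",
]
for word, score in keywords(convicted, acquitted):
    print(word, score)
```

With real data the target would be the convicted defendants' statements and the reference the acquitted defendants' statements, and a higher `min_count` would filter out rare words.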

I want to highlight:

  • mercy + beg
  • picked (+ carry might be related)
  • i
  • distress

Remember that many defence statements were not really ‘defences’; they were more of an appeal to the judges’ clemency after sentencing (‘I beg for mercy’) or claiming extenuating circumstances (‘I was in distress’) in particular. Others played down the offence – ‘I picked up the things’.

And in general many short bare statements beginning with “I” rather than more complex narratives.

Four hopeless short defences

So I picked four of the most frequent short (non-)defences that are heavily associated with convictions, to explore a bit further. (excludes use of any of these within longer defences)

defence            frequency   % convicted
nothing to say     109         98.17
mercy              125         98.40
picked up/found    223         93.72
distress           82          97.56

Main variants:

  • I have nothing to say
  • I beg for mercy/leave it to the mercy of the court/throw myself on the mercy of the court
  • I picked it (them) up/found it
  • I was in (great) distress/I was distressed/I did it through distress
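One way to pick out these short (non-)defences is simple pattern matching on the main variants, restricted to short statements so that longer defences using the same words are excluded. The sketch below is an assumption-laden illustration, not the method behind the table: the regular expressions, the 12-word cut-off for "short" and the toy statements are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical patterns covering the main variants listed above.
PATTERNS = {
    "nothing to say": re.compile(r"\bnothing to say\b"),
    "mercy": re.compile(r"\bmercy\b"),
    "picked up/found": re.compile(r"\b(picked (it|them) up|found (it|them))\b"),
    "distress": re.compile(r"\bdistress(ed)?\b"),
}
MAX_WORDS = 12  # assumed cut-off for a 'short' defence


def classify(statement):
    """Return the label of the first matching pattern, or None."""
    text = statement.lower()
    if len(text.split()) > MAX_WORDS:
        return None  # longer defences are excluded
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            return label
    return None


def percent_convicted(defences):
    """Percentage of guilty verdicts for each label, from (statement, verdict) pairs."""
    totals, guilty = Counter(), Counter()
    for statement, verdict in defences:
        label = classify(statement)
        if label is None:
            continue
        totals[label] += 1
        if verdict == "guilty":
            guilty[label] += 1
    return {label: round(100 * guilty[label] / totals[label], 2) for label in totals}


# Invented examples.
defences = [
    ("I have nothing to say", "guilty"),
    ("I throw myself on the mercy of the court", "guilty"),
    ("I beg for mercy", "notguilty"),
    ("I picked them up", "guilty"),
]
print(percent_convicted(defences))
```

Grouping the same labels by decade, with the number of defendant speakers in each decade as the denominator, would give the percentages plotted in the graphs.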

The next four graphs show the percentage of defendant speakers who use each phrase in short defence statements in each decade.

I have nothing to say

This was very popular before 1810s – peaks at use by 4% of defendants who speak in decade 1801-10 and then rapidly disappears.

I beg for mercy/leave it to the mercy of the court

Slightly later popularity – slower decline after 1810s

I picked it up/found it

Less dramatic decline after 1820s.

I was in distress/did it through distress

Curious that this doesn’t appear at all in 1780s; peaks 1810s.


So there are variations in timing/speed of decline, but broadly, these hopeless ’non-’defence statements, which are almost certain to be followed by conviction, are all declining in use and rarely heard in the courtroom after the 1820s. That fits, it seems to me, with both the gradual decline in defendant speech and the more rapid rise from the late 1820s of plea bargaining.

First, the defence lawyer option meant that defendants were better off finding the money for a lawyer who could try to undermine the prosecution case through aggressively examining witnesses. This was happening from the 1780s onwards.

And second, the plea bargaining option from the late 1820s meant that if defendants really had no viable defence, had been caught red-handed, they were better off pleading guilty in return for a less harsh punishment.

And so: for defendants who wanted to walk free or at least lessen their punishment, if not for later historians trying to hear their voices and understand what made them tick, silence was golden.

More stuff

Settlement and Removal: Poor Relief and Exclusion in 18th-century London

From the Act for the Relief of the Poor of 1662, or so-called “Settlement Act” onwards, various pieces of 17th- and 18th-century legislation formally codified entitlement to parochial poor relief by “settlement”. The main ways of gaining a settlement of your own were: completing a formally contracted apprenticeship; at least one year in continuous service; renting a house worth at least £10 a year; paying parish taxes or serving as a parish officer. Many people’s settlements, however, were ‘derived’: a married woman from her husband; children born in wedlock from their parents. But illegitimate children got their settlement from their place of birth. And a new settlement erased previous ones.

In theory, everyone in England and Wales in the 18th century ‘belonged’ to a parish, somewhere. Which was fine… as long as you had a settlement in a place where you actually wanted to be. But the flip side of settlement was removal: exclusion was key to the workings of a locally-based poor relief policy.

The case study: St Clement Danes

This paper is early work in progress based on sources digitised by the London Lives project, exploring the narratives of the poor in examinations and petitions and linking together records to trace larger patterns. The focus here is on one London Lives parish in the second half of the 18th century, St Clement Danes, a large urban parish to the west of the City of London, with a population of around 13,000 in 1801, of whom about 600 were receiving poor relief (costing the parish about £7000 p.a.). It was fairly well off, on average, with varied local trades and industries. But, as is often the case, averages hide considerable variation, with poor and rich living close together.

Overview of the data

I’m focusing on three sources from London Lives:

1) a dataset of settlement, bastardy and vagrancy examinations for St Clement Danes and another London Lives parish, St Botolph Aldgate, covering 1739-1800 (containing about 11000 exams in total). There were three main possible outcomes of a settlement examination: the examined was shown to have a settlement in the parish; their settlement was somewhere else but they could produce a settlement certificate guaranteeing that their own parish would relieve them, so they were allowed to stay; removal from the parish.

2) the second dataset is a Clement Danes register of removal orders (covering late 1752 to mid 1793; I chopped off the part-years at beginning and end for a convenient 40 year period).

Many archives have large numbers of surviving 18th-century pauper examinations, but records of removals are much less common. Linking the two means I can begin to examine more systematically the outcomes of examinations. For the period 1753-92, then, there are 5046 examinations and 2479 orders, of which 2357 could be linked to at least one exam. (Conversely, 2365 exams could be linked to at least one removal order.)
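The two linkage counts differ because links are many-to-many: several examinations can point to one order, and one examination to several orders. The sketch below illustrates that with a deliberately naive linkage rule (same normalised name, order dated within a fixed window after the exam). The rule, the 30-day window, the field names and the placeholder records are all assumptions for the example; the linkage actually used is in the shared removal orders dataset.

```python
from datetime import date

MAX_GAP_DAYS = 30  # hypothetical window between examination and order


def normalise_name(name):
    """Lower-case and collapse whitespace so trivial variants match."""
    return " ".join(name.lower().split())


def link(exams, orders, max_gap_days=MAX_GAP_DAYS):
    """Return (exam id, order id) pairs for plausible matches."""
    links = []
    for exam in exams:
        for order in orders:
            same_name = normalise_name(exam["name"]) == normalise_name(order["name"])
            gap = (order["date"] - exam["date"]).days
            if same_name and 0 <= gap <= max_gap_days:
                links.append((exam["id"], order["id"]))
    return links


# Placeholder records, not real people or dates from the sources.
exams = [
    {"id": "e1", "name": "Jane Doe", "date": date(1760, 3, 1)},
    {"id": "e2", "name": "jane  doe", "date": date(1760, 3, 10)},
    {"id": "e3", "name": "John Roe", "date": date(1761, 5, 5)},
]
orders = [
    {"id": "o1", "name": "Jane Doe", "date": date(1760, 3, 12)},
    {"id": "o2", "name": "Richard Miles", "date": date(1762, 1, 1)},
]

links = link(exams, orders)
linked_exams = {exam_id for exam_id, _ in links}
linked_orders = {order_id for _, order_id in links}
print(len(linked_exams), "exams linked;", len(linked_orders), "orders linked")
```

In practice name spellings vary far more than this, so real linkage needs fuzzier matching and manual checking.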

3) And finally, I’ve linked these to petitions in the Sessions papers from parishes appealing removals from Clement Danes.

1: annual counts of examinations 1753-92, broken down by type of exam

Figure 1 shows the three types of examinations: settlement, bastardy and vagrancy. The vast majority of exams in this series were settlement exams (green); bastardy exams (red) account for about 10% of the total. The vagrancy category is a very thin blue line at the bottom of just a few years; the numbers were tiny. There will have been more vagrancy exams than this, but they were usually recorded separately, often on pre-printed forms (which makes me a bit curious about the few that do turn up in this series – why are they here at all?).

2: annual counts of removal orders linked to exams, 1753-92, by type of order

In Figure 2 we can see there are two types of removal order in the register: non-specific orders I’ll simply call ‘pauper removals’ and vagrant removals (sometimes called passes, as the removed were “passed” to their destinations). Most of the orders that couldn’t be linked to examinations were vagrant removals, again indicating that vagrant examinations were recorded separately. But this graph shows that a striking proportion of settlement exams ultimately resulted in vagrant removal orders, highlighting the fuzzy boundaries between the poor laws and vagrancy laws.

Generally, parishes had an incentive to do this because they had to foot the bill for pauper removals, while the county paid for vagrants to be removed. It looks suspiciously as though this was getting rather out of hand in the mid 1750s. In the spring of 1757 the bench at Middlesex Sessions was very concerned about the numbers of vagrants and costs of removals. In July they appointed a new contractor to handle the removal of vagrants and (in what looks to me at least very much like a slap on the wrist to negligent JPs) ordered that JPs were “not to sign any Vagrant Pass” without proof “that an Act of Vagrancy hath been committed”. There was a dramatic and immediate impact in St Clement Danes: of 75 vagrant removals in 1757, only 7 were dated after July.

Even so, vagrant removals continue to be quite conspicuous in comparison to examinations; so I want to look more closely at the settlement exams that wound up in vagrant removals to see if there’s any real justification beyond financial expediency. (The smaller increases in vagrant removals after 1757 do generally match years which have vagrant examinations, but they only partially correlate to the years with the largest numbers of exams.)

Overall, about 50% of settlement examinations led to removal orders. The main concern of bastardy exams was establishing paternity rather than settlement (though occasionally a single exam covers both topics), and so there are much lower linkage rates between these and the orders (about 20 of 500+ could be linked to orders). However, the women examined in bastardy exams often have later settlement exams as well, and so I have some more linkage work to do to establish whether more than the 20 were actually removed at some point.

3: gender in settlement/vagrancy exams, 1752-93

As in many other studies (and even after ignoring bastardy exams), by far the majority of examinants and the removed were women. They averaged 75% of the examined over the period, reflecting women’s vulnerability to poverty. However, women were slightly less likely to be removed than the male examinants, and it’s possible that there’s some correlation between the peaks in exams/removals and higher rates of male removal. The differences are not large; but I have some more number-crunching to do here.

Contesting exclusion

I want to look now at cases in which examinants returned to the parish after being removed by an order. Between 1753 and 1792, at least 122 examinants were removed more than once. This happened in two ways: first, examinants were returned after the receiving parish disputed the case; second, the examinant themselves might reject the magistrates’ authority and return of their own volition. [Links to documents for each of the cases mentioned are listed at the end of the post.]

The returned

Parishes to which paupers were removed had a right of appeal to Quarter Sessions (or to the Court of Aldermen in the City of London). Between 1750 and 1800, there are about 2300 petitions of this kind in the Middlesex, Westminster and City of London Sessions papers in London Lives. Individual parishes did not appeal many removals: the process was very expensive; and it’s argued that in London many parishes had informal agreements to accept paupers from each other (‘friendly passes’).

I’ve found 44 appeals against removals from Clement Danes between 1753 and 1792, all of which can be linked to examinations and/or removal orders. The petitions themselves are usually uninformative about the reasons for appeal (unlike many other petitions in the Sessions Papers). But the linked examinations can be more revealing.

In 1760, the parish of St Brides appealed against Clement Danes sending them an 8 year old girl, Mary Ives. Mary’s mother was dead and she’d been abandoned by her father James, whose settlement was unknown. Mary had been born in St Brides; but she was legitimate so that was irrelevant to her settlement. So it’s not surprising that their appeal was upheld and Clement Danes had to take Mary back.[1]

Or, in 1762, St Sepulchre’s appealed CD’s decision to send them Susanna Flood, the widow of Noah Flood, and their three children. According to the settlement examinations, Noah had only served 5 years of his Apprenticeship in St. Sepulchre and the final two years with a different master in Hornsey. Again, the appeal was successful. Within three months Susanna and her children had been dispatched to Hornsey instead.[2]

On the facts of the exam, sending Mary Ives to St Brides seems simply opportunistic; the JPs must have known perfectly well that there was nothing in the examination to support this course of action. The most charitable interpretation is that they had some reason to believe her father might turn out to have a settlement in St Brides, and so the recipients would not bother to appeal. Equally, I’m sceptical that the right course of action in a case like Noah Flood’s (though clearly not entirely straightforward) wasn’t well established and known to JPs by the 1760s. Both cases seem to suggest that getting rid of unwanted paupers as quickly as possible could take priority over establishing the facts of uncertain cases. And yet, if that were really the case, we might expect appeals to be rather more frequent than they actually were.

[Oops: on reading the Flood examinations again, I looked more carefully at the dates, and realised I castigated the JPs unfairly: Susanna’s first examination only mentioned the St Sepulchre apprenticeship and it wasn’t until she was examined again after the appeal that she completed the narrative.]

But one more caveat. Of the 44 linked petitions, 15 (34%) were in just one year, 1785. The mid- to late-1780s were busy years for examinations and removals in Clement Danes. The 1785 Sessions Papers are unusually full of parish petitions – but so are those for 1784, and that year’s files contain no appeals against Clement Danes at all. What is going on?! Survival rates of documents, including petitions, in the Sessions Papers are variable and uncertain, but this is a very curious anomaly.

The returners

In all this, the interests and desires of the paupers themselves are clearly the lowest priority of all. (As evidenced by the way in which parish officials were apparently quite happy to label significant numbers of them vagrants – a criminal offence, remember – in the 1750s, simply to save some money on removal costs.) But we can, sometimes, begin to trace something of what the poor wanted for themselves.

Some examinants gave accounts that investigation rapidly proved to be false. Thomas White’s claim in 1769 to have a settlement in CD based on 2 1/2 years service was “On Enquiry found… to be false [, the master] never having kept house a Twelvmonth in the Parish & the Examinant only an Earrand Boy for a little time”. If Thomas lied because he didn’t want to leave the parish, the tactic may have worked: there’s no sign of a removal order.[3]

Challenging the authority of the magistrates and law by returning after a removal order was a risky business; returners could be labelled as vagrants and subject to the harsher penalties of the vagrancy laws. Nonetheless, some returned several times over years or even decades.

Ann Brown, a single woman aged around 40 in 1755, had been a servant to a Mr Champ in Oxford for about 18 months during the mid 1740s. There was no doubt about her settlement: she gave almost exactly the same account to the CD magistrates four times between 1751 and 1757. The first occasion pre-dates the removals register but on each of the subsequent times they sent her back to Oxford as a vagrant. An order in 1755 describes her as “an incorrigible rogue”, which had a specific meaning in the vagrancy laws: it referred to repeat offenders who could be more harshly punished, from imprisonment with hard labour potentially up to transportation to the colonies. In practice this was rare, but Ann would surely have been warned it could happen. And yet she came back again two years later. And while most repeat returners came quite short distances from other London parishes, each time she had to cover a 50 mile journey from Oxford.[4]

On her first examination in April 1758, Mary Jenkins appears to be just one of the many women who were examined about their settlement because their husbands had recently died, gone away to military service, been imprisoned, or simply deserted them. Her husband Henry was at sea and they had 3 young sons. After their marriage, Henry had rented a house in St Olave Southwark at an annual rent of 11 guineas, so the CD JPs had Mary and her young sons removed there: a straightforward case. But Mary returned to CD four times, only to be removed again. Again her motives are unknown.[5]

In these cases it seems to me there must be some connection to the parish that would not be documented in settlement examinations, but whether I can trace records that might shed light on them, I don’t know. Tantalisingly, an Ann Brown was baptised in Clement Danes in 1713; unfortunately, it’s a common sort of name and there were quite a few Ann Browns born in Middlesex (let alone anywhere else) in a reasonable date range. Conversely, in what I’m fairly sure is Henry Jenkins’ and Mary’s marriage record, her maiden name is transcribed as “Rouffinee”, an apparently unique surname (this could be either a transcription error or an unusual spelling of, perhaps, an Irish name like Roughneen?).

Irregular unions and family breakup

And I want to close with a case highlighting the themes running through many examinations of marriage breakdown, ‘irregular’ unions and their implications, the potential for paupers to be excluded not only from parishes but from their own families.

Ann Threader was examined in February 1785. She had married John Threader ‘about 30 years ago’ at the Fleet (I think actually in 1750), and he deserted her just two months afterwards. She had never seen him again but had heard that he re-married, and that he had later died. A few years after he left her, she moved in with Jacob Wesley, a shoemaker, with whom she had three illegitimate children, aged between 9 and 14 at the time of the exam. Because Ann and Jacob had moved house during their relationship, their children had been born in two different parishes in Southwark. CD attempted to remove the children to those parishes, but both removals were successfully appealed at the next Middlesex Sessions.

This time the examination itself sheds no light on the grounds for appeal, but it has a marginal note that CD were ‘obliged’ to take the children because they had already ‘been passed to us some time back’. Whatever the reason, CD subsequently relieved the children, although they quickly had the two older children bound out as apprentices, and they also gave Ann occasional out-relief for some years.[6]

Following the failure of a marriage or long-term absence of a husband, cohabitation and (less often) bigamous re-marriage were both options to be found in settlement exams, and I want to explore this in more depth in the future. Just as with ‘regular’ marriages, the break-up of an ‘irregular’ union due to a partner’s death or departure could make the remaining family members vulnerable to exclusion. But with these unions, the settlement laws could in theory result in the break up of an entire family: the illegitimate children to the parishes of their birth, the mother and father to separate parishes altogether.

Future directions

Because this research is in early stages I don’t have substantial conclusions yet, so instead a few thoughts on future directions.

The first strand relates to the experiences of the poor themselves, and how settlement strategies could go awry. People – perhaps especially poor people! – didn’t always live the well-ordered lives imagined by settlement law and there were many potential sources of dispute. Young people might not complete apprenticeships or service, for a range of reasons. (Apprenticeships were long and might well start in one parish and finish in another because of a master’s house move, death, bankruptcy or abuse of apprentices.) In any case, young adults didn’t always stay put after gaining a settlement of their own; they might move to find work, or return to their childhood homes, but never manage to gain another settlement. Elderly widows or young orphans could end up being sent to parishes they had never even visited because their husbands or parents had worked or lived there many decades earlier. Young people could be separated from the rest of their family because they had been born before their parents’ marriage. I want to explore these experiences in more depth, and those of the poor who resisted exclusion.

Second, there’s the larger context of poor law and settlement practice. Anne Winter and Thijs Lambrecht have recently argued for the importance of investigating local experiences and variations in settlement practice, and I think Jeremy Boulton has brilliantly shown the value of detailed record linkage in a local case study, for St Martin in the Fields. In the late 18th century, Clement Danes had a reputation as a parish where migrants could go to claim poor relief without too much scrutiny by parish officials – a “casualty parish” (indeed, the best casualty parish!). I’m curious, among other things, how accurate that image was. (One thing I do know already is that Clement Danes removal rates were considerably higher than those in St Martins a few decades earlier in the century. The numbers of examinations in Clement Danes are also much higher than those in the St Botolph Aldgate records, though they had roughly similar size populations.) How consistent was practice in Clement Danes, and how did it match up to settlement law? In reality, how likely were widows or abandoned wives or illegitimate children to be despatched to far-off parishes? And how does it compare to other London parishes?

This is a slightly revised version of a paper delivered at Cultures of Exclusion in the Early Modern World, University of Warwick, May 2017.

London Lives documents

[1] Mary Ives

[2] Susanna Flood and family

[3] Thomas White

[4] Ann Brown

[5] Mary Jenkins and family

[6] Ann Threader and family

  • Examination, 1785
  • Removal orders
  • Petitions: St Georges Southwark and St Mary Overys
  • Apprentice register for Thomas and Hannah Threader:
  • Example of relief to family members (in the form of clothing)
  • (FindMyPast/FamilySearch) marriage of John Thredder to Ann Clark, 28 Feb 1750, London; FamilySearch also has a Fleet marriage record for John Thredder on the same date. Despite the date discrepancy, the match seems likely (35 years is a long time…)
  • (FMP/FamilySearch) marriage of John Thredder to Mary Poore, St Martin in the Fields, April 1763; burial of John Threader, St Martin in the Fields, 20 Jan 1772. But there is also a burial record for a John Threader in 1764 at St Ann Soho, so can’t be certain that the St Martins records are the right man.

Additional reading

London Lives: Poor Law

Zotero bibliography (work in progress)

Noted in particular:

  • Jeremy Boulton, “Double Deterrence: Settlement and Practice in London’s West End, 1725-1824”, in Migration, Settlement and Belonging in Europe, 1500–1930s: Comparative Perspectives, edited by Anne Winter and Steve King, 54–80. New York: Berghahn, 2013.
  • Norma Landau, “The Laws of Settlement and the Surveillance of Immigration in Eighteenth-Century Kent.” Continuity and Change 3, no. 03 (December 1988): 391. doi:10.1017/S026841600000429X.
  • A. Winter and T. Lambrecht, “Migration, Poor Relief and Local Autonomy: Settlement Policies in England and the Southern Low Countries in the Eighteenth Century.” Past & Present 218, no. 1 (February 1, 2013): 91–126. doi:10.1093/pastj/gts021.

Further information

London Lives Paupers and Petitioners project

All the data for this paper is shared under Creative Commons licences and can be downloaded:

(The removal orders dataset includes the data for the linkage between the three sources as used in this paper.)

Remixing and Remaking Digital History: the London Lives Petitions

For those of you who like such things, this post explores the rationale and methodology for my work on London Lives Petitions: it’s a revised/extended version of my paper at the Digital Humanities Congress, September 2016, in the session on Adding Value: Challenging Practical and Philosophical Assumptions in the Digitisation of Historical Sources. You can also find the slides here (pdf).


If there’s one assumption I’d like to challenge here, it’d be that the digitisation of historical sources is all about this sort of thing:

some websites

Working on massive online history resources (Old Bailey Online, London Lives, Connected Histories, The Digital Panopticon, et al) has been keeping me occupied for the last 10 years; I very much like that state of affairs and I believe deeply in the importance of the work my colleagues and collaborators and I do at the Humanities Research Institute. But I also believe that online resources are just one facet of the digitisation of history. Our work should be a beginning, part of ongoing dialogues, adaptations and conversions, not the final word.

The good news is that both Old Bailey Online and London Lives data have been getting remixed almost as long as they’ve existed. For example, they were included in the federated search project Connected Histories, the GIS project Locating London’s Past, and in our current, massive record linkage project Digital Panopticon. And not just by us: there is the Old Bailey Corpus, and Old Bailey Online is also in the federated search sites 18thConnect and NINES. It’s also recently started attracting the attention of mathematicians and statisticians and this year has been used as a resource in a course on Scalable Data Science.

Re-use of the London Lives data outside our own domain is much less extensive, but parts of it have been used by Adam Crymble, Tim Hitchcock and Louise Falcini for a project and dataset on 18th-century vagrant lives (which we’re including in Digital Panopticon). And in fact it was their project and approach to data sharing that really got me thinking about the possibilities of remixing London Lives data on a smaller scale than our huge collaborative (and hugely funded) projects: extracting and reshaping sub-sets of data that are more manageable but nonetheless too large (tens of thousands of records rather than hundreds or low thousands) to work with entirely by hand. The London Lives Petitions Project (hereafter LLPP) is one of the results.

London Lives

London Lives is a major digital edition of a range of primary sources about eighteenth-century London, with a particular focus on the poor and crime. The project’s approach to digitisation was designed around an explicit research agenda: to foreground the lives and experiences of non-elite people and to de-emphasise institutions. More practically, the scale of the enterprise necessitated a pretty single-minded discipline to get it done. Of course we aimed to create a resource with more general usefulness, but those were the key conceptual and material factors underpinning source selection and setting the priorities for how sources would be digitised.

That meant: full text transcription followed by marking up in XML to make specific things searchable: the names of people and places, dates and occupations or social status, to facilitate nominal record linkage. And not (for example): paying detailed attention to institutional categories or structures, or cataloguing documents as archivists might do.

The result is a website (as the name suggests) that makes it easy to look for people, link together records about an individual’s life and even group related people together. But it can be harder to use for subjects that are related by other categories or themes. The keyword search is basic; there are no features to save and link, say, documents or places rather than names.

Also, the emphasis is on human judgment to make those links, and to answer questions like: what kind of person is this; how does this tagged name relate to other potentially relevant pieces of information in its vicinity? It’s been hard work to convert London Lives data for use in Digital Panopticon, which needs heavily structured name data for record linkage. So in the last couple of years we’ve been thinking a lot about ways to restructure, enhance, and build on the work we began a decade ago.

The other thing to note about London Lives is that we had to put a range of different kinds of records into a single framework, and some fitted better than others. Many of the records were bound volumes, coherent institutional products – registers, minute books, accounts, etc. A register for example is already quite structured, even when not tabular in layout; there is little ‘narrative’ language and you know what kind of info will appear where on each page.

But then there are the Sessions Papers.

18th-century Sessions of the Peace, presided over by magistrates, oversaw a wide range of administrative work in addition to criminal justice, including poor relief, trade and work regulations. They sat several times a year and after each meeting, the clerks would file assorted stuff from that session’s business into bundles that, ultimately, add up to a massive body of very diverse records. From three London courts (Middlesex, City of London and Westminster), London Lives has around 1250 session files (950 from the Middlesex Sessions, dwarfing all the rest) amounting to 86,000 document images, which include lists and calendars, witness examinations, petitions, court orders, accounts, and all sorts of miscellanea. (The Old Bailey Sessions Papers add another 13000 or so images but only about 20 petitions.)

Finding Petitions

Petitions are among the most common documents in those files: as it turns out, around 10,000 of them. Why are petitions interesting? The humble petition was everywhere in early modern Europe. Petitions were instigated by institutions, by groups, and by individuals, by elites and by paupers, and all sorts of people in between; they were direct appeals to powerful institutions or individuals to resolve a grievance or crisis. So they tell stories about lives and experiences; they aim to persuade, often to play off one source of authority against another. (work in progress bibliography)

The surviving documents are in many ways a pale shadow of the original interaction; we usually don’t know who actually wrote them, or how the voices of the petitioners might be filtered and mediated. Nonetheless, they have something to tell us about the agency of the governed and their relationships with and expectations of governments.

But petitioners’ stories, however creative, also had to conform to some formal conventions and employ certain forms of language. As a result, petitions form a potentially meaningful and findable textual corpus – if I could find the right strategies.

Just one example to underline why searching the London Lives website wouldn’t be that strategy (quite apart from scale!).

A keyword search of London Lives for ‘petition’ in Middlesex Sessions in the year 1690 returns 8 results, including 2 documents that are not petitions (although they are related). But the same keyword search and constraints in the current version of the LLPP dataset finds 11 petitions. And in total there are 66 petitions from Middlesex Sessions in that year.


[Confession time: I screwed up this example in the presentation; I said the total in LLPP for MiddS+1690 was 11, rather than 66. I somehow managed to forget the 11 results were only those including ‘petition’. Which is quite some difference. I thought it seemed low at the time…]

Why does the search miss so many? Many London Lives documents contain spelling variations, abbreviations, and not a few rekeying errors (which are not quite like OCR errors, but can cause similar problems for machine-readability). In fact, about one third of the LLPP petitions overall don’t contain a text string spelled ‘petition’ at all. Others do, but only as part of a longer word (‘petitioner’, etc), which the London Lives search would only find with a wildcard search (which is unavailable at the time of writing).

I put the texts into the neat little linguists’ concordancing tool Antconc to get a wordlist, which indicates there are, in total, several hundred possible variants of words with the stem ‘petition’. In fact it’s not really as bad as that suggests, since there are a small number of particularly common forms (and often a petition text will contain slightly varying repetitions, so at least one of the common forms is likely to occur somewhere). The two endings -tion or -con will find 90-95% of petitions. So, I could handle this particular issue without too much trouble by searching with regular expressions.
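To make the regular expression idea concrete, here is a minimal sketch in Python. It is not the code used for LLPP: the pattern, the function names and the sample strings are all my own illustrative assumptions, built only around the two common endings (-tion and -con) mentioned above.

```python
import re
from collections import Counter

# Assumed pattern, not the project's actual regex: "pet" or "pett", then "i",
# then one of the two common endings (-tion or -con), then any suffix such as
# "-er" or "-ers". The leading \b keeps out words like "competition".
PETITION_RE = re.compile(r"\bpet{1,2}i(?:tion|con)\w*", re.IGNORECASE)


def mentions_petition(text):
    """Return True if the text contains any matching 'petition' variant."""
    return PETITION_RE.search(text) is not None


def variant_counts(texts):
    """Count each matching word form (lower-cased) across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(m.group(0).lower() for m in PETITION_RE.finditer(text))
    return counts
```

A pattern like this would catch ‘Peticon’ and ‘Petitioners’ as well as ‘petition’, but it would still miss rarer spellings (a phonetic ‘petishon’, say), which is one reason the 90-95% figure is not 100%.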

But unfortunately that doesn’t deal with the problem of false positives. Many pages in the Sessions Papers that are not petitions contain ‘petition’ in some form: in fact if I simply search the entire Sessions Papers for ‘petition’ or ‘peticon’, my search will return more than 5000 pages that are not actually petitions (or in some cases, are continuation pages of multi-page petitions).

Keyword searching, extended with regular expressions, was a useful starting point for exploration, and it also highlighted just how many related documents the SPs actually contain – more than I think I’d realised. But I would obviously need a slightly smarter approach to identifying petitions.

So here’s a pretty typical petition, highlighting the formula parts of the document around the actual complaint of this petitioner. [I’ve already discussed how they work rhetorically in this earlier blog post but here I’m thinking about how they function as markers of document structure.]

Jane Browne’s petition (1691), LL: LSMSPS500100091

The example shows the elements or markers that are common to petitions (notwithstanding various minor spelling/word order variations). They aid both identification of a petition and location of its start and end, since there can be various annotations before and afterwards, including signatures:

  1. start (1): “To The Right Honourable/Worshipful/similar title…” [After “to the”, this line can be very variable; and also it’s quite often missing or damaged]
  2. start (2): “The humble petition of” [appears in the majority of petitions; ‘humble’ is sometimes omitted, and there can be a lot of small but annoying spelling variations]
  3. start of the main body of petition: “Humbly Sheweth that” (again, ‘humbly’ is optional).
  4. additionally, it’s worth noting that in the body of the text petitioners almost always refer to themselves in third person: “your (humble/poor) petitioner(s)”. ‘Humble’ and ‘humbly’ will appear somewhere along the line.
  5. the ubiquitous ending (though again it can have quite a lot of small variations): “And your petitioner(s) (as in duty bound) shall ever pray etc”

So there’s plenty there to track them down much more reliably (and, moreover, to identify their component parts), making it possible to let the computer find the bulk of easy ‘typical’ petitions and definite ‘not’ petitions, leaving a smaller set of ‘maybes’ for more manual sifting: a few hundred, rather than several thousand.
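As a rough illustration of that three-way sift, here is a Python sketch that counts how many of the formulaic markers a page contains. The patterns are deliberately simplified, and the threshold of three markers is an assumption chosen for the example rather than the rule actually used for LLPP; the very variable “To the…” line is left out because it is too common a phrase to be a useful test on its own.

```python
import re

FLAGS = re.IGNORECASE

# Simplified, assumed patterns for four of the markers listed above.
# Real documents have many more spelling variations than these allow for.
MARKERS = {
    "title": re.compile(r"\b(?:humble\s+)?pet{1,2}i(?:tion|con)\s+of\b", FLAGS),
    "sheweth": re.compile(r"\bsh[eo]weth\b", FLAGS),
    "petitioner": re.compile(
        r"\by(?:ou|o)?r\s+(?:\w+\s+)?pet{1,2}i(?:tion|con)er", FLAGS
    ),
    "prayer": re.compile(r"\bshall\s+ever\s+pray", FLAGS),
}


def markers_found(text):
    """Return the names of the markers that occur somewhere in the text."""
    return [name for name, pattern in MARKERS.items() if pattern.search(text)]


def classify(text, min_markers=3):
    """Sort a page into 'petition', 'maybe' or 'not petition'.

    min_markers is a hypothetical threshold, not a tested value.
    """
    found = len(markers_found(text))
    if found >= min_markers:
        return "petition"
    if found == 0:
        return "not petition"
    return "maybe"
```

A court order that merely refers to “the petition of John Smith” would hit only one marker and land in the ‘maybe’ pile for manual checking, which is the behaviour you want for the related-but-not-petition documents.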

And there are plenty of petitions that depart from the “typical” model to some degree: they might omit, or truncate, some of the expected conventions, use particularly idiosyncratic or archaic spellings, or have been penned by scribes whose handwriting was less than fluent (which is in turn likely to affect accuracy of rekeying).

Loyal readers of this blog may recognise this example:

Ester Cutler’s petition (1715). LL LMSMPS501460090

(The phonetic “sh” spelling in petitioner is really unusual: it appears in just 8 petitions in LLPP. The entire petition is full of equally unusual spellings, and I’m pretty sure Ester wrote her own petition – the signature matches the rest – which is also very rare.)

In the end, the “non-typical” only amount to around 4-5% of petitions. But they are a little different from the rest. They skew towards the first half (and possibly the first quarter) of the 18th century (as do variant spellings of ‘petition’), and towards petitioners I’m particularly interested in, lower-status individuals and women. Not perhaps by much: women make up 20% of identifiable petitioners in the ‘typical’ 95%, and 25% in the non-typical 5%; a small number overall, but for me, doing women’s history, finding those extra 100-odd women, like Ester, is quite a big deal.

Besides, at the very beginning, it wasn’t clear just how many non-typical petitions there would be – it could have been nearer 5000 than 500 (mind you, then I’d have been looking for a different method!). But it didn’t take long to establish that they would be a relatively small number, and I do think that in a different context – if this work had been part of a much larger project working to tight deadlines – it would be a valid decision not to spend substantial amounts of time sifting manually to find those hard cases – as long as you were transparent about your methods and their limitations. But for my own purposes, and my own satisfaction, I could weigh up that choice differently – as long as I remember that, however much I’m drawn to petitions like Ester’s, they are atypical. (And I do at least know in what ways they’re atypical, and can quantify that difference.)

So, having got this data…

What Now and Next

Remix Culture? Data transformations

A key element of the project has been sharing and documenting the data and the research in progress:

Firstly, the open data contains some fairly basic metadata for the petitions and the corpus of plain text files. (This has been released in stages; I deliberately put some initially very rough work in progress out there for a couple of reasons. I’m as prone as any historian to getting bogged down in perfectionism; making public much less than perfect data is slightly painful, but creates incentives to improve it rather than keep hiding it away. I think it’s also a practical way of emphasising how data creation is a process rather than an event, and underlining the importance of versioning and documentation.)

There has also been further processing of the data for analysis (some of which will end up in the open data):

a) work on the petition texts, primarily “VARDing” and trimming. VARD is a great tool: like a spell checker, but for early modern English. It’s trainable, though I was impressed at its accuracy straight out of the box. It makes mistakes; I wouldn’t use it to “correct” transcriptions; but it’s ideal for making a more regular version of the data for textmining and quantitative analysis. VARDing was followed by stripping out annotations, signatures and so on at beginning and end of petitions, for example to enable analysis of petition lengths (nb: link is a dataviz that may be slow to load).

b) work on improving the metadata, especially

  1. separating individuals’ petitions from institutional (especially ‘parish’) ones
  2. using the existing London Lives name tagging to identify petitioners and start linking petitions to related records
  3. in particular, linking petitions to related documents in the SPs, especially orders, so that I can examine responses – these don’t exist for all petitions but there do seem to be a lot more than I initially was aware of; as well as to related records elsewhere in London Lives, like pauper examinations.
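The trimming step mentioned under (a) above relies on the same formulaic openings and endings that help identify petitions in the first place. The following Python sketch shows the general idea only: the two patterns, the function names and the fallback behaviour (keep everything if a marker is missing) are my assumptions for illustration, and the VARD step is not shown at all.

```python
import re

# Assumed start markers: the address line ("To the ...") or the title line
# ("The humble petition of ..."), whichever comes first in the text.
START_RE = re.compile(
    r"\bto the\b|\b(?:humble\s+)?pet{1,2}i(?:tion|con)\s+of\b", re.IGNORECASE
)
# Assumed end marker: "shall ever pray", optionally followed by "&c" or "etc".
END_RE = re.compile(
    r"\bshall\s+ever\s+pray\w*(?:\s*(?:&c|etc)\.?)?", re.IGNORECASE
)


def trim_petition(text):
    """Cut annotations before the opening formula and after the closing one."""
    start = START_RE.search(text)
    begin = start.start() if start else 0
    # Look for the closing formula only after the opening one.
    end = END_RE.search(text, begin)
    finish = end.end() if end else len(text)
    return text[begin:finish].strip()


def word_count(text):
    """Length of the trimmed petition in whitespace-separated words."""
    return len(trim_petition(text).split())
```

Because the address line is so often missing or damaged, a real version would need to treat ‘maybe’ cases more carefully than this; falling back to the untrimmed text at least avoids silently throwing material away.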

Finally, something I’ve not really got very far with yet: identifying what petitions are about and exploring meanings. (Some early attempts at topic modelling didn’t work very well, another reason I needed to create the VARDed and trimmed version of the data.) Other sessions on text analytics and linguistic tools at the conference gave me new ideas, although this still feels like a whole new and slightly intimidating challenge.

Concluding Thoughts

“Remixing” digitised history is something that historians do all the time, when they search online resources and copy whatever results seem relevant into their own spreadsheets and databases. But I’m not sure that they’re always doing it with the best tools for the job, or with the critical understanding they need of those resources and their limitations. Laborious “search-select-copy-paste” is fine if a resource is simply a supplement to your main sources. It becomes less appropriate if the resource is your main source, you’re using it on a large scale, or you intend to make quantitative (including implicitly quantitative) arguments based on the results. It is possible to use online search critically, but difficult without some knowledge of the underlying sources, the ability to compare different resources for the same material, and/or the time and willingness to explore different searches and methodically compare results (for a brilliant example, see Charles Upchurch, ‘Full-Text Databases and Historical Research: Cautionary Results from a Ten-Year Study’, J. Soc. Hist, 2012).

On the other hand, self-conscious digital historians (and digital humanists) are making strong critiques of online search as a methodology. “Search struggles to deal with what lies outside a set of results”, as Stephen Robertson points out. Ted Underwood argues, similarly, that “Search is a form of data mining, but a strangely focused form that only shows you what you already know to expect”.

But it seems to me that Digital Humanities-based answers to this problem often focus on the application of advanced distant reading techniques to the interpretation of Big Literary Data. I think those critiques and techniques are vitally important, but even so, the usefulness of learning how to employ them can seem rather less obvious to a social historian grappling with creating some usable research data out of digitised forms of the archival detritus of governance than to those lucky bastards screwing around with a million books (pdf). Miriam Posner has argued that what many digital humanities scholars really need, before they can get on to the fun stuff, is a lot more help with and tools for finding, cleaning and modelling data. (As she says, and Adam Crymble reminded us at the session, ‘that garbage prep work’ is Digital Humanities too!)

The London Lives Sessions Papers can in one sense be considered big data in that there’s too much material for a person, realistically, to read all of it manually (and make sense of it). But they aren’t Big Data like a million books is Big Data. And having eventually made my dataset, I certainly want to try out that kind of analysis on the petitions texts and explore what’s possible; but I also need to do nominal record linkage and to study petitioners. My methodology for discovering petitions has been, when you get down to it, an extended kind of search. But at least for working on data creation at this sort of medium-to-biggish scale, once you’re freed from the constraints of a large database optimized for web delivery, you can get a long way screwing around with search.