WHM18: Middlesex Vagrants in the 18th century

My final data visualisation post for this Women’s History Month is back in the 18th century and takes a look at an open dataset, Vagrant Lives: 14,789 Vagrants Processed by the County of Middlesex, 1777–1786, which was created by Adam Crymble, Louise Falcini and Tim Hitchcock, using data from London Lives.

This dataset makes accessible the uniquely comprehensive records of vagrant removal from, through, and back to Middlesex, encompassing the details of some 14,789 removals (either forcibly or voluntarily) of people as vagrants between 1777 and 1786. It includes people ejected from London as vagrants, and those sent back to London from counties beyond.

They’ve already written about this data in an excellent article (open access) and Crymble has blogged further about his ongoing research. (They have better visualisations too, so you could skip this post entirely and go to the real thing. Think of this as a taster.)

I want to focus on ways of visualising multiple categories of qualitative information – the more categories you want to compare at the same time, the more complex a dataviz has to be. In this case, I’ve got four categories to play with: gender, dates, countries of origin, and vagrant ‘types’. That’s to say, there are three types of individual in the dataset: leaders of family groups, their dependents, and single vagrants. The gender of the majority of dependents is unknown (most are children), so for most of this post, I decided to simplify things by filtering out all of the dependents to focus on the group leaders and singles. (As a result, because I’m ignoring about 500 wives who were counted as dependents, the following will differ somewhat from the work referenced above.) This resulted in 10963 individuals.

Overall, the gender ratio of the vagrants looks almost perfectly balanced (5438 female to 5525 male). But this hides some interesting variations.

Figure 1: Bar chart comparing the numbers of male and female lead/single vagrants.

Firstly let’s break it down by the year of the case. (There are some missing records, and the very small numbers in 1777 and 1779 in particular are due to these gaps.) Two things stand out: the numbers of both female and male vagrants rise rapidly in the mid-1780s; and women are in the majority each year until 1782, after which they’re overtaken by men.

Figure 2: Bar chart showing numbers of male and female lead/single vagrants in each year.

Now looking at vagrant type. As soon as you have multiple categories, you can split up the data in different ways – the “best” can depend on the data and exactly what it is you want to show. So graph 3a compares the percentages of male and female vagrants for each vagrant type, whereas graph 3b shows the percentages of group and single for each gender. 3b highlights that the majority were single individuals – something you wouldn’t know at all from 3a. It also makes it clear that vagrant type was gendered – considerably more men than women were singles. 3a, on the other hand, is better if you want to know exactly what the proportions of men and women were in each type. Most often, if I had to pick just one of these, it’s likely that I’d plump for 3b, because I’ve already seen that overall there are very similar numbers of men and women. But it might be a harder choice if that weren’t the case.

Figure 3a: Stacked bars comparing the proportions of men and women for each vagrant type.
Figure 3b: Stacked bars comparing the proportions of vagrant types for each gender
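
The difference between the two graphs comes down to which totals you divide by. Here is a minimal Python sketch of that choice (the charts themselves were made in R; this is only an illustration). The counts are invented for the example and are NOT taken from the Vagrant Lives data, and the function name is my own.

```python
# Hypothetical counts, invented purely for illustration.
# They are NOT taken from the Vagrant Lives dataset.
counts = {
    ("female", "group"): 30,
    ("female", "single"): 60,
    ("male", "group"): 10,
    ("male", "single"): 90,
}


def percentages_within(counts, by):
    """Express each cell as a percentage of the total for one dimension.

    by=0 divides by the gender totals (the logic of graph 3b);
    by=1 divides by the vagrant-type totals (the logic of graph 3a).
    """
    totals = {}
    for key, n in counts.items():
        totals[key[by]] = totals.get(key[by], 0) + n
    return {key: round(100 * n / totals[key[by]], 1) for key, n in counts.items()}


within_gender = percentages_within(counts, by=0)  # what share of each gender is single?
within_type = percentages_within(counts, by=1)    # what share of each type is female?

for key in counts:
    print(key, within_gender[key], within_type[key])
```

The same four counts give two quite different pictures: with these made-up numbers, women are three-quarters of the group leaders even though most women are single.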

Now, looking at country of origin (British and Irish vagrants only, as there were only a few from other countries), further striking differences emerge. It’s hardly surprising that the majority of the vagrants came from England, but much more noteworthy that there was such a large disparity between Irish men and women.

Figure 4: Stacked bars comparing origin countries for men and women.

Adam Crymble discusses what’s most likely going on, and it ties in with the particularly rapid increase in the numbers of male vagrants from 1783 shown in graph 2 – it’s probably the result of demobilisation after the American wars.

This says ‘demobilisation’ to me, and the male nature of most Irish vagrants suggests that this may have been a strategy for getting home after the war. Demobilisation was heavily centralized in London. Soldiers and sailors weren’t taken home; they were dropped off and left to find their own way.

Finally, I want to visualise the relationships between three categories in the data: gender, country and vagrant type. Mosaic plots are a more complex and less commonly used type of visualisation that can cram a lot more information into a single chart than you can with a bar chart. But, as with boxplots, that makes them a bit harder to interpret.

Figure 6: Mosaic plot of lead/single vagrants’ gender, country and type

Imagine that you start with a single large rectangular block. For your first category, you divide it horizontally, and put the labels for each “level” (in this case there are two, F and M, for gender) on the left hand Y axis. As in the very first bar chart, we can see that the proportions of men and women are close to equal.

Then you sub-divide the two blocks vertically for your second category (country) and put the labels along the top X axis. So reading left to right along each gender block, the first vertical block = English, the second = Irish, third = Scottish and fourth = Welsh. Again, we can see that English vagrants are in the majority for both genders, and at the same time, how a much higher proportion of the men are Irish.

Finally you sub-divide the blocks once again, horizontally, for the third category (vagrant type), and the labels for these (group and single) go on the right hand Y axis. The biggest single category, then, is women from England who are single (Hitchcock et al argue that short-distance female migration to London to find domestic service accounts for much of this). The smallest category is men from Wales who lead a group.

Male Irish and Welsh vagrants are more likely to be single than are men from England and Scotland, whereas a higher proportion of Irish and (even more so) Scottish women were heading groups. (Crymble has also emphasised how different the Irish and Scottish vagrants were.)

The use of colour and shading adds one final dimension, but it’s harder to interpret on first sight. The idea is to show statistical significance. What it boils down to is that blue means the square is bigger than would be expected by the statistical model; red means it’s smaller than the model would expect (and the darker the colour, the bigger the significance). The fact that the group-Irish-male box is coloured dark red (ie, smaller than “expected”) pretty much seems to reinforce what we’ve already observed. The group-Scottish-female box also stands out among the smaller blocks – suggesting that this is significant and might be further investigated.
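
To make “bigger or smaller than the model would expect” a bit more concrete, here is a rough Python sketch. It assumes the shading is based on Pearson residuals under a simple independence model, which is a common convention for mosaic plots but may not be exactly what produced the plot above. It also simplifies to two categories rather than three, and the counts are invented for illustration.

```python
import math

# Hypothetical 2x2 table (rows: gender, columns: vagrant type).
# Invented for illustration only; NOT the real Vagrant Lives counts.
observed = {
    "F": {"group": 30, "single": 60},
    "M": {"group": 10, "single": 90},
}


def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell.

    "Expected" is the count we would see if the row and column
    categories were completely independent of each other.
    """
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    grand_total = sum(row_totals.values())
    residuals = {}
    for r in rows:
        for c in cols:
            expected = row_totals[r] * col_totals[c] / grand_total
            residuals[(r, c)] = (table[r][c] - expected) / math.sqrt(expected)
    return residuals


residuals = pearson_residuals(observed)
for cell, value in residuals.items():
    # Positive = more cases than independence predicts (blue in the plot);
    # negative = fewer (red). Larger absolute values get darker shading.
    print(cell, round(value, 2))
```

The key point is that “expected” here only means “expected if the categories had nothing to do with each other”, which is exactly the assumption a historian needs to think about.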

However, it’s important to understand whether what the statistical model “expects” is appropriate for the data we have. In medical research, where data collection is conducted according to carefully defined rules, it may be possible to be confident that a statistical significance means a “real” difference. For a historian it might simply be pointing to imperfections in the data! So it’s essential for historians doing data analysis and visualisation to get to grips with both the original sources and the statistics. I’m still grappling with the second part…

Data and code on Github.

More about Mosaic plots and their interpretation:

WHM18: Women’s heights in the Digital Panopticon

I’ve recently been working on the Digital Panopticon, a digital history project that has brought together (and created) massive amounts of data about British prisoners and convicts in the long 19th century, including several datasets which include heights for women. Adult height is strongly influenced by environmental factors in childhood, one of the most important being nutrition. So,

The height of past populations can thus tell historians much about the conditions that individuals encountered in their formative years. Given sufficient data it is possible to glimpse inside households in order to piece together a history of the impact that declining wages, rising prices, improvements in sanitation and diminishing family size had on mean adult stature.

However, many studies of height and nutrition in 18th- and 19th-century Britain focused on military records and therefore had little to say about women. The turn to using the rich records of heights for men and women (and children) in 19th-century penal records has been more recent.

Today’s post is going to look at height patterns in four Digital Panopticon datasets, mainly using a kind of visualisation that many historians aren’t familiar with: box plots. If you’ve seen them and not really understood them, it’s OK – I didn’t have a clue until quite recently either! And so, I’ll start by attempting to explain what I learned before I move on to the actual data.

A box plot, or box and whisker plot, is a really concentrated way of visualising what statisticians call the “five number summary” of a dataset: 1. the median (the middle value); 2. upper quartile (the value below which three-quarters of the data fall, ie the median of the upper half); 3. lower quartile (the value below which a quarter of the data fall, ie the median of the lower half); 4. minimum value; and 5. maximum value.

Here’s a diagram:

The thick green middle bar marks the median value.  The two blue lines parallel to that (aka “hinges”) show the upper and lower quartiles.  The pink horizontal lines extending from the box are the whiskers. In this version of a box plot, the whiskers don’t necessarily extend right to the minimum and maximum values. Instead, they’re calculated to exclude outliers which are then plotted as individual dots beyond the end of the whiskers.

So what’s the point of all this? Imagine two datasets: one contains the values 4,4,4,4,4,4,4,4 and the other 1,3,3,4,4,4,6,7. The two datasets have the same averages, but the distribution of the values is very different. A boxplot is useful for looking more closely at such variations within a dataset, or for comparing different datasets, which might look pretty much the same if you only considered averages.
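
Here is a small Python sketch of that example, using only the standard library (Python 3.8 or later). The two lists are the ones given above. The quartile method and the 1.5 × IQR cut-off for outliers are my assumptions: they are common conventions, but different tools calculate quartiles in slightly different ways, so don’t be surprised if another package gives marginally different numbers.

```python
from statistics import mean, median, quantiles

uniform = [4, 4, 4, 4, 4, 4, 4, 4]
spread = [1, 3, 3, 4, 4, 4, 6, 7]


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


def outliers(values, k=1.5):
    """Values lying more than k * IQR beyond the quartiles (assumed rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


print(mean(uniform), mean(spread))      # 4 4
print(median(uniform), median(spread))  # 4.0 4.0
print(five_number_summary(uniform))     # (4, 4.0, 4.0, 4.0, 4)
print(five_number_summary(spread))      # (1, 3.0, 4.0, 4.5, 7)
print(outliers(spread))                 # [7]
```

The means and medians are identical, but the five number summaries are not, and under the assumed rule the value 7 would be drawn as a separate dot beyond the whisker.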

These are the four datasets:

  • HCR, Home Office Criminal Registers 1790-1801, prisoners held in Newgate awaiting trial (1226 heights total, 1061 aged over 19)
  • CIN, Convict Indents 1820-1853, convicts transported to Australia (17183 heights, 14181 over 19)
  • PLF, Female prison licences 1853-1884, female convicts sentenced to penal servitude (571 heights, 535 over 19)
  • RHC, Registers of Habitual Criminals 1881-1925, recidivists who were under police supervision following release from prison (12599 heights, 12118 over 19)

For each dataset, I only included women who had a year of birth, or whose year of birth could be calculated using an age and date, as well as a height. (I say “heights” above because I can’t guarantee that they are all unique individuals; but nearly all of them should be.) In all the following charts I’m including only adult women aged over 19.

Here’s what happens when you plot the heights for each birth decade in RHC.

(This is generated using the R package ggplot2, and it looks a little bit different from many examples you’ll see online because ggplot has a nice feature to vary the width of the boxes according to the size of the data group.)

The first thing I look for is incongruities that might suggest problems with the data, and on the whole it looks good – the boxes are mostly quite symmetrical and none of the outliers is outside the realms of possibility (the tallest woman is 74.5 inches, or 6 foot 2 1/2, and the shortest is 48 inches), though I’m slightly doubtful that there were women born in the 1800s in this dataset, which gets going in the 1880s; still, they’re a very small number so unlikely to skew things much overall. Since the data seems to be OK on first sight, the interesting thing to note here is that from the 1850s onwards, the women are getting taller, and those born in the 1890s are quite a lot taller than the 1880s cohort. This is fairly consistent with Deb Oxley’s (more fine-grained) observations of the same data.

Here’s CIN:

Again, we have a reasonable spread of heights and fortunately a very small number of slightly questionable early births. (It happens to be the case that this data was manually transcribed, whereas RHC was created using Optical Character Recognition – but on the other hand, the source for RHC was printed and much more legible than the handwritten indents.) Ignoring for now the very small groups before the 1770s, the tallest decade cohort of women in this data is those born in the 1790s and thereafter they get consistently shorter.

Let’s put all four datasets together! (click on the image for a larger version)

I’ve filtered out women born before 1750 and after 1899, because the numbers were very small, and some extreme outliers (more about those later…). Then I added a guideline at the median for the 1820s (the mid-point), as I think it helps in seeing the trends.

It might seem surprising at first that the late 18th-century women of HCR are taller than any subsequent cohorts until the 1890s. Yet the trends here are broadly consistent with the pioneering research by Roderick Floud et al on British men and boys between 1740 and 1914. They argued “that the average heights of successive birth cohorts of British males increased between 1740 and 1840, fell back between 1840 and 1850, and increased once again from the 1850s onwards” (Harris, ‘Health, Height and History’). The British population was less well-fed for much of the 19th century (as food resources struggled to keep up with rapid population growth), and it got smaller as a result. Our women’s growth after 1850 may be slower than for the men (until the 1890s) though; perhaps it took longer for women than men to start growing again.

Finally, though, I have to put in a big caveat about the HCR data. I mentioned that I excluded some extreme outliers from the chart above. HCR was by far the worst offender, and if you look closely at the 18th-century cohorts covered by HCR, the boxes aren’t quite as symmetrical as the 19th-century ones. If we visualise it using a histogram (another handy one for examining the distribution of values in a dataset), we can see more clearly that there’s something up. A ‘normal’ height distribution in a population should look like a “bell curve” – quite tightly and symmetrically clustered around the average. CIN and RHC are close:

But this is what HCR looks like. This is not good.

If we’re lucky, much of the problem could turn out to be errors in the data which can be fixed. After all, it’s at least roughly the right kind of shape! The big spike at 60 inches (5 feet) rings plenty of alarm bells though. It looks reminiscent of a problem we have with much of the age data in the Digital Panopticon, known as “heaping”, a tendency to round ages to the nearest number ending in 0 or 5 (people often didn’t know their exact dates of birth). The age heaping is very mild in comparison to this spike, so I think it could well be another issue with either the transcription or the method used to extract heights. But if it turns out that’s not the case, this could be pretty problematic. We’re assuming the prisoners were properly measured, but we don’t know anything about the equipment used. For all we know, it might often have been largely guesswork. In the end, we might find that HCR simply isn’t reliable enough to use for demographic analysis. There’s very little height data for women born in the 18th century, so this is a potentially really important source. But what if it’s not up to the job?
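
One quick way to put a number on heaping is to compare how many values fall on the “round” numbers with how many you would expect if values were spread evenly. The Python sketch below is a simplified check of my own, loosely in the spirit of Whipple’s index for ages; it is not the method used by the project, and the sample values are invented.

```python
def heaping_ratio(values, multiple=5):
    """Share of values on exact multiples, relative to an even spread.

    A result near 1 suggests no heaping; 2 would mean twice as many
    values on the multiples as an even spread would produce.
    """
    on_multiple = sum(1 for v in values if v % multiple == 0)
    return on_multiple * multiple / len(values)


# Hypothetical ages, invented for illustration only.
ages = [20, 21, 25, 30, 30, 32, 35, 40, 40, 43]
print(heaping_ratio(ages))  # 3.5

# Hypothetical whole-inch heights, checking for a pile-up on exact feet.
heights = [58, 59, 60, 60, 60, 60, 61, 62, 63, 64, 65, 66]
print(heaping_ratio(heights, multiple=12))  # 4.0
```

A ratio like this is only a rough flag: real heights aren’t evenly spread across inches, so a proper check would compare the spike against the neighbouring values in the histogram.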

Data on Github.

Further reading

John Canning, Statistics for the Humanities (2014), especially chapter 3.

Introduction to Statistics: Box plots

The Normal Distribution

H Maxwell-Stewart, K Inwood and M Cracknell, ‘Height, Crime and Colonial History’,  Law, Crime and History (2015).

Deborah Oxley, David Meredith, and Sara Horrell, ‘Anthropometric measures of living standards and gender inequality in nineteenth-century Britain’, Local Population Studies, 2007.

Deborah Oxley, Biometrics, http://www.digitalpanopticon.org (2017).

Bernard Harris, ‘Health, Height, and History: An Overview of Recent Developments in Anthropometric History’, Social History of Medicine (1994).

Jessica M. Perkins et al, ‘Adult height, nutrition, and population health’, Nutrition Reviews (2016).

WHM18: Women’s Army Auxiliary Corps

Today I want to go on an excursion in “catalogues as data”. The UK National Archives’ Discovery catalogue is an excellent resource for this activity, because a) it has a lot of records that have document descriptions at ‘item’ or ‘piece’ level in the catalogue, containing quite structured information (like dates, places, occupations) that can be quantified and visualised; and b) even more importantly, it has an export function that allows you to download up to 10,000 records in CSV format. (It also has a full API for those with some programming skills, but 10,000 records will get you a long way, and you can often break up larger collections into chunks, eg with date filters).

You’ll need to use the Discovery advanced search quite carefully to get the right set of search results (it enables specification of particular records, dates, catalogue level, etc) – there are some useful tips here. Then you’re quite likely to need to use a tool like OpenRefine to separate out pieces of information into separate data fields and clean/normalise dates etc (check out this tutorial).

I wandered around TNA’s online guides for some records about women that I know nothing about, and the Women’s Army Auxiliary Corps 1917-20 (WO 398) caught my eye:

the service records of more than 7,000 women who joined the Women’s Army Auxiliary Corps (WAAC) between 1917 and 1920… The WAAC became the QMAAC in April 1918 and was disbanded in September 1921

At 7000 records, this sounded like a good size set to play around with, well within the download limits. And a look at a catalogue entry showed that it has some nice information beyond women’s names (unlike a similar and larger series, WO 399, which has only transcribed names). Given just a few hours’ work extracting and cleaning the data, what could I learn?

Record for Aaron, Sarah Ann nee Phillips
Place of Birth: High Street Cefn Mawr, North Wales
Date of Birth: 22 August 1894

First, what does this actually offer in terms of usable data? The date of birth is an obvious one: closer inspection shows that it’s in a consistent format where there’s a full date (the majority); at least a year is provided in almost every case, and that can be extracted into a standard year of birth field quite easily. Place of birth also has potential, but it’s more varied and needs more cleaning, so I haven’t done anything with that yet; but it could make for an interesting mapping exercise. Less obviously perhaps, “nee Phillips” suggests that – if you can safely assume women always gave this information! – it’s possible to also infer something about whether a woman was (or had been) married. Another nice little thing you could also potentially do, given birth dates and first names, is to look for patterns in baby naming (although this might really need a larger dataset).

Two caveats, one major and one more minor:

  • The online guide makes it clear that these 7000 records are only a small minority of the original collection (57000 records), as many were destroyed in a WW2 air raid. So it might not be representative of the women recruited.
  • Errors in the data – which you always have to look out for, even in the best quality material. In this case, there were a few obvious transcription errors in the birth dates. We can be 100% certain that birth years of 1822, 1917-18 and 1988 are just wrong. But actually more problematic are outliers that look unlikely but not quite impossible: 1844? 1903? Fortunately, they account for a tiny number of records. There were also 278 recorded as numbers like 18880 or 18930: I concluded that these were actually meant to be year dates to which somehow an extra zero had been added and corrected them accordingly.
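
I did the extraction and cleaning with other tools, but the logic can be sketched in a few lines of Python. The function names are my own and the rules are a simplified version of what’s described above: take a four-digit year from the date text, treat a five-digit number like 18880 as a year with a stray extra zero, and flag anything outside a plausible range for manual checking (the range of 1860 to 1903 is the cut-off I ended up using for this dataset).

```python
import re


def extract_birth_year(text):
    """Pull a birth year out of a free-text date field, or return None."""
    # A five-digit number such as 18880 is read as 1888 plus a stray zero.
    match = re.search(r"\b(1[89]\d{2})0\b", text)
    if match:
        return int(match.group(1))
    # Otherwise take the first four-digit year from the 1800s or 1900s.
    match = re.search(r"\b(1[89]\d{2})\b", text)
    return int(match.group(1)) if match else None


def is_plausible(year, earliest=1860, latest=1903):
    """Flag years outside the assumed plausible range for WAAC recruits."""
    return year is not None and earliest <= year <= latest


for raw in ["22 August 1894", "18880", "1988", "unknown"]:
    year = extract_birth_year(raw)
    print(repr(raw), year, is_plausible(year))
```

Automated rules like this only catch the obvious cases; the “unlikely but not quite impossible” years still need a human decision.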

Visualisation is often particularly useful for highlighting errors and problems in your data. But it’s the researcher who has to decide what to do about such anomalies (and whether they might even be serious enough to make the whole dataset too unreliable to be worth using).

I initially hoped that the record dates would represent specific dates when women joined up, but as it turned out there was only a covering date for the series as a whole. Since it only covers 4 years, that’s not really an issue; instead I simply worked out their ages in 1918 (assuming that there wouldn’t have been new recruits after the war ended anyway), and filtered out the half-dozen supposedly born before 1860 or after 1903.

And so the thing I learned today is that, gosh, they were so young.

As visualisations, tables may be less eye-catching than graphs, but they have the virtue of presenting a lot of precise information in a relatively small space; the table at the bottom of this post shows that about 72% of the women were aged 25 or under in 1918 and about 86% were under 30. Very few of them were old enough to take advantage of the limited extension of voting rights to women at the end of the war.
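
If you want to check cumulative figures like these for yourself, here is a short Python sketch that works directly from the counts in the table at the bottom of this post. The function name is my own.

```python
# Counts copied from the table at the end of this post:
# age in 1918 -> number of women.
age_counts = {
    15: 1, 17: 4, 18: 427, 19: 865, 20: 737, 21: 774, 22: 710, 23: 624,
    24: 506, 25: 396, 26: 325, 27: 242, 28: 234, 29: 177, 30: 150, 31: 129,
    32: 108, 33: 98, 34: 79, 35: 48, 36: 63, 37: 34, 38: 49, 39: 44,
    40: 35, 41: 27, 42: 10, 43: 21, 44: 16, 45: 21, 46: 8, 47: 6,
    48: 8, 49: 6, 50: 2, 51: 2, 52: 4, 53: 1, 54: 4, 55: 2, 57: 1,
}


def share_aged_up_to(counts, max_age):
    """Percentage of all women whose age in 1918 was max_age or younger."""
    total = sum(counts.values())
    included = sum(n for age, n in counts.items() if age <= max_age)
    return round(100 * included / total, 1)


total_women = sum(age_counts.values())
print(total_women)                       # 6998
print(share_aged_up_to(age_counts, 25))  # 72.1
print(share_aged_up_to(age_counts, 29))  # 86.1
```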

This is confirmed by a bit of background reading – according to Lucy Noakes on Women’s Mobilization for War (Great Britain and Ireland), “the majority of recruits to the WAAC were young working class women”. If we can reasonably assume that the information given about maiden names is a complete record, or anywhere near it, the vast majority of the women were also unmarried – nearly 95% of them overall. I suspect that very few married women would have volunteered for this type of service (which was likely to take them overseas and close to combat), and as a result it might be expected that the majority would be young – very likely younger, on average, than male soldiers. You can also see that a considerably higher proportion of the women aged over 25 were/had been married – but it still looks a very low proportion compared to what you might expect in the general population (and I wonder if quite a lot of these were widows).

I’m not exactly surprised to learn from Noakes that their youth (and, no doubt, class) resulted in some negative perceptions:

In the public mind however, they were sometimes perceived as thrill seekers, drawn by a desire for adventure and romance, and recruitment to the service suffered from fears that women were finding opportunities for sexual liaisons with the soldiers. So worried was the government by these rumours that a Commission of Enquiry was formed, which included figures showing the number of pregnancies amongst unmarried members of the WAAC was lower than among unmarried civilians…

Queen Mary’s Army Auxiliary Corps (Art.IWM PST 13167):
“The GIRL behind the man behind the gun” (!) © IWM

Data on Github.

The ages of women recruited to the WAAC/QMAAC, 1917-18

Age in 1918 number of women % of total
15 1 0.0
17 4 0.1
18 427 6.1
19 865 12.4
20 737 10.5
21 774 11.1
22 710 10.1
23 624 8.9
24 506 7.2
25 396 5.7
26 325 4.6
27 242 3.5
28 234 3.3
29 177 2.5
30 150 2.1
31 129 1.8
32 108 1.5
33 98 1.4
34 79 1.1
35 48 0.7
36 63 0.9
37 34 0.5
38 49 0.7
39 44 0.6
40 35 0.5
41 27 0.4
42 10 0.1
43 21 0.3
44 16 0.2
45 21 0.3
46 8 0.1
47 6 0.1
48 8 0.1
49 6 0.1
50 2 0.0
51 2 0.0
52 4 0.1
53 1 0.0
54 4 0.1
55 2 0.0
57 1 0.0

International Women’s Day 2018: Women in the World Bank Data

The second instalment in this series of data visualisation posts for Women’s History Month 2018 looks at the World Bank World Development Indicators (WDI). This massive collection has data in several categories: demographic, education, work, poverty, health. It includes both country-level data and various aggregates by different criteria: geographical regions, income levels, etc. The UK Data Service has a useful guide as well as access to the data. You can also download it directly from the World Bank website (and it has an API which I haven’t tried), and there are tools like R packages.

A lot of the data is relevant to women’s and gender history, so much so that gender has its own data portal. I’ve selected just a handful of significant indicators with the most comprehensive coverage (life expectancy, fertility, education), and I’ve done two series of graphs: the first uses the World Bank income level groupings, and the second takes a selection of six countries (chosen because they are varied geographically, culturally, in terms of income and in their data patterns, and because they have good data coverage, not for any kind of representativeness).

I’m sure there are no surprises here for people who study global development, but for me at least it’s been an educational experience. There seems to be quite a lot of good news for women in this data. The bad news is the sheer levels of inequality between income regions in many of the indicators.

Life expectancy for men and women

Life expectancy is one of the most long-running series in the data; most countries have it from 1960 onwards. This is a ‘faceted’ graph comparing female and male life expectancy at birth in the five income groups (and the world as a whole). The familiar observation that women live longer than men is not just a “Western” phenomenon, although it appears that the wealthier the country, the bigger the gap. The size of the continuing gap between the richest and poorest countries is one thing that has not changed much.

The second graph uses the same format to look at the six countries, and at country level there is more variation – even dips or periods of stagnation that counter the general upward trends.

(Oops. I forgot to make nice labels for the y axes. That should be “years from birth”.)

This graph shows the data for the six countries in a different way. I think it’s a bit less clear in some respects, but it’s useful for comparing the countries at particular times, and how their trajectories vary.

Fertility rates

Fertility rates are also widely available from 1960. Women everywhere are having fewer babies than they were 60 years ago, but the most rapid falls have been in middle income countries. Again, the six countries show more variation.

The education of girls

The World Bank’s earliest education data starts a bit later than the demographic indicators, from 1970 onwards. The most notable feature of this is the convergence across every income group at primary school level, and secondary education is not far behind except in the poorest countries. (Watch out for the difference in the scale of the y axis, and the perspective, of the two charts…)

There are a few more gaps appearing in the country data, especially in the secondary school data. Gaps can be frustrating of course, but they’re important for highlighting questions I’d need to be asking about how the data was collected, and about the calculation of aggregates.

Primary education

Secondary education

Teachers

Mostly, employment data in the WDI doesn’t get going until the 1990s. But the education data does contain some information about the gender of primary school teachers. (One of the oddest gaps in the country data is that this isn’t recorded for Norway. The data for secondary school teachers is even bittier and I decided not to include it.)

I already knew that in Britain primary school teaching is a heavily female profession, but I didn’t quite realise how far this extends across high income countries generally. The global trend seems to be in the same direction, but much more slowly in the low income countries group.

Code and data on Github.

WHM18: Westminster Coroners’ Inquests 1760-99

For Women’s History Month 2018 I plan to make some visualisations of women’s/gender history datasets. I don’t know how often this will happen yet, but hope there’ll be at least one a week.

I’ll blog about them here and post code and data over at Github. The datavizes won’t necessarily be very sophisticated or original, but I’ll try to highlight different kinds of graph and how they can be used to explore patterns in data and start to formulate ideas and stories a historian might want to tell. Some of the datasets are likely to be closely related to my research interests and recent projects, but I hope to find new material from various sources and have some fun with stuff I don’t know much about!

There are likely to be two types of post:
  • focusing on women
  • comparison of women’s and men’s experiences

Today’s opening instalment is of the second kind. I’ve started with this one because it’s data I recently created and this is my first chance to do something with it!

***

Eighteenth-century inquests were usually held within a few days of a sudden, violent, accidental or unexplained death, at a local alehouse, parish workhouse or the location of the death itself. The dataset I’ve created contains a wealth of data that could be visualised, including locations, dates and verdicts. Here I’m just going to focus on a few comparisons of patterns for male and female deceased, using a range of graph types which, as you’ll see, highlight information in different ways. The dataset contains a total of 2894 inquests. A very small number were of more than one person/gender unknown or mixed; I’ve removed those (and for convenience I’m going to pretend that for the remaining 2891, 1 inquest = 1 person, though there might be a few exceptions). 361 were children.

The first graph shows annual counts of male and female inquests. It’s clear that there were considerably more inquests for male decedents than for female, and that the number of inquests varied a lot from one year to the next.

The disparity is equally clear in a proportional stacked chart, which also shows that the gender ratio doesn’t vary greatly from year to year, despite the fluctuating totals. This is… curious.

When we take a look at the inquests on children, it can be seen that the ratios appear to be rather more even but also less consistent (which might be expected with smaller numbers). I’d need to look more closely at the numbers with this one, but it’s clear that there is less gender disparity. Why?

(Charts like the last two are brilliant for showing proportions, but you have to bear in mind that they hide the variations in actual numbers.)

Here is another type of stacked chart. The first graph showed the counts for each gender independently; this stacks them on top of each other so you can see both the total numbers and the gender proportions.

Now I want to take a look at whether there are gendered patterns in inquest verdicts. I can break this down in two different ways. The first shows a breakdown of verdicts for each gender. The most obvious thing to note here is that accidental deaths account for a much higher proportion of the male deaths, which seems likely to reflect that men were more likely to work in a range of dangerous manual trades. Meanwhile, a higher proportion of women than men committed suicide, were victims of homicide, or died of natural causes. (Another time, I might add a third column showing the pattern for all verdicts.)

My final chart flips the breakdown around to compare the gender ratios for each type of verdict. This helps to give a different perspective on to the gender differences.

Here it is again, but I’ve added a guideline showing the average male:female ratio (which is 72.6% to 27.4%). These two charts emphasise that compared to men and boys, women and girls are a) least likely to die in an accident and b) most likely to be victims of homicide – with nearly 40% of homicide verdicts, this is the one category that’s approaching gender parity. Considering that lethal violence was largely a masculine domain (both as offenders and victims), this is interesting – though we shouldn’t forget that homicides account for a small proportion of inquests.

So, based on these explorations I have a number of lines of enquiry to start investigating in more depth, including:

  • why are men so much more likely to die in circumstances that lead to an inquest?
  • homicide victims
  • the gendering of accidental deaths (will it turn out to be the case, as you might speculate, that men’s accidental deaths are more likely to be outside the home and in the context of waged work while women’s are more domestic?)
  • the differences between children and adults (is there generally a less gendered profile for children?)
  • I’d also look more closely at the annual fluctuations, and probably at seasonal variations as well

Defendants’ voices and silences in the Old Bailey courtroom, 1781-1880

This is a version of the paper I gave at the Digital Panopticon launch conference at Liverpool in September 2017.

In the interests of fostering reproducible research in the humanities, I’ve put all the data and R code underlying this paper online on Github – details of where to find them are at the end.

Defendant speech and verdicts in the Old Bailey

Defendants’ voices are at the heart of the Digital Panopticon Voices of Authority research theme I’ve been working on with Tim Hitchcock. We know that defendants were speaking less in Old Bailey Online trials as the 19th century went on; we’ve tended to put this in the context of growing bureaucratisation and the rise of plea bargaining.

I want to think about it slightly differently in this paper though. The graph above compares conviction/acquittal for defendants who spoke and those who remained silent, in trials containing direct speech between 1781 and 1880. It suggests that for defendants themselves, their voices were a liability. This won’t surprise those who’ve read historians’ depictions of the plight in which defendants found themselves in 18th-century courtrooms without defence lawyers, in the “Accused Speaks” model of the criminal trial (eg Langbein, Beattie).

But this isn’t a story of bureaucrats silencing defendants (or lawyers riding in to the rescue). I want to suggest that, once defendants had alternatives to speaking for themselves (ie, representation by lawyers and/or plea bargaining), they made the choice to fall silent because it was often in their best interests.

About the “Old Bailey Voices” Data

  • Brings together Old Bailey Online and Old Bailey Corpus (with some additional tagging, explained in more detail in the documentation on Github)
  • Combines linguistic tagging (direct speech, speaker roles) and structured trials tagging (including verdicts and sentences)
  • Single defendant trials only, 1781-1880
  • 20700 trials in 227 OBO sessions
  • 15850 of the trials contain first-person speech tagged by OBC

The Old Bailey Corpus, created by Magnus Huber, enhanced a large sample of the OBP 1720-1913 for linguistic analysis, including tagging of direct speech and tagging about speakers. [In total: 407 Proceedings, ca. 14 million spoken words, ca. 750,000 spoken words/decade.]

Trials with multiple defendants have been excluded from the dataset because of the added complexity of matching the right speaker to utterances (and they aren’t always individually named in any case). [But of course this begs the question of whether the dynamics and outcomes of multi-defendant trials might be different…]

Trial outcomes have also been simplified; if there are multiple verdicts or sentences only the most “serious” is retained. Also, for this paper I include only trials ending in guilty/not guilty verdicts, omitting a handful of ‘special verdicts’ etc.
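The "most serious outcome" simplification can be sketched roughly as follows. This is only an illustration: the severity ranking below is an assumption made for the example (the actual ordering used for the dataset is in the documentation on Github, not reproduced here), and `most_serious` is a hypothetical helper name.

```python
# Hypothetical severity ranking, most serious first. The ordering is an
# assumption for illustration, not the one used to build the dataset.
SENTENCE_SEVERITY = ["death", "transportation", "imprisonment", "fine", "none"]


def most_serious(outcomes, ranking=SENTENCE_SEVERITY):
    """Return the outcome that appears earliest in `ranking`, or None."""
    known = [o for o in outcomes if o in ranking]
    if not known:
        return None
    # The lowest index in the ranking is the most serious outcome.
    return min(known, key=ranking.index)


# A trial with two recorded sentences keeps only the harsher one.
print(most_serious(["imprisonment", "transportation"]))
```

The same function would work for verdicts by passing a different `ranking` list.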

Caveat!

Working assumption is that nearly all silent defendants do have a lawyer and the majority of defendants who speak, don’t.

Sometimes, especially in early decades, defendants had a lawyer and also spoke. Unfortunately, the OBC tagging doesn’t distinguish between prosecution and defence lawyers, and not all lawyer speech was actually reported.

But, more seriously, is it safe to assume that ‘silent’ defendants were really silent? Occasionally defendant speech was actually censored in the Proceedings (in trials where other speech was reported), eg a man on trial for seditious libel in 1822 whose defence “was of such a nature as to shock the ears of every person present, and is of course unfit for publication”. But that was a very unusual, political, case. (See t18220522-82 and Google Books, Trial of Humphrey Boyle)

[However, it was suggested in questions after the presentation that maybe the issue isn’t so much total censorship as in the case above, but that the words of convicted defendants might be more likely to be partially censored, which would problematise analyses that centre on extent and content of their words. This could be a particular problem in 1780s and 1790s; maybe less so later on.]

So work to be done here – eg, look at trials with alternative reports specifically to consider defendants’ words.

Distribution of trials by decade 1781-1880

Start with some broad context.

The number of cases peaked during the 1840s and dramatically fell in the 1850s. (Following the Criminal Justice Act 1855, many simple larceny cases were transferred to magistrates’ courts.)

Percentage of trials containing speech, annually

Percentage climbs from 1780s (in 1778 Proceedings became near-official record of court), peaks early 19th c and then after major criminal justice reforms of late 1820s swept away most of the Bloody Code, shown by red line, substantial fall in proportion of trials containing speech.

This was primarily due to increase in guilty pleas, which were previously rare. After the reforms, 2/3 of trials without speech are guilty pleas.

Conviction rates annually, including guilty pleas

(Ignore the spike around 1792, due to censorship of acquittals.) Gradual increase in conviction rates which declines again after mid 19th c.

But if we exclude guilty pleas and look only at jury trials, the pattern is rather different.

Conviction rates annually, excluding guilty pleas

Conviction rates in jury trials after the 1820s rapidly decrease – not much over 60% by end of 1870s. That’s much closer to 18th-century conviction rates (when nearly all defendants pleaded not guilty), in spite of all the transformations inside and outside the courtroom in between.

Percentage of trials in which the defendant speaks, annually

Here the green line is the Prisoners’ Counsel Act of 1836, which afforded all prisoners the right to full legal representation. But the smoothed trend line indicates that it had no significant impact on defendant speech. Defendants had, at the judge’s discretion, been permitted defence counsel to examine and cross-examine witnesses since the 1730s. Legal historians emphasise the transformative effect of the Act; but from defendants’ point of view it seems less important; for them it was already a done deal and the Bloody Code reforms were much more significant.

Defendant speech/silence and verdicts, by decade

This breaks down the first graph by decade – shows that the general pattern is consistent throughout period, though exact % and proportions do vary.

Defendant speech/silence/guilty pleas and sentences

Moreover, harsher outcomes for defendants who speak continue into sentencing. Pleading guilty (though bear in mind this only really applies to c.1830-1880, whereas silent/speaks bars are for whole period) is most likely to result in imprisonment, and much less likely to result in a transportation (and hardly ever a death) sentence. Defendants who speak are the most likely to face tougher sentences – death or transportation, more so than the silent.

(Don’t yet have actual punishments – the next big job is getting the linked Digital Panopticon life archives…)

Defendant word counts (all words spoken in a trial)

How much did defendants say? Not a lot. The largest single group of defendants is the silent (ie, 0 words). But even those who spoke usually didn’t say very much. [average overall was 55 words] Eloquent, articulate defendants few and far between!

Defendant word counts and verdicts

So if you did speak, it was better to say plenty!? Or in other words, more articulate defendants had a better chance of acquittal (though they were still slightly worse off than the silent).

Defences: average word counts and verdicts

Finish with focus on defendants’ defence statements – made by nearly all defendants who spoke and for the majority the only thing they did say (a minority questioned witnesses or made statements at other points in the trial).

Overall word counts of defence statements:

  • guilty (n=7696): average word count 44.97
  • not guilty (n=1414): average word count 65.15

On average, defence statements by the acquitted were longer. Again highlights that more articulate defendants do better.

Also, there is more variety (less repetition) in the statements of acquitted defendants: 97.17% (1374) of their 1414 defence statements are unique (crudely measured, as text strings), whereas 93.17% (7170) of the 7696 statements by convicted defendants are unique.
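A crude string-based uniqueness measure of this kind might look like the sketch below. Two things here are assumptions rather than the method actually used: "unique" is taken to mean the number of distinct strings divided by the number of statements, and case and whitespace are normalised before comparing. The sample statements are invented.

```python
def normalise(text):
    """Lower-case and collapse whitespace (an assumed normalisation step)."""
    return " ".join(text.lower().split())


def distinct_share(statements):
    """Share of statements that are distinct text strings after normalising."""
    distinct = {normalise(s) for s in statements}
    return len(distinct) / len(statements)


# Invented sample: the first two strings count as the same statement.
sample = ["I beg for mercy", "i beg for  mercy", "I was not there"]
print(round(100 * distinct_share(sample), 2))
```

A stricter alternative would count only statements that occur exactly once, which gives a lower figure whenever anything is repeated.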

Start to look more closely at what they say? Not possible yet to investigate in depth, but use some simple linguistic measures.

Defences: Words least associated with acquittal

mercy
picked
man
i
distress
carry
along
them
beg
stop
up
young

In linguistics, keywords are “items of unusual frequency in comparison with a reference corpus”. Here I compared the larger set of defence statements by defendants who were convicted with the defence statements by defendants who were acquitted.

Table above is the words least likely to be associated with acquittal – ie, the least successful defence statements…
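For readers unfamiliar with keyword analysis, here is a minimal sketch of one common approach, the log-likelihood keyness statistic. It is an assumption that this is comparable to the measure behind the table; the post does not say which statistic was used, and the function names and sample statements are invented for illustration.

```python
import math
from collections import Counter


def tokens(text):
    """Very crude tokeniser: lower-case and split on whitespace."""
    return text.lower().split()


def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in a corpus of c tokens and
    b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected count in the target corpus
    e2 = d * (a + b) / (c + d)  # expected count in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_texts, reference_texts, top=10):
    """Words over-represented in the target texts, strongest first."""
    target = Counter(t for s in target_texts for t in tokens(s))
    reference = Counter(t for s in reference_texts for t in tokens(s))
    c, d = sum(target.values()), sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference[word]
        # Keep only words relatively more frequent in the target corpus.
        if a / c > b / d:
            scored.append((log_likelihood(a, b, c, d), word))
    return [word for _, word in sorted(scored, reverse=True)[:top]]


# Invented toy statements, not taken from the Proceedings.
convicted = ["i beg for mercy", "i beg mercy", "i picked it up"]
acquitted = ["the witness is mistaken about the coat",
             "the witness never saw me"]
print(keywords(convicted, acquitted, top=3))
```

With the convicted statements as the target and the acquitted statements as the reference, the highest-scoring words are those least associated with acquittal.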

I want to highlight:

  • mercy + beg
  • picked (+ carry might be related)
  • i
  • distress

Remember that many defence statements were not really ‘defences’; they were more of an appeal to the judges’ clemency after sentencing (‘I beg for mercy’) or claiming extenuating circumstances (‘I was in distress’) in particular. Also playing down the offence – ‘I picked up the things’.

And in general many short bare statements beginning with “I” rather than more complex narratives.

Four hopeless short defences

So I picked four of the most frequent short (non-)defences that are heavily associated with convictions, to explore a bit further. (excludes use of any of these within longer defences)

defence | frequency | % convicted
nothing to say | 109 | 98.17
mercy | 125 | 98.40
picked up/found | 223 | 93.72
distress | 82 | 97.56

Main variants:

  • I have nothing to say
  • I beg for mercy/leave it to the mercy of the court/throw myself on the mercy of the court
  • I picked it (them) up/found it
  • I was in (great) distress/I was distressed/I did it through distress

The next four graphs show the percentage of defendant speakers who use each phrase in short defence statements in each decade.
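The calculation behind graphs like these can be sketched as below. Everything specific in the code is an assumption for illustration: the regular expressions are guesses based on the main variants listed above, `MAX_WORDS` is a hypothetical cut-off for a "short" defence, and the sample statements are invented. Decades are counted 1781-90, 1791-1800 and so on, to match the 1781-1880 range.

```python
import re
from collections import Counter

MAX_WORDS = 12  # hypothetical cut-off for a "short" defence statement

# Hypothetical patterns, loosely based on the main variants listed above.
PHRASES = {
    "nothing to say": re.compile(r"\bnothing to say\b"),
    "mercy": re.compile(r"\bmercy\b"),
    "picked up/found": re.compile(r"\b(picked (it|them) up|found (it|them))\b"),
    "distress": re.compile(r"\bdistress(ed)?\b"),
}


def decade_start(year):
    """First year of the decade, counting 1781-1790, 1791-1800, ..."""
    return (year - 1) // 10 * 10 + 1


def phrase_rates(defences):
    """Percentage of speaking defendants per decade using each phrase.

    `defences` is an iterable of (year, statement) pairs, one per
    defendant who spoke.
    """
    speakers = Counter()
    hits = Counter()
    for year, text in defences:
        decade = decade_start(year)
        speakers[decade] += 1
        lowered = text.lower()
        if len(lowered.split()) > MAX_WORDS:
            continue  # longer defences are excluded, as in the table
        for label, pattern in PHRASES.items():
            if pattern.search(lowered):
                hits[(decade, label)] += 1
    return {key: 100 * n / speakers[key[0]] for key, n in hits.items()}


# Invented sample data.
sample = [
    (1805, "I have nothing to say"),
    (1805, "I was at home all night, and the witness is mistaken"),
    (1812, "I was in great distress"),
    (1812, "I picked them up"),
]
rates = phrase_rates(sample)
```

The denominator is all defendants who spoke in the decade, not just those who made short statements, which matches the description of the graphs.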

I have nothing to say

This was very popular before 1810s – peaks at use by 4% of defendants who speak in decade 1801-10 and then rapidly disappears.

I beg for mercy/leave it to the mercy of the court

Slightly later popularity – slower decline after 1810s

I picked it up/found it

Less dramatic decline after 1820s.

I was in distress/did it through distress

Curious that this doesn’t appear at all in 1780s; peaks 1810s.

Conclusions

So there are variations in timing/speed of decline, but broadly, these hopeless ‘non-’defence statements, which are almost certain to be followed by conviction, are all declining in use and rarely heard in the courtroom after the 1820s. That fits, it seems to me, with both the gradual decline in defendant speech and the more rapid rise from the late 1820s of plea bargaining.

First, the defence lawyer option meant that defendants were better off finding the money for a lawyer who could try to undermine the prosecution case through aggressively examining witnesses. This was happening from the 1780s onwards.

And second, the plea bargaining option from the late 1820s meant that if defendants really had no viable defence, had been caught red-handed, they were better off pleading guilty in return for a less harsh punishment.

And so: for defendants who wanted to walk free or at least lessen their punishment, if not for later historians trying to hear their voices and understand what made them tick, silence was golden.

More stuff

Settlement and Removal: Poor Relief and Exclusion in 18th-century London

From the Act for the Relief of the Poor of 1662, or so-called “Settlement Act” onwards, various pieces of 17th- and 18th-century legislation formally codified entitlement to parochial poor relief by “settlement”. The main ways of gaining a settlement of your own were: completing a formally contracted apprenticeship; at least one year in continuous service; renting a house worth at least £10 a year; paying parish taxes or serving as a parish officer. Many people’s settlements, however, were ‘derived’: a married woman from her husband; children born in wedlock from their parents. But illegitimate children got their settlement from their place of birth. And a new settlement erased previous ones.

In theory, everyone in England and Wales in the 18th century ‘belonged’ to a parish, somewhere. Which was fine… as long as you had a settlement in a place where you actually wanted to be. But the flip side of settlement was removal: exclusion was key to the workings of a locally-based poor relief policy.

The case study: St Clement Danes

This paper is early work in progress based on sources digitised by the London Lives project, exploring the narratives of the poor in examinations and petitions and linking together records to trace larger patterns. The focus here is on one London Lives parish in the second half of the 18th century, St Clement Danes, a large urban parish to the west of the City of London, with a population of around 13,000 in 1801, of whom about 600 were receiving poor relief (costing the parish about £7000 p.a.). It was fairly well off, on average, with varied local trades and industries. But, as is often the case, averages hide considerable variation, with poor and rich living close together.

Overview of the data

I’m focusing on three sources from London Lives:

1) a dataset of settlement, bastardy and vagrancy examinations for St Clement Danes and another London Lives parish, St Botolph Aldgate, covering 1739-1800 (containing about 11000 exams in total). There were three main possible outcomes of a settlement examination: the examined was shown to have a settlement in the parish; their settlement was somewhere else but they could produce a settlement certificate guaranteeing that their own parish would relieve them, so they were allowed to stay; removal from the parish.

2) the second dataset is a Clement Danes register of removal orders (covering late 1752 to mid 1793; I chopped off the part-years at beginning and end for a convenient 40 year period).

Many archives have large numbers of surviving 18th-century pauper examinations, but records of removals are much less common. Linking the two means I can begin to examine more systematically the outcomes of examinations. For the period 1753-92, then, there are 5046 examinations and 2479 orders, of which 2357 could be linked to at least one exam. (Conversely, 2365 exams could be linked to at least one removal order.)
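To illustrate why the two linkage counts differ, here is a toy sketch of linking orders to examinations. The matching rule (same normalised name, order dated within a fixed window after the exam) and the `WINDOW_DAYS` value are assumptions for the example and not a description of how the real linkage was done; the records are invented.

```python
from datetime import date

WINDOW_DAYS = 30  # hypothetical: an order must follow an exam within this window


def norm(name):
    """Lower-case and collapse whitespace in a name."""
    return " ".join(name.lower().split())


def link(exams, orders, window=WINDOW_DAYS):
    """Return (order_id, exam_id) pairs; records are (id, name, date) tuples."""
    by_name = {}
    for exam_id, name, when in exams:
        by_name.setdefault(norm(name), []).append((exam_id, when))
    pairs = []
    for order_id, name, when in orders:
        for exam_id, exam_date in by_name.get(norm(name), []):
            if 0 <= (when - exam_date).days <= window:
                pairs.append((order_id, exam_id))
    return pairs


# Invented records: one person examined twice before a single order.
exams = [
    ("e1", "Jane Doe", date(1760, 3, 1)),
    ("e2", "Jane  doe", date(1760, 3, 10)),
    ("e3", "John Roe", date(1761, 5, 2)),
]
orders = [
    ("o1", "Jane Doe", date(1760, 3, 12)),
    ("o2", "John Roe", date(1763, 1, 1)),
]
pairs = link(exams, orders)
linked_orders = {order_id for order_id, _ in pairs}
linked_exams = {exam_id for _, exam_id in pairs}
print(len(linked_orders), len(linked_exams))
```

Because the links are many-to-many, the number of orders linked to at least one exam need not equal the number of exams linked to at least one order.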

3) And finally, I’ve linked these to petitions in the Sessions papers from parishes appealing removals from Clement Danes.

1: annual counts of examinations 1753-92, broken down by type of exam

Figure 1 shows the three types of examinations: settlement, bastardy and vagrancy. The vast majority of exams in this series were settlement exams (green); bastardy exams (red) account for about 10% of the total. The vagrancy category is a very thin blue line at the bottom of just a few years; the numbers were tiny. There will have been more vagrancy exams than this, but they were usually recorded separately, often on pre-printed forms (which makes me a bit curious about the few that do turn up in this series – why are they here at all?).

2: annual counts of removal orders linked to exams, 1753-92, by type of order

In Figure 2 we can see there are two types of removal order in the register: non-specific orders I’ll simply call ‘pauper removals’ and vagrant removals (sometimes called passes, as the removed were “passed” to their destinations). Most of the orders that couldn’t be linked to examinations were vagrant removals, again indicating that vagrant examinations were recorded separately. But this graph shows that a striking proportion of settlement exams ultimately resulted in vagrant removal orders, highlighting the fuzzy boundaries between the poor laws and vagrancy laws.

Generally, parishes had an incentive to do this because they had to foot the bill for pauper removals, while the county paid for vagrants to be removed. It looks suspiciously as though this was getting rather out of hand in the mid 1750s. In the spring of 1757 the bench at Middlesex Sessions was very concerned about the numbers of vagrants and costs of removals. In July they appointed a new contractor to handle the removal of vagrants and (in what looks to me at least very much like a slap on the wrist to negligent JPs) ordered that JPs were “not to sign any Vagrant Pass” without proof “that an Act of Vagrancy hath been committed”. There was a dramatic and immediate impact in St Clement Danes: of 75 vagrant removals in 1757, only 7 were dated after July.

Even so, vagrant removals continue to be quite conspicuous in comparison to examinations; so I want to look more closely at the settlement exams that wound up in vagrant removals to see if there’s any real justification beyond financial expediency. (The smaller increases in vagrant removals after 1757 do generally match years which have vagrant examinations, but they only partially correlate to the years with the largest numbers of exams.)

Overall, about 50% of settlement examinations led to removal orders. The main concern of bastardy exams was establishing paternity rather than settlement (though occasionally a single exam covers both topics), and so there are much lower linkage rates between these and the orders (about 20 of 500+ could be linked to orders). However, the women examined in bastardy exams often have later settlement exams as well, and so I have some more linkage work to do to establish whether more than the 20 were actually removed at some point.

3: gender in settlement/vagrancy exams, 1752-93

As in many other studies (and even after ignoring bastardy exams), by far the majority of examinants and the removed were women. They averaged 75% of the examined over the period, reflecting women’s vulnerability to poverty. However, women were slightly less likely to be removed than the male examinants, and it’s possible that there’s some correlation between the peaks in exams/removals and higher rates of male removal. The differences are not large; but I have some more number-crunching to do here.

Contesting exclusion

I want to look now at cases in which examinants returned to the parish after being removed by an order. Between 1753 and 1792, at least 122 examinants were removed more than once. This happened in two ways: first, examinants were returned after the receiving parish disputed the case; second, the examinant themselves might reject the magistrates’ authority and return of their own volition. [Links to documents for each of the cases mentioned are listed at the end of the post.]

The returned

Parishes to which paupers were removed had a right of appeal to Quarter Sessions (or to the Court of Aldermen in the City of London). Between 1750 and 1800, there are about 2300 petitions of this kind in the Middlesex, Westminster and City of London Sessions papers in London Lives. Individual parishes did not appeal many removals: the process was very expensive; and it’s argued that in London many parishes had informal agreements to accept paupers from each other (‘friendly passes’).

I’ve found 44 appeals against removals from Clement Danes between 1753 and 1792, all of which can be linked to examinations and/or removal orders. The petitions themselves are usually uninformative about the reasons for appeal (unlike many other petitions in the Sessions Papers). But the linked examinations can be more revealing.

In 1760, the parish of St Brides appealed against Clement Danes sending them an 8 year old girl, Mary Ives. Mary’s mother was dead and she’d been abandoned by her father James, whose settlement was unknown. Mary had been born in St Brides; but she was legitimate so that was irrelevant to her settlement. So it’s not surprising that their appeal was upheld and Clement Danes had to take Mary back.[1]

Or, in 1762, St Sepulchre’s appealed CD’s decision to send them Susanna Flood, the widow of Noah Flood, and their three children. According to the settlement examinations, Noah had only served 5 years of his Apprenticeship in St. Sepulchre and the final two years with a different master in Hornsey. Again, the appeal was successful. Within three months Susanna and her children had been dispatched to Hornsey instead.[2]

On the facts of the exam, sending Mary Ives to St Brides seems simply opportunistic; the JPs must have known perfectly well that there was nothing in the examination to support this course of action. The most charitable interpretation is that they had some reason to believe her father might turn out to have a settlement in St Brides, and so the recipients would not bother to appeal. Equally, I’m sceptical that the right course of action in a case like Noah Flood’s (though clearly not entirely straightforward) wasn’t well established and known to JPs by the 1760s. Both cases seem to suggest that getting rid of unwanted paupers as quickly as possible could take priority over establishing the facts of uncertain cases. And yet, if that were really the case, we might expect appeals to be rather more frequent than they actually were.

[Oops: on reading the Flood examinations again, I looked more carefully at the dates, and realised I castigated the JPs unfairly: Susanna’s first examination only mentioned the St Sepulchre apprenticeship and it wasn’t until she was examined again after the appeal that she completed the narrative.]

But one more caveat. Of the 44 linked petitions, 15 (34%) were in just one year, 1785. The mid- to late-1780s were busy years for examinations and removals in Clement Danes. The 1785 Sessions Papers are unusually full of parish petitions – but so are those for 1784, and that year’s files contain no appeals against Clement Danes at all. What is going on?! Survival rates of documents, including petitions, in the Sessions Papers are variable and uncertain, but this is a very curious anomaly.

The returners

In all this, the interests and desires of the paupers themselves are clearly the lowest priority of all. (As evidenced by the way in which parish officials were apparently quite happy to label significant numbers of them vagrants – a criminal offence, remember – in the 1750s, simply to save some money on removal costs.) But we can, sometimes, begin to trace something of what the poor wanted for themselves.

Some examinants gave accounts that investigation rapidly proved to be false. Thomas White’s claim in 1769 to have a settlement in CD based on 2 1/2 years service was “On Enquiry found… to be false [, the master] never having kept house a Twelvmonth in the Parish & the Examinant only an Earrand Boy for a little time”. If Thomas lied because he didn’t want to leave the parish, the tactic may have worked: there’s no sign of a removal order.[3]

Challenging the authority of the magistrates and law by returning after a removal order was a risky business; returners could be labelled as vagrants and subject to the harsher penalties of the vagrancy laws. Nonetheless, some returned several times over years or even decades.

Ann Brown, a single woman aged around 40 in 1755, had been a servant to a Mr Champ in Oxford for about 18 months during the mid 1740s. There was no doubt about her settlement: she gave almost exactly the same account to the CD magistrates four times between 1751 and 1757. The first occasion pre-dates the removals register but on each of the subsequent times they sent her back to Oxford as a vagrant. An order in 1755 describes her as “an incorrigible rogue”, which had a specific meaning in the vagrancy laws: it referred to repeat offenders who could be more harshly punished, from imprisonment with hard labour potentially up to transportation to the colonies. In practice this was rare, but Ann would surely have been warned it could happen. And yet she came back again two years later. And while most repeat returners came quite short distances from other London parishes, each time she had to cover a 50 mile journey from Oxford.[4]

On her first examination in April 1758, Mary Jenkins appears to be just one of the many women who were examined about their settlement because their husbands had recently died, gone away to military service, been imprisoned, or simply deserted them. Her husband Henry was at sea and they had 3 young sons. After their marriage, Henry had rented a house in St Olave Southwark at an annual rent of 11 guineas, so the CD JPs had Mary and her young sons removed there: a straightforward case. But Mary returned to CD four times, only to be removed again. Again her motives are unknown.[5]

In these cases it seems to me there must be some connection to the parish that would not be documented in settlement examinations, but whether I can trace records that might shed light on them, I don’t know. Tantalisingly, an Ann Brown was baptised in Clement Danes in 1713; unfortunately, it’s a common sort of name and there were quite a few Ann Browns born in Middlesex (let alone anywhere else) in a reasonable date range. Conversely, in what I’m fairly sure is Henry Jenkins’ and Mary’s marriage record, her maiden name is transcribed as “Rouffinee”, an apparently unique surname (this could be either a transcription error or an unusual spelling of, perhaps, an Irish name like Roughneen?).

Irregular unions and family breakup

And I want to close with a case highlighting the themes running through many examinations of marriage breakdown, ‘irregular’ unions and their implications, the potential for paupers to be excluded not only from parishes but from their own families.

Ann Threader was examined in February 1785. She had married John Threader ‘about 30 years ago’ at the Fleet (I think actually in 1750), and he deserted her just two months afterwards. She had never seen him again but had heard that he re-married, and that he had later died. A few years after he left her, she moved in with Jacob Wesley, a shoemaker, with whom she had three illegitimate children, aged between 9 and 14 at the time of the exam. Because Ann and Jacob had moved house during their relationship, their children had been born in two different parishes in Southwark. CD attempted to remove the children to those parishes, but both removals were successfully appealed at the next Middlesex Sessions.

This time the examination itself sheds no light on the grounds for appeal, but it has a marginal note that CD were ‘obliged’ to take the children because they had already ‘been passed to us some time back’. Whatever the reason, CD subsequently relieved the children, although they quickly had the two older children bound out as apprentices, and they also gave Ann occasional out-relief for some years.[6]

Following the failure of a marriage or long-term absence of a husband, cohabitation and (less often) bigamous re-marriage were both options to be found in settlement exams, and I want to explore this in more depth in the future. Just as with ‘regular’ marriages, the break-up of an ‘irregular’ union due to a partner’s death or departure could make the remaining family members vulnerable to exclusion. But with these unions, the settlement laws could in theory result in the break up of an entire family: the illegitimate children to the parishes of their birth, the mother and father to separate parishes altogether.

Future directions

Because this research is in early stages I don’t have substantial conclusions yet, so instead a few thoughts on future directions.

The first strand relates to the experiences of the poor themselves, and how settlement strategies could go awry. People – perhaps especially poor people! – didn’t always live the well-ordered lives imagined by settlement law and there were many potential sources of dispute. Young people might not complete apprenticeships or service, for a range of reasons. (Apprenticeships were long and might well start in one parish and finish in another because of a master’s house move, death, bankruptcy or abuse of apprentices.) In any case, young adults didn’t always stay put after gaining a settlement of their own; they might move to find work, or return to their childhood homes, but never manage to gain another settlement. Elderly widows or young orphans could end up being sent to parishes they had never even visited because their husbands or parents had worked or lived there many decades earlier. Young people could be separated from the rest of their family because they had been born before their parents’ marriage. I want to explore these experiences in more depth, and those of the poor who resisted exclusion.

Second, there’s the larger context of poor law and settlement practice. Anne Winter and Thijs Lambrecht have recently argued for the importance of investigating local experiences and variations in settlement practice, and I think Jeremy Boulton has brilliantly shown the value of detailed record linkage in a local case study, for St Martin in the Fields. In the late 18th century, Clement Danes had a reputation as a parish where migrants could go to claim poor relief without too much scrutiny by parish officials – a “casualty parish” (indeed, the best casualty parish!). I’m curious, among other things, how accurate that image was. (One thing I do know already is that Clement Danes removal rates were considerably higher than those in St Martins a few decades earlier in the century. The numbers of examinations in Clement Danes are also much higher than those in the St Botolph Aldgate records, though they had roughly similar size populations.) How consistent was practice in Clement Danes, and how did it match up to settlement law? In reality, how likely were widows or abandoned wives or illegitimate children to be despatched to far-off parishes? And how does it compare to other London parishes?

This is a slightly revised version of a paper delivered at Cultures of Exclusion in the Early Modern World, University of Warwick, May 2017.

London Lives documents

[1] Mary Ives

[2] Susanna Flood and family

[3] Thomas White

[4] Ann Brown

[5] Mary Jenkins and family

[6] Ann Threader and family

  • Examination, 1785
  • Removal orders
  • Petitions: St Georges Southwark and St Mary Overys
  • Apprentice register for Thomas and Hannah Threader:
  • Example of relief to family members (in the form of clothing)
  • (FindMyPast/FamilySearch) marriage of John Thredder to Ann Clark, 28 Feb 1750, London; FamilySearch also has a Fleet marriage record for John Thredder on the same date. Despite the date discrepancy, the match seems likely (35 years is a long time…)
  • (FMP/FamilySearch) marriage of John Thredder to Mary Poore, St Martin in the Fields, April 1763; burial of John Threader, St Martin in the Fields, 20 Jan 1772. But there is also a burial record for a John Threader in 1764 at St Ann Soho, so can’t be certain that the St Martins records are the right man.

Additional reading

London Lives: Poor Law

Zotero bibliography (work in progress)

Noted in particular:

  • Jeremy Boulton, “Double Deterrence: Settlement and Practice in London’s West End, 1725-1824”, in Migration, Settlement and Belonging in Europe, 1500–1930s: Comparative Perspectives, edited by Anne Winter and Steve King, 54–80. New York: Berghahn, 2013.
  • Norma Landau, “The Laws of Settlement and the Surveillance of Immigration in Eighteenth-Century Kent.” Continuity and Change 3, no. 03 (December 1988): 391. doi:10.1017/S026841600000429X.
  • A. Winter and T. Lambrecht, “Migration, Poor Relief and Local Autonomy: Settlement Policies in England and the Southern Low Countries in the Eighteenth Century.” Past & Present 218, no. 1 (February 1, 2013): 91–126. doi:10.1093/pastj/gts021.

Further information

London Lives Paupers and Petitioners project

All the data for this paper is shared under Creative Commons licences and can be downloaded:

(The removal orders dataset includes the data for the linkage between the three sources as used in this paper.)

What can you do with 10,000 petitions? Digging deeper into the data

The London Lives Petitions project is exploring approximately 10,000 petitions (and petitioning letters) addressed to magistrates, which survive in the voluminous records of the eighteenth-century London and Middlesex Sessions of the Peace. These records were digitised around 2008 by the London Lives project (of which I was the project manager). The petitions have been difficult to access within the existing London Lives online resource because of the sheer size and variety of the Sessions Papers documents. So, the first few months of the project focused on the challenge of discovering and identifying petitions in the Sessions Papers; the resulting data, consisting of structured metadata and plain text files, has been released as open data under a Creative Commons licence. (The bulk of this effort is complete, but work is ongoing to improve the data where possible.) The data and documentation of the process can be found here.

Moving on to analysis of this new data, I’m starting from the question: What can you do with 10,000 petitions? Can large-scale ‘distant reading’ techniques tell us things that we didn’t already know from close reading of smaller, personally-crafted collections of petitions? I’m experimenting with various methods and data visualisations. But I also need to consider: what can you not do with them? Understanding what doesn’t work for data like this will be important. For one thing, the quality of the transcriptions does not match up to traditional scholarly standards: is it good enough for data mining? (This and other limitations of the original data are documented on London Lives.) With this in mind, I’ve so far done a number of mostly boring but useful things:

  1. Processing with VARD2, a tool “designed to assist users of historical corpora in dealing with spelling variation”. This has not been intended to produce ‘better’ transcriptions (and it has probably introduced some errors along the way), but it has been very useful for dealing with common variants (eg “peticon”) and creating cleaner texts for analysis.
  2. Identifying and removing marginal annotations and other additions that were not part of the main body of the petition texts, and some purely formulaic elements (like “Middx SS” at the beginning of many documents).
  3. Breaking petitions up into their structural elements (which was important for my last post).
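As a rough illustration of the first two steps (not the code actually used for this project, and not a substitute for VARD2), here is a minimal Python sketch. The variant table contains only the “peticon” example mentioned above, the pattern for the “Middx SS” opening is an assumption about how that formula appears, and the function names and the sample text are invented for the example.

```python
import re

# Illustrative variant table: only the "peticon" example from the text.
# A real table would be much longer and come from a tool such as VARD2.
VARIANTS = {"peticon": "petition"}

# Assumed shape of the formulaic opening; real documents vary more than this.
OPENING = re.compile(r"^\s*Middx\.?\s+SS\.?\s*", re.IGNORECASE)


def strip_opening(text: str) -> str:
    """Remove a leading 'Middx SS' formula if one is present."""
    return OPENING.sub("", text, count=1)


def normalise(text: str) -> str:
    """Replace known spelling variants, keeping an initial capital."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = VARIANTS.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement

    return re.sub(r"[A-Za-z]+", swap, text)


def clean(text: str) -> str:
    """Apply both steps: strip the opening formula, then normalise spelling."""
    return normalise(strip_opening(text))


# Made-up sample text, for demonstration only.
print(clean("Middx SS The humble Peticon of John Smith"))
```

Keeping the two steps as separate functions makes it easy to check what each one has changed, which matters when normalisation may itself introduce errors.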

Additionally, as I’ve discussed in an earlier blog post, the survival of petitions (like other documents in the Sessions Papers) “could be haphazard and dependent on the preferences of individual clerks”. What is actually being counted? So, it’s necessary to put the petitions in their archival context. The Sessions Papers were loose papers relating to the work of the Sessions of the Peace (and Old Bailey from 1755), which could include petitions, examinations (from criminal, settlement or bastardy cases), calendars of prisoners and recognizances, copies of orders, lists of vagrants, coroners’ records (before they were split off into separate archives) and much besides.

London Lives Sessions Papers: image counts per year 1690-1799

The first chart above simply shows counts of the page images in the London Lives Sessions Papers, highlighting the very uneven survival of the records, especially the nine years from 1738 when very few files have survived, and many of those which did make it contain relatively few documents (or were not fit for filming). In spite of the fluctuations, however, it also indicates quite clearly the expansion of the courts’ business, especially the Middlesex Sessions (in blue), in the second half of the 18th century. (The Old Bailey series will be excluded from further analysis because it contains so few petitions.)

But while the Sessions Papers indicate ever growing business, petitions are on the decline (see also). This doesn’t necessarily mean there were fewer petitioners; it’s also possible that their petitions were less likely to be retained for long when there was so much more paper to deal with.

Petitions in Sessions Papers 1690-1799

There is also a discernible shift in certain characteristics of the petitions themselves. A large group of petitions came from parish officials concerned primarily with the administration of the poor laws – churchwardens and overseers of the poor – and I’ve been working to identify and separate these ‘parish’ petitions from petitions by individuals (and a few other institutions). Most of them relate to disputed pauper removals; smaller numbers are about poor rates assessments, negligent officials, or highway repairs. Before c.1720 these constitute no more than one-third of all petitions (in most years), but from c.1760 the figure is around two-thirds, and it’s clear that ‘other’ petitions account for most of the decline in numbers. In total, the parish petitions account for about 4600 petitions (c.46% of the total), of which about 4400 are concerned specifically with removed paupers.

parish petitions v others 1690-1799
parish petitions concerning disputed pauper removals, 1690-1799
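The proportions quoted above are simple to recompute once each petition has a year and a parish/other flag. The following Python sketch assumes the data is available as `(year, is_parish)` pairs; that layout and the function name are hypothetical, and the sample records in the usage note are invented rather than taken from the dataset.

```python
from collections import defaultdict


def parish_share_by_decade(records):
    """Return {decade: share of petitions flagged as parish petitions}.

    `records` is an iterable of (year, is_parish) pairs (assumed layout).
    """
    totals = defaultdict(lambda: [0, 0])  # decade -> [parish count, all count]
    for year, is_parish in records:
        decade = year // 10 * 10
        totals[decade][1] += 1
        if is_parish:
            totals[decade][0] += 1
    return {decade: parish / count
            for decade, (parish, count) in sorted(totals.items())}


# Overall share from the totals reported in this post: 4618 parish, 5406 other.
overall = 4618 / (4618 + 5406)
print(f"{overall:.1%}")
```

Calling `parish_share_by_decade([(1695, True), (1698, False), (1699, False)])` returns a single entry for the 1690s with a share of one-third. Grouping by decade rather than by year smooths over years in which very few files survive.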

Here’s a typical example of one of these petitions (1760). They are carefully legalistic, usually drawn up by a solicitor with close reference to the procedures of the laws of settlement:

The Humble petition and Appeal of the Churchwardens and Overseers of the poor of the parish of Bushton in the County of Northampton Sheweth That by Virtue of an Order under the Hands and Seals of… two of his Majestys Justices of the peace for the County of Middx… Alice Wilkinson (in the sd Order) called wife of Matthew Wilkinson (if Living) was removed and conveyed from the parish of St. Clement Danes in the said County of Middx to the said parish of Rushton as the place of the last Legal Settlement of the said Alice Wilkinson Your Petitioners Conceiving themselves aggrieved by the said Order of the said two Justices of the Peace humbly Appeal to this Court against the same…

This was the petition as the voice of early modern bureaucracy rather than ‘the people’. Comparison of average word counts for parish (pink bubbles) vs other petitions (blue) also points to the former’s highly standardised character. Overall, parish petitions are only slightly shorter than the rest, but they contain far fewer unique words.

parish petitions v other, average word counts

Total and unique word counts (using AntConc):

                               parish      other
petitions                        4618       5406
unique words                    12385      30923
total words                   1025567    1233165
average words per petition        222        228
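The figures in the table above came from AntConc, but the same kind of count can be sketched in a few lines of Python. This is an approximation under stated assumptions: the tokeniser here (lowercased runs of letters, with an optional internal apostrophe) is my own choice and will not match AntConc’s settings exactly, and the function names are invented for the example.

```python
import re
from collections import Counter

# Assumed tokeniser: lowercase alphabetic words, optional internal apostrophe.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Split a text into lowercase word tokens."""
    return TOKEN.findall(text.lower())


def corpus_stats(texts):
    """Count petitions, unique words, total words and the average length."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    n = len(texts)
    return {
        "petitions": n,
        "unique_words": len(counts),
        "total_words": total,
        "average_per_petition": round(total / n) if n else 0,
    }


# The averages in the table follow directly from its other rows.
print(round(1025567 / 4618), round(1233165 / 5406))  # prints: 222 228
```

Running `corpus_stats` separately on the parish and the other petition texts would give one column of the table each; differences in tokenisation mean the numbers should be close to, but not identical with, those from AntConc.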

Does a comparison of the most common words in the parish and other petitions offer any insights?

Wordle of top 100 words in parish petitions*
Wordle of top 100 words in other petitions*

Word clouds may be considered harmful by some, but I think that the contrasting appearance of the two word clouds visually enhances the more prosaic table rather well. The parish petitions use a smaller range of unique words, so their top 100 are relatively evenly sized and spaced. The ‘other’ petitions, by contrast, are dominated by a tiny number of formulaic words, after which frequency tails off much more quickly. [*note: a small number of very common words – eg ‘a’, ‘the’, ‘for’ – have been removed from the wordle data.]
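For anyone who prefers numbers to word clouds, the same comparison can be sketched in Python. The stopword set below contains only the three examples given in the note above (the list actually used was longer), the simple letters-only tokeniser is an assumption, and `head_share` is a hypothetical measure I am introducing here to put a figure on how quickly frequency tails off.

```python
import re
from collections import Counter

# Only the three examples named in the note; the real list was longer.
STOPWORDS = {"a", "the", "for"}


def top_words(texts, n=100, stopwords=STOPWORDS):
    """Return the n most frequent words, ignoring stopwords."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in stopwords)
    return counts.most_common(n)


def head_share(ranked, head=5):
    """Share of all occurrences in `ranked` taken by its first `head` words."""
    total = sum(count for _, count in ranked)
    if total == 0:
        return 0.0
    return sum(count for _, count in ranked[:head]) / total
```

A higher `head_share` for the top 100 words of one group would mean that a handful of words dominates its list, which is the pattern the ‘other’ word cloud suggests.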

Where next? I want to start exploring that diversity more closely. I’ll be experimenting further with corpus linguistics tools and with topic modelling. And you might have noticed the bubble chart comparing parish and other petitions suggests that non-parish petitions were not simply becoming fewer in number but also substantially longer as the 18th century went on. Might this suggest that it’s primarily petitioners of lower social status who are gradually disappearing over the course of the century, leaving primarily institutions (which generated relatively short, standard petitions) and higher status individuals (creating longer, more elaborate ones)? Whatever the answer, it’s clear that tracing changes in the petitions’ language and subjects is something that I need to be investigating further.