Cleaned-up and slightly extended version of a paper presented at the conference Gender and Violence in the Early Modern World (University of Cambridge, 23 November 2019).
She had obtained unjust warrants against them, claiming to be afraid of “bodily harm”. This was “greatly astonishing” to the petitioners, who were “well known never to have disturbed her majesties peace” or threatened Anne herself.
Anne had come to Allys’s house early one morning and sneakily “convaye[d] her selffe into the house to doe some outrage upon” Allys, and finding her alone,
did assault and treade her the sayd Allys (beinge an aged woman) under feete and would her have murdred or otherwayes fouly intreated yf she hadd not bine prevented by [Margery] whoe hearinge the crye came imediatly…
This was “a matter soe shamfull and unnaturall, as the lyke by anie woman hath seeldome bine offred in anie [christian?] cuntrey or towne”. Further, Anne was a frequent disturber of the peace, causing many “unseemly” brawls and affrays, and upsetting the “best sort” of the town’s inhabitants.
As a result, Allys could “not be at peace within her owne house” and was “much affrayd” of further attacks; and so they prayed both to be released from Anne’s warrants against them and for the authorities to take action against Anne.
Some elements of the case are really unusual: the language – “shamfull and unnaturall… the lyke by anie woman hath seeldome bine offred” – as well as their demand for the magistrates to “brydle the outragousnesse of the sayd Anne Lingard”. There’s nothing quite like this in any of the other petitions.
Nonetheless it reflects a number of common themes in petition narratives by victims of violence:
a background context which includes malice and vexatious litigation, disordered behaviour (versus the quiet law-abiding victim);
at least one central, murderous, assault on weak, defenceless victims;
fear of further attacks and therefore the urgent importance of bringing the offender under control.
An extended version of my paper for the April 2019 workshop held by the AHRC Research Network on Petitions and Petitioning from the Medieval Period to the Present, on the theme Petitioning in Context: when and why do petitions matter?
The paper uses data from the London Lives Petitions Project to explore the decline in female petitioning and rise in petitions from institutions in 18th-century London.
This dataset makes accessible the uniquely comprehensive records of vagrant removal from, through, and back to Middlesex, encompassing the details of some 14,789 removals (either forcibly or voluntarily) of people as vagrants between 1777 and 1786. It includes people ejected from London as vagrants, and those sent back to London from counties beyond
They’ve already written about this data in an excellent article (open access) and Crymble has blogged further about his ongoing research. (They have better visualisations too, so you could skip this post entirely and go to the real thing. Think of this as a taster.)
I want to focus on ways of visualising multiple categories of qualitative information – the more categories you want to compare at the same time, the more complex a dataviz has to be. In this case, I’ve got four categories to play with: gender, dates, countries of origin, and vagrant ‘types’. That’s to say, there are three types of individual in the dataset: leaders of family groups, their dependents, and single vagrants. The gender of the majority of dependents is unknown (most are children), so for most of this post, I decided to simplify things by filtering out all of the dependents to focus on the group leaders and singles. (As a result, because I’m ignoring about 500 wives who were counted as dependents, the following will differ somewhat from the work referenced above.) This resulted in 10963 individuals.
Overall, the gender ratio of the vagrants looks almost perfectly balanced (5438 female to 5525 male). But this hides some interesting variations.
Figure 1: Bar chart comparing the numbers of male and female lead/single vagrants.
Firstly let’s break it down by the year of the case. (There are some missing records, and the very small numbers in 1777 and 1779 in particular are due to these gaps.) Two things stand out: the numbers of both female and male vagrants rise rapidly in the mid-1780s; and women are in the majority each year until 1782, after which they’re overtaken by men.
Figure 2: Bar chart showing numbers of male and female lead/single vagrants in each year.
Now looking at vagrant type. As soon as you have multiple categories, you can split up the data in different ways – the “best” can depend on the data and exactly what it is you want to show. So graph 3a compares the percentages of male and female vagrants for each vagrant type, whereas graph 3b shows the percentages of group and single for each gender. 3b highlights that the majority were single individuals – something you wouldn’t know at all from 3a. It also makes it clear that vagrant type was gendered – considerably more men than women were singles. 3a, on the other hand, is better if you want to know exactly what the proportions of men and women were in each type. Most often, if I had to pick just one of these, it’s likely that I’d plump for 3b, because I’ve already seen that overall there are very similar numbers of men and women. But it might be a harder choice if that weren’t the case.
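The choice between 3a and 3b is really a choice about which way to calculate percentages from the same table of counts. Here is a minimal sketch in Python, using invented counts (not the real vagrancy figures) and a hypothetical helper called `percentages`:

```python
# Hypothetical counts, invented for illustration only (not the vagrancy data).
counts = {
    ("F", "group"): 30, ("F", "single"): 70,
    ("M", "group"): 10, ("M", "single"): 90,
}

def percentages(counts, within):
    """Express each cell as a percentage of the total for one category.

    within=0 gives percentages within each gender (the logic of graph 3b);
    within=1 gives percentages within each vagrant type (the logic of graph 3a).
    """
    totals = {}
    for key, n in counts.items():
        totals[key[within]] = totals.get(key[within], 0) + n
    return {key: round(100 * n / totals[key[within]], 1) for key, n in counts.items()}

by_gender = percentages(counts, within=0)  # e.g. what share of women were single?
by_type = percentages(counts, within=1)    # e.g. what share of group leaders were women?
```

With these made-up numbers, 70% of the women are single, while 75% of the group leaders are women: two quite different stories from one table.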
Figure 3a: Stacked bars comparing the proportions of men and women for each vagrant type.
Figure 3b: Stacked bars comparing the proportions of vagrant types for each gender.
Now, looking at country of origin (British and Irish vagrants only, as there were only a few from other countries), further striking differences emerge. It’s hardly surprising that the majority of the vagrants came from England, but much more noteworthy that there was such a large disparity between Irish men and women.
Figure 4: Stacked bars comparing origin countries for men and women.
Adam Crymble discusses what’s most likely going on, and it ties in with the particularly rapid increase in the numbers of male vagrants from 1783 shown in graph 2 – it’s probably the result of demobilisation after the American wars.
This says ‘demobilisation’ to me, and the male nature of most Irish vagrants suggests that this may have been a strategy for getting home after the war. Demobilisation was heavily centralized in London. Soldiers and sailors weren’t taken home; they were dropped off and left to find their own way.
Finally, I want to visualise the relationships between three categories in the data: gender, country and vagrant type. Mosaic plots are a more complex and less commonly used type of visualisation that can cram a lot more information into a single chart than you can with a bar chart. But, as with boxplots, that makes them a bit harder to interpret.
Figure 6: Mosaic plot of lead/single vagrants’ gender, country and type
Imagine that you start with a single large rectangular block. For your first category, you divide it horizontally, and put the labels for each “level” (in this case there are two, F and M, for gender) on the left hand Y axis. As in the very first bar chart, we can see that the proportions of men and women are close to equal.
Then you sub-divide the two blocks vertically for your second category (country) and put the labels along the top X axis. So reading left to right along each gender block, the first vertical block = English, the second = Irish, third = Scottish and fourth = Welsh. Again, we can see that English vagrants are in the majority for both genders, and at the same time, how a much higher proportion of the men are Irish.
Finally you sub-divide the blocks once again, horizontally, for the third category (vagrant type), and the labels for these (group and single) go on the right hand Y axis. The biggest single category, then, is women from England who are single (Hitchcock et al argue for the importance of short-distance female migration to London to find domestic service in making up much of this). The smallest category is men from Wales who lead a group.
Male Irish and Welsh vagrants are more likely to be single than are men from England and Scotland, whereas a higher proportion of Irish and (even more so) Scottish women were heading groups. (Crymble has also emphasised how different the Irish and Scottish vagrants were.)
The use of colour and shading adds one final dimension, but it’s harder to interpret on first sight. The idea is to show statistical significance. What it boils down to is that blue means the square is bigger than would be expected by the statistical model; red means it’s smaller than the model would expect (and the darker the colour, the bigger the significance). The fact that the group-Irish-male box is coloured dark red (ie, smaller than “expected”) pretty much seems to reinforce what we’ve already observed. The group-Scottish-female box also stands out among the smaller blocks – suggesting that this is significant and might be further investigated.
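To make “bigger or smaller than the model would expect” a little more concrete, here is a simplified two-way sketch in Python. It assumes the model is plain independence between the categories and that the shading reflects Pearson residuals; both are common defaults for mosaic plots, but they are assumptions here, and the counts are invented rather than taken from the vagrancy data.

```python
import math

# Hypothetical counts, invented for illustration (not the vagrancy data).
table = {
    ("F", "group"): 30, ("F", "single"): 70,
    ("M", "group"): 10, ("M", "single"): 90,
}

def pearson_residuals(table):
    """Return (observed - expected) / sqrt(expected) for every cell,
    where 'expected' is the count implied by independence of rows and columns."""
    total = sum(table.values())
    row_totals, col_totals = {}, {}
    for (row, col), n in table.items():
        row_totals[row] = row_totals.get(row, 0) + n
        col_totals[col] = col_totals.get(col, 0) + n
    residuals = {}
    for (row, col), n in table.items():
        expected = row_totals[row] * col_totals[col] / total
        residuals[(row, col)] = (n - expected) / math.sqrt(expected)
    return residuals

residuals = pearson_residuals(table)
```

A positive residual corresponds to a “blue” cell (more cases than independence would predict) and a negative one to a “red” cell; the further from zero, the darker the shade.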
However, it’s important to understand whether what the statistical model “expects” is appropriate for the data we have. In medical research, where data collection is conducted according to carefully defined rules, it may be possible to be confident that a statistical significance means a “real” difference. For a historian it might simply be pointing to imperfections in the data! So it’s essential for historians doing data analysis and visualisation to get to grips with both the original sources and the statistics. I’m still grappling with the second part…
The petition of Geelien Cowley ‘a poore widdow and mother of three smale fatherlesse children’:
that your petitioners late husband by name E[dward] Birien of Ruthin a souldier that served in his majestys service in Ireland neare upon three yeares & afterward he retorned to England he served in his majestys service there sixe or seaven yeares where in all these tymes he suffered many ympriso[nments] wounds & brueses wch made him unable to earn his liveliehoode & more especiallie this two yeares last past then he was allowed one of the majestys pensioners to receave a share of his majestys allo[wance] for maymed souldiers provided. Nowe may it please [your] worships to be advertised that the said Edward Birien your petitioners late husband, had a longe sicknesse, beeinge vearie poore & nowe called to gods mercie caused your petitioner to goe upon the credit with her neighbours to suplie her said husbands wants in confidence to receave his share & alloweance of pension as afore is set forth, but it was gods will to take hime to his mercie afore this generall sessions.
Most humbly prayeinge your worships to allowe your petitioner the pencion allotted her late husband for to paye to her creditors what she is engaged for & your worships further help & succours in such sort as your worships thinke meete without your worships comisseracion hearein your petitioner shall not be able to goe amonge good & charitable people for releefe to her & her smale children for feare of arrest or lawsuite. this I humblie bege for gods sacke…
The treasurer of the maimed soldiers’ fund was ordered to pay her the whole quarterly allowance due to her husband.
[NLW Chirk Castle Quarter Sessions files October 1665 B21/d7]
I’ve recently been working on the Digital Panopticon, a digital history project that has brought together (and created) massive amounts of data about British prisoners and convicts in the long 19th century, including several datasets which include heights for women. Adult height is strongly influenced by environmental factors in childhood, one of the most important being nutrition. So,
The height of past populations can thus tell historians much about the conditions that individuals encountered in their formative years. Given sufficient data it is possible to glimpse inside households in order to piece together a history of the impact that declining wages, rising prices, improvements in sanitation and diminishing family size had on mean adult stature.
However, many studies of height and nutrition in 18th- and 19th-century Britain focused on military records and therefore had little to say about women. The turn to using the rich records of heights for men and women (and children) in 19th-century penal records has been more recent.
Today’s post is going to look at height patterns in four Digital Panopticon datasets, mainly using a kind of visualisation that many historians aren’t familiar with: box plots. If you’ve seen them and not really understood them, it’s OK – I didn’t have a clue until quite recently either! And so, I’ll start by attempting to explain what I learned before I move on to the actual data.
A box plot, or box and whisker plot, is a really concentrated way of visualising what statisticians call the “five-number summary” of a dataset: 1. the median average; 2. upper quartile (the value three-quarters of the way up the sorted data, ie the mid-point of the upper half); 3. lower quartile (the value a quarter of the way up, ie the mid-point of the lower half); 4. minimum value; and 5. maximum value.
Here’s a diagram:
The thick green middle bar marks the median value. The two blue lines parallel to that (aka “hinges”) show the upper and lower quartiles. The pink horizontal lines extending from the box are the whiskers. In this version of a box plot, the whiskers don’t necessarily extend right to the minimum and maximum values. Instead, they’re calculated to exclude outliers which are then plotted as individual dots beyond the end of the whiskers.
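One widely used convention for deciding what counts as an outlier is the “1.5 x IQR” rule: a whisker stops at the most extreme value lying within 1.5 times the interquartile range of its hinge, and anything further out is drawn as a dot. The sketch below assumes that rule and one particular way of calculating quartiles (software packages differ slightly); the heights are invented for illustration.

```python
import statistics

def whiskers_and_outliers(values):
    """Return (lower whisker end, upper whisker end, outliers) using 1.5 x IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    reach = 1.5 * (q3 - q1)  # how far a whisker may extend beyond its hinge
    inside = [v for v in values if q1 - reach <= v <= q3 + reach]
    outliers = [v for v in values if v < q1 - reach or v > q3 + reach]
    return min(inside), max(inside), outliers

# Hypothetical heights in inches, invented for illustration.
heights = [58, 59, 60, 60, 61, 61, 62, 62, 63, 64, 75]
low, high, outliers = whiskers_and_outliers(heights)
```

Here the whiskers run from 58 to 64 inches and the single 75-inch value is left outside them as an outlier.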
So what’s the point of all this? Imagine two datasets: one contains the values 4,4,4,4,4,4,4,4 and the other 1,3,3,4,4,4,6,7. The two datasets have the same averages, but the distribution of the values is very different. A boxplot is useful for looking more closely at such variations within a dataset, or for comparing different datasets, which might look pretty much the same if you only considered averages.
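The two example datasets can be checked with a few lines of Python. This is only a sketch: `five_number_summary` is a hypothetical helper, and the quartiles use one of several possible calculation methods, so other software may give slightly different values.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

flat = [4, 4, 4, 4, 4, 4, 4, 4]
varied = [1, 3, 3, 4, 4, 4, 6, 7]

# Both the mean and the median are 4 in each dataset...
same_mean = statistics.mean(flat) == statistics.mean(varied)
same_median = statistics.median(flat) == statistics.median(varied)

# ...but the five-number summaries are quite different.
flat_summary = five_number_summary(flat)      # (4, 4.0, 4.0, 4.0, 4)
varied_summary = five_number_summary(varied)  # (1, 3.0, 4.0, 4.5, 7)
```

The first dataset would be drawn as a single flat line, the second as a box from 3 to 4.5 with whiskers reaching towards 1 and 7.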
These are the four datasets:
HCR, Home Office Criminal Registers 1790-1801, prisoners held in Newgate awaiting trial (1226 heights total, 1061 aged over 19)
CIN, Convict Indents 1820-1853, convicts transported to Australia (17183 heights, 14181 over 19)
PLF, Female prison licences 1853-1884, female convicts sentenced to penal servitude (571 heights, 535 over 19)
RHC, Registers of Habitual Criminals 1881-1925, recidivists who were under police supervision following release from prison (12599 heights, 12118 over 19)
For each dataset, I only included women who had a year of birth, or whose year of birth could be calculated using an age and date, as well as a height. (I say “heights” above because I can’t guarantee that they are all unique individuals; but nearly all of them should be.) In all the following charts I’m including only adult women aged over 19.
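The selection rules described here can be sketched as follows. The field names and example records are invented for illustration and are not the Digital Panopticon’s actual data structure; note too that a birth year worked out from an age and a record date can easily be a year out.

```python
# Hypothetical records; the field names are assumptions, not the real schema.
records = [
    {"height_in": 61.5, "birth_year": 1801, "age": None, "record_year": 1830},
    {"height_in": 60.0, "birth_year": None, "age": 32, "record_year": 1835},
    {"height_in": 57.0, "birth_year": None, "age": 15, "record_year": 1835},
    {"height_in": None, "birth_year": 1810, "age": 25, "record_year": 1835},
    {"height_in": 62.0, "birth_year": None, "age": None, "record_year": 1840},
]

def prepare(record):
    """Return (birth_year, age, height) for a usable adult record, else None."""
    height = record["height_in"]
    year = record["record_year"]
    birth_year = record["birth_year"]
    age = record["age"]
    if birth_year is None and age is not None and year is not None:
        birth_year = year - age  # approximate year of birth
    if age is None and birth_year is not None and year is not None:
        age = year - birth_year  # approximate age when recorded
    if height is None or birth_year is None or age is None or age <= 19:
        return None  # no height, no birth year, or not an adult over 19
    return birth_year, age, height

usable = [row for row in map(prepare, records) if row is not None]
```

Of the five invented records, only the first two survive: the third is too young, the fourth has no height and the fifth has no way of establishing a year of birth.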
Here’s what happens when you plot the heights for each birth decade in RHC.
(This is generated using the R package ggplot2, and it looks a little bit different from many examples you’ll see online because ggplot has a nice feature to vary the width of the boxes according to the size of the data group.)
The first thing I look for is incongruities that might suggest problems with the data, and on the whole it looks good – the boxes are mostly quite symmetrical and none of the outliers is outside the realms of possibility (the tallest woman is 74.5 inches, or 6 foot 2 1/2, and the shortest is 48 inches), though I’m slightly doubtful that there were women born in the 1800s in this dataset, which gets going in the 1880s; still, they’re a very small number so unlikely to skew things much overall. Since the data seems to be OK on first sight, the interesting thing to note here is that from the 1850s onwards, the women are getting taller, and those born in the 1890s are quite a lot taller than the 1880s cohort. This is fairly consistent with Deb Oxley’s (more fine-grained) observations of the same data.
Here’s CIN:
Again, we have a reasonable spread of heights and fortunately a very small number of slightly questionable early births. (It happens to be the case that this data was manually transcribed, whereas RHC was created using Optical Character Recognition – but on the other hand, the source for RHC was printed and much more legible than the handwritten indents.) Ignoring for now the very small groups before the 1770s, the tallest decade cohort of women in this data is those born in the 1790s and thereafter they get consistently shorter.
Let’s put all four datasets together! (click on the image for a larger version)
I’ve filtered out women born before 1750 and after 1899, because the numbers were very small, and some extreme outliers (more about those later…). Then I added a guideline at the median for the 1820s (the mid-point), as I think it helps in seeing the trends.
It might seem surprising at first that the late 18th-century women of HCR are taller than any subsequent cohorts until the 1890s. Yet the trends here are broadly consistent with the pioneering research by Roderick Floud et al on British men and boys between 1740 and 1914. They argued “that the average heights of successive birth cohorts of British males increased between 1740 and 1840, fell back between 1840 and 1850, and increased once again from the 1850s onwards” (Harris, ‘Health, Height and History’). The British population was less well-fed for much of the 19th century (as food resources struggled to keep up with rapid population growth), and it got smaller as a result. Our women’s growth after 1850 may be slower than for the men (until the 1890s) though; perhaps it took longer for women than men to start growing again.
Finally, though, I have to put in a big caveat about the HCR data. I mentioned that I excluded some extreme outliers from the chart above. HCR was by far the worst offender, and if you look closely at the 18th-century cohorts covered by HCR, the boxes aren’t quite as symmetrical as the 19th-century ones. If we visualise it using a histogram (another handy one for examining the distribution of values in a dataset), we can see more clearly that there’s something up. A ‘normal’ height distribution in a population should look like a “bell curve” – quite tightly and symmetrically clustered around the average. CIN and RHC are close:
But this is what HCR looks like. This is not good.
If we’re lucky, much of the problem could turn out to be errors in the data which can be fixed. After all, it’s at least roughly the right kind of shape! The big spike at 60 inches (5 feet) rings plenty of alarm bells though. It looks reminiscent of a problem we have with much of the age data in the Digital Panopticon, known as “heaping”, a tendency to round ages to the nearest 0 or 5 (people often didn’t know their exact dates of birth). The age heaping is very mild in comparison to this spike, so I think it could well be another issue with either the transcription or the method used to extract heights. But if it turns out that’s not the case, this could be pretty problematic. We’re assuming the prisoners were properly measured, but we don’t know anything about the equipment used. For all we know, it might often have been largely guesswork. In the end, we might find that HCR simply isn’t reliable enough to use for demographic analysis. There’s very little height data for women born in the 18th century, so this is a potentially really important source. But what if it’s not up to the job?
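One rough way to put a number on a spike like this is to compare the count at the suspect value with the counts at the values either side of it. The sketch below is just that, a rough check rather than a formal test for heaping; `spike_ratio` is a hypothetical helper and the heights are invented.

```python
from collections import Counter

def spike_ratio(heights, target):
    """Count at `target` divided by the average count at the inches either side."""
    # Heights are binned to whole inches (Python rounds exact halves to the even number).
    counts = Counter(round(h) for h in heights)
    neighbours = (counts[target - 1] + counts[target + 1]) / 2
    return counts[target] / neighbours if neighbours else float("inf")

# Hypothetical heights in inches, invented for illustration.
heights = [58, 59, 59, 59, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61, 61, 62, 63]
ratio = spike_ratio(heights, 60)
```

In a smooth bell-shaped distribution the ratio near the peak should be not far above 1; in this made-up example the 60-inch bin holds three times as many values as its neighbours.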
H Maxwell-Stewart, K Inwood and M Cracknell, ‘Height, Crime and Colonial History’, Law, Crime and History (2015).
Deborah Oxley, David Meredith, and Sara Horrell, ‘Anthropometric measures of living standards and gender inequality in nineteenth-century Britain’, Local Population Studies (2007).
Today I want to go on an excursion in “catalogues as data”. The UK National Archives’ Discovery catalogue is an excellent resource for this activity, because a) it has a lot of records that have document descriptions at ‘item’ or ‘piece’ level in the catalogue, containing quite structured information (like dates, places, occupations) that can be quantified and visualised; and b) even more importantly, it has an export function that allows you to download up to 10,000 records in CSV format. (It also has a full API for those with some programming skills, but 10,000 records will get you a long way, and you can often break up larger collections into chunks, eg with date filters).
You’ll need to use the Discovery advanced search quite carefully to get the right set of search results (it enables specification of particular records, dates, catalogue level, etc) – there are some useful tips here. Then you’re quite likely to need to use a tool like OpenRefine to separate out pieces of information into separate data fields and clean/normalise dates etc (check out this tutorial).
the service records of more than 7,000 women who joined the Women’s Army Auxiliary Corps (WAAC) between 1917 and 1920… The WAAC became the QMAAC in April 1918 and was disbanded in September 1921
At 7000 records, this sounded like a good size set to play around with, well within the download limits. And a look at a catalogue entry showed that it has some nice information beyond women’s names (unlike a similar and larger series, WO399, which has only transcribed names). Given just a few hours work extracting and cleaning the data, what could I learn?
Record for: Aaron, Sarah Ann nee Phillips
Place of Birth: High Street Cefn Mawr, North Wales
Date of Birth: 22 August 1894
First, what does this actually offer in terms of usable data? The date of birth is an obvious one: closer inspection shows that it’s in a consistent format where there’s a full date (the majority); at least a year is provided in almost every case, and that can be extracted into a standard year of birth field quite easily. Place of birth also has potential, but it’s more varied and needs more cleaning, so I haven’t done anything with that yet; but it could make for an interesting mapping exercise. Less obviously perhaps, “nee Phillips” suggests that – if you can safely assume women always gave this information! – it’s possible to also infer something about whether a woman was (or had been) married. Another nice little thing you could also potentially do, given birth dates and first names, is to look for patterns in baby naming (although this might really need a larger dataset).
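As a sketch of how those fields might be pulled apart in code rather than in OpenRefine, the following Python uses regular expressions on the example entry. The function name and output fields are invented for illustration, and it assumes the “Surname, Forenames nee Maidenname” pattern holds throughout, which would need checking against the real export.

```python
import re

def parse_record(name, date_of_birth):
    """Extract surname, maiden name (if given) and year of birth from catalogue text."""
    maiden = None
    match = re.search(r"\bnee\s+(.+)$", name, flags=re.IGNORECASE)
    if match:
        maiden = match.group(1).strip()
        name = name[:match.start()].strip()  # drop the "nee ..." part
    surname = name.split(",")[0].strip()
    # Look for a plausible four-digit year anywhere in the date text.
    year_match = re.search(r"\b(1[89]\d{2})\b", date_of_birth)
    birth_year = int(year_match.group(1)) if year_match else None
    return {"surname": surname, "maiden_name": maiden, "birth_year": birth_year}

record = parse_record("Aaron, Sarah Ann nee Phillips", "22 August 1894")
```

A maiden name of `None` then stands for “no maiden name given”, which is only a proxy for “never married” if women always supplied this information.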
Two caveats, one major and one more minor:
The online guide makes it clear that these 7000 records are only a small minority of the original collection (57000 records), as many were destroyed in a WW2 air raid. So it might not be representative of the women recruited.
Errors in the data – which you always have to look out for, even in the best quality material. In this case, there were a few obvious transcription errors in the birth dates. We can be 100% certain that birth years of 1822, 1917-18 and 1988 are just wrong. But actually more problematic are outliers that look unlikely but not quite impossible: 1844? 1903? Fortunately, they account for a tiny number of records. There were also 278 recorded as numbers like 18880 or 18930: I concluded that these were actually meant to be year dates to which somehow an extra zero had been added and corrected them accordingly.
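The extra-zero correction described in the second caveat could be sketched like this. `repair_year` is a hypothetical helper, and it only handles that one specific error, leaving anything else it doesn’t recognise as missing rather than guessing.

```python
def repair_year(raw):
    """Return a four-digit year, dropping a stray trailing zero if that is the only problem."""
    text = str(raw).strip()
    if len(text) == 5 and text.endswith("0") and text[:4].isdigit():
        return int(text[:4])  # e.g. "18880" is read as 1888
    if len(text) == 4 and text.isdigit():
        return int(text)
    return None  # anything else is left for manual checking

repaired = [repair_year(value) for value in ["1894", "18880", "18930", "n/a"]]
```

Implausible but well-formed years, such as 1822 or 1988, pass straight through this step and still have to be dealt with separately.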
Visualisation is often particularly useful for highlighting errors and problems in your data. But it’s the researcher who has to decide what to do about such anomalies (and whether they might even be serious enough to make the whole dataset too unreliable to be worth using).
I initially hoped that the record dates would represent specific dates when women joined up, but as it turned out there was only a covering date for the series as a whole. Since it only covers 4 years, that’s not really an issue; instead I simply worked out their ages in 1918 (assuming that there wouldn’t have been new recruits after the war ended anyway), and filtered out the half-dozen supposedly born before 1860 or after 1903.
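The age calculation is simple arithmetic; a minimal sketch, with the cut-off years taken from the paragraph above and a hypothetical function name:

```python
def age_in_1918(birth_year):
    """Approximate age in 1918, or None for birth years filtered out as implausible."""
    if birth_year < 1860 or birth_year > 1903:
        return None
    return 1918 - birth_year

ages = [age_in_1918(year) for year in [1894, 1859, 1903, 1917]]
```

Because only the year of birth is used, each age could be a year too high for women whose birthday fell later in 1918.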
And so the thing I learned today is that, gosh, they were so young.
As visualisations, tables may be less eye-catching than graphs, but they have the virtue of presenting a lot of precise information in a relatively small space; the table at the bottom of this post shows that more than 60% of the women were aged 25 or under in 1918 and about 90% were under 30. Very few of them were old enough to take advantage of the limited extension of voting rights to women at the end of the war.
This is confirmed by a bit of background reading – according to Lucy Noakes on Women’s Mobilization for War (Great Britain and Ireland), “the majority of recruits to the WAAC were young working class women”. If we can reasonably assume that the information given about maiden names is a complete record, or anywhere near it, the vast majority of the women were also unmarried – nearly 95% of them overall. I suspect that very few married women would have volunteered for this type of service (which was likely to take them overseas and close to combat), and as a result it might be expected that the majority would be young – very likely younger, on average, than male soldiers. You can also see that a considerably higher proportion of the women aged over 25 were/had been married – but it still looks a very low proportion compared to what you might expect in the general population (and I wonder if quite a lot of these were widows).
I’m not exactly surprised to learn from Noakes that their youth (and, no doubt, class) resulted in some negative perceptions:
In the public mind however, they were sometimes perceived as thrill seekers, drawn by a desire for adventure and romance, and recruitment to the service suffered from fears that women were finding opportunities for sexual liaisons with the soldiers. So worried was the government by these rumours that a Commission of Enquiry was formed, which included figures showing the number of pregnancies amongst unmarried members of the WAAC was lower than among unmarried civilians…
The second instalment in this series of data visualisation posts for Women’s History Month 2018 looks at the World Bank World Development Indicators (WDI). This massive collection has data in several categories: demographic, education, work, poverty, health. It includes both country-level data and various aggregates by different criteria: geographical regions, income levels, etc. The UK Data Service has a useful guide as well as access to the data. You can also download it directly from the World Bank website (and it has an API which I haven’t tried), and there are tools like R packages.
A lot of the data is relevant to women’s and gender history, so much so that gender has its own data portal. I’ve selected just a handful of significant indicators with the most comprehensive coverage (life expectancy, fertility, education), and I’ve done two series of graphs: the first uses the World Bank income level groupings, and the second takes a selection of six countries (chosen because they are varied geographically, culturally, in terms of income and in their data patterns, and because they have good data coverage, not for any kind of representativeness).
I’m sure there are no surprises here for people who study global development, but for me at least it’s been an educational experience. There seems to be quite a lot of good news for women in this data. The bad news is the sheer levels of inequality between income regions in many of the indicators.
Life expectancy for men and women
Life expectancy is one of the most long-running series in the data; most countries have it from 1960 onwards. This is a ‘faceted’ graph comparing female and male life expectancy at birth in the five income groups (and the world as a whole). The familiar observation that women live longer than men is not just a “Western” phenomenon, although it appears that the wealthier the country, the bigger the gap. The level of the continuing gap between the richest and poorest countries is one thing that has not changed much.
The second graph uses the same format to look at the six countries, and at country level there is more variation – even dips or periods of stagnation that counter the general upward trends.
(Oops. I forgot to make nice labels for the y axes. That should be “years from birth”.)
This graph shows the data for the six countries in a different way. I think it’s a bit less clear in some respects, but it’s useful for comparing the countries at particular times, and how their trajectories vary.
Fertility rates
Fertility rates are also widely available from 1960. Women everywhere are having fewer babies than they were 60 years ago, but the most rapid falls have been in middle income countries. Again, the six countries show more variation.
The education of girls
The World Bank’s earliest education data starts a bit later than the demographic indicators, from 1970 onwards. The most notable feature of this is the convergence across every income group at primary school level, and secondary education is not far behind except in the poorest countries. (Watch out for the difference in the scale of the y axis, and the perspective, of the two charts…)
There are a few more gaps appearing in the country data, especially in the secondary school data. Gaps can be frustrating of course, but they’re important for highlighting questions I’d need to be asking about how the data was collected, and about the calculation of aggregates.
Primary education
Secondary education
Teachers
Mostly, employment data in the WDI doesn’t get going until the 1990s. But the education data does contain some information about the gender of primary school teachers. (One of the oddest gaps in the country data is that this isn’t recorded for Norway. The data for secondary school teachers is even bittier and I decided not to include it.)
I already knew that in Britain primary school teaching is a heavily female profession, but I didn’t quite realise how far this extends across high income countries generally. The global trend seems to be in the same direction, but much more slowly in the low income countries group.