This is the second of a two-part series about the Westminster Coroners’ Inquests data. See part 1 for more detail about the source of the data, and my initial explorations of the summary data.
This post focuses more on the text of inquisitions (the formal legal record of the inquest’s findings and verdict). …
This dataset makes accessible the uniquely comprehensive records of vagrant removal from, through, and back to Middlesex, encompassing the details of some 14,789 removals (either forcibly or voluntarily) of people as vagrants between 1777 and 1786. It includes people ejected from London as vagrants, and those sent back to London from counties beyond
They’ve already written about this data in an excellent article (open access) and Crymble has blogged further about his ongoing research. (They have better visualisations too, so you could skip this post entirely and go to the real thing. Think of this as a taster.)
I want to focus on ways of visualising multiple categories of qualitative information – the more categories you want to compare at the same time, the more complex a dataviz has to be. In this case, I’ve got four categories to play with: gender, dates, countries of origin, and vagrant ‘types’. That’s to say, there are three types of individual in the dataset: leaders of family groups, their dependents, and single vagrants. The gender of the majority of dependents is unknown (most are children), so for most of this post, I decided to simplify things by filtering out all of the dependents to focus on the group leaders and singles. (As a result, because I’m ignoring about 500 wives who were counted as dependents, the following will differ somewhat from the work referenced above.) This resulted in 10963 individuals.
Overall, the gender ratio of the vagrants looks almost perfectly balanced (5438 female to 5525 male). But this hides some interesting variations.
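To put numbers on ‘almost perfectly balanced’, the shares can be worked out directly from the two counts quoted above. This is only a sketch of the arithmetic in Python, which is not necessarily the language used for the original graphs, and the variable names are illustrative.

```python
# Counts quoted in the text: group leaders and single vagrants only.
female = 5438
male = 5525
total = female + male  # 10963, matching the total given earlier

female_pct = 100 * female / total
male_pct = 100 * male / total

print(f"female: {female_pct:.1f}%, male: {male_pct:.1f}%")
# prints "female: 49.6%, male: 50.4%"
```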
Firstly let’s break it down by the year of the case. (There are some missing records, and the very small numbers in 1777 and 1779 in particular are due to these gaps.) Two things stand out: the numbers of both female and male vagrants rise rapidly in the mid-1780s; and women are in the majority each year until 1782, after which they’re overtaken by men.
Now looking at vagrant type. As soon as you have multiple categories, you can split up the data in different ways – the “best” can depend on the data and exactly what it is you want to show. So graph 3a compares the percentages of male and female vagrants for each vagrant type, whereas graph 3b shows the percentages of group and single for each gender. 3b highlights that the majority were single individuals – something you wouldn’t know at all from 3a. It also makes it clear that vagrant type was gendered – considerably more men than women were singles. 3a, on the other hand, is better if you want to know exactly what the proportions of men and women were in each type. Most often, if I had to pick just one of these, it’s likely that I’d plump for 3b, because I’ve already seen that overall there are very similar numbers of men and women. But it might be a harder choice if that weren’t the case.
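The difference between the two graphs comes down to which total each count is divided by. Here is a minimal sketch in Python, assuming a simple table of counts keyed by (gender, vagrant type). The counts are invented for illustration and are not the real vagrancy figures, and `percentages_within` is a hypothetical helper, not code from the original analysis.

```python
# Hypothetical counts, invented for illustration only (NOT the real data).
counts = {
    ("F", "group"): 40,
    ("F", "single"): 60,
    ("M", "group"): 10,
    ("M", "single"): 90,
}


def percentages_within(counts, by):
    """Express each cell as a percentage of its group total.

    by=0 groups by gender (the approach of graph 3b);
    by=1 groups by vagrant type (the approach of graph 3a).
    """
    totals = {}
    for key, n in counts.items():
        totals[key[by]] = totals.get(key[by], 0) + n
    return {key: 100 * n / totals[key[by]] for key, n in counts.items()}


within_gender = percentages_within(counts, by=0)  # group vs single, per gender
within_type = percentages_within(counts, by=1)    # female vs male, per type

for key in counts:
    print(key, f"{within_gender[key]:.0f}% of gender,",
          f"{within_type[key]:.0f}% of type")
```

With these made-up numbers, 60% of the women are single, yet women make up 80% of the group leaders: the same four counts give quite different impressions depending on the denominator.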
Now, looking at country of origin (British and Irish vagrants only, as there were only a few from other countries), further striking differences emerge. It’s hardly surprising that the majority of the vagrants came from England, but much more noteworthy that there was such a large disparity between Irish men and women.
Adam Crymble discusses what’s most likely going on, and it ties in with the particularly rapid increase in the numbers of male vagrants from 1783 shown in graph 1 – it’s probably the result of demobilisation after the American wars.
This says ‘demobilisation’ to me, and the male nature of most Irish vagrants suggests that this may have been a strategy for getting home after the war. Demobilisation was heavily centralized in London. Soldiers and sailors weren’t taken home; they were dropped off and left to find their own way.
Finally, I want to visualise the relationships between three categories in the data: gender, country and vagrant type. Mosaic plots are a more complex and less commonly used type of visualisation that can cram a lot more information into a single chart than you can with a bar chart. But, as with boxplots, that makes them a bit harder to interpret.
Imagine that you start with a single large rectangular block. For your first category, you divide it horizontally, and put the labels for each “level” (in this case there are two, F and M, for gender) on the left hand Y axis. As in the very first bar chart, we can see that the proportions of men and women are close to equal.
Then you sub-divide the two blocks vertically for your second category (country) and put the labels along the top X axis. So reading left to right along each gender block, the first vertical block = English, the second = Irish, third = Scottish and fourth = Welsh. Again, we can see that English vagrants are in the majority for both genders, and at the same time, how a much higher proportion of the men are Irish.
Finally you sub-divide the blocks once again, horizontally, for the third category (vagrant type), and the labels for these (group and single) go on the right hand Y axis. The biggest single category, then, is women from England who are single (Hitchcock et al argue for the importance of short-distance female migration to London to find domestic service in making up much of this). The smallest category is men from Wales who lead a group.
Male Irish and Welsh vagrants are more likely to be single than are men from England and Scotland, whereas a higher proportion of Irish and (even more so) Scottish women were heading groups. (Crymble has also emphasised how different the Irish and Scottish vagrants were.)
The use of colour and shading adds one final dimension, but it’s harder to interpret on first sight. The idea is to show statistical significance. What it boils down to is that blue means the square is bigger than would be expected by the statistical model; red means it’s smaller than the model would expect (and the darker the colour, the bigger the significance). The fact that the group-Irish-male box is coloured dark red (ie, smaller than “expected”) pretty much seems to reinforce what we’ve already observed. The group-Scottish-female box also stands out among the smaller blocks – suggesting that this is significant and might be further investigated.
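One common way of defining what a model “expects” is to assume that the categories are independent of one another, work out the count each cell would have under that assumption, and measure the gap with a Pearson residual. The exact model behind the shading isn’t specified here, so this is an assumption rather than a description of how the plot was produced. The sketch below is in Python, is simplified to two categories instead of three, and uses invented counts; `pearson_residuals` is a hypothetical helper.

```python
import math

# Hypothetical counts (rows: gender, columns: vagrant type), invented for
# illustration only. These are NOT the real vagrancy figures.
observed = {
    "F": {"group": 30, "single": 70},
    "M": {"group": 10, "single": 90},
}


def pearson_residuals(observed):
    """Return (observed - expected) / sqrt(expected) for every cell,
    where 'expected' assumes rows and columns are independent."""
    row_totals = {r: sum(cols.values()) for r, cols in observed.items()}
    col_totals = {}
    for cols in observed.values():
        for c, n in cols.items():
            col_totals[c] = col_totals.get(c, 0) + n
    grand_total = sum(row_totals.values())

    residuals = {}
    for r, cols in observed.items():
        for c, n in cols.items():
            expected = row_totals[r] * col_totals[c] / grand_total
            residuals[(r, c)] = (n - expected) / math.sqrt(expected)
    return residuals


residuals = pearson_residuals(observed)
for cell, value in residuals.items():
    # Positive: bigger than expected (the blue end of the scale);
    # negative: smaller than expected (the red end).
    print(cell, round(value, 2))
```

In this toy table, female group leaders come out above expectation and male group leaders below it by the same amount. The larger the absolute value, the darker the shading would be.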
However, it’s important to understand whether what the statistical model “expects” is appropriate for the data we have. In medical research, where data collection is conducted according to carefully defined rules, it may be possible to be confident that a statistical significance means a “real” difference. For a historian it might simply be pointing to imperfections in the data! So it’s essential for historians doing data analysis and visualisation to get to grips with both the original sources and the statistics. I’m still grappling with the second part…
For Women’s History Month 2018 I plan to make some visualisations of women’s/gender history datasets. I don’t know how often this will happen yet, but hope there’ll be at least one a week.
I’ll blog about them here and post code and data over at Github. The datavizes won’t necessarily be very sophisticated or original, but I’ll try to highlight different kinds of graph and how they can be used to explore patterns in data and start to formulate ideas and stories a historian might want to tell. Some of the datasets are likely to be closely related to my research interests and recent projects, but I hope to find new material from various sources and have some fun with stuff I don’t know much about!
There are likely to be two types of post:
* focusing on women
* comparison of women and men’s experiences
Today’s opening instalment is of the second kind. I’ve started with this one because it’s data I recently created and this is my first chance to do something with it!
Eighteenth-century inquests were usually held within a few days of a sudden, violent, accidental or unexplained death, at a local alehouse, parish workhouse or the location of the death itself. The dataset I’ve created contains a wealth of data that could be visualised, including locations, dates and verdicts. Here I’m just going to focus on a few comparisons of patterns for male and female deceased, using a range of graph types which, as you’ll see, highlight information in different ways. The dataset contains a total of 2894 inquests. A very small number concerned more than one person, or the gender was unknown or mixed; I’ve removed those (and for convenience I’m going to pretend that for the remaining 2891, 1 inquest = 1 person, though there might be a few exceptions). 361 were children.
The first graph shows annual counts of male and female inquests, showing clearly that there were considerably more inquests for male decedents than for female. It also shows that there was a lot of variation in the numbers of inquests from one year to the next.
A proportional stacked chart shows this clearly too; it also shows that the gender ratio doesn’t vary greatly from year to year, despite the fluctuating totals. This is… curious.
When we look at the inquests on children, the ratios appear to be rather more even but also less consistent (which might be expected with smaller numbers). I’d need to look more closely at the numbers with this one, but it’s clear that there is less gender disparity. Why?
(Charts like the last two are brilliant for showing proportions, but you have to bear in mind that they hide the variations in actual numbers.)
Here is another type of stacked chart. The first graph showed the counts for each gender independently; this stacks them on top of each other so you can see both the total numbers and the gender proportions.
Now I want to take a look at whether there are gendered patterns in inquest verdicts. I can break this down in two different ways. The first shows a breakdown of verdicts for each gender. The most obvious thing to note here is that accidental deaths account for a much higher proportion of the male deaths, which seems likely to reflect that men were more likely to work in a range of dangerous manual trades. Meanwhile, a higher proportion of women than men committed suicide, were victims of homicide, or died of natural causes. (Another time, I might add a third column showing the pattern for all verdicts.)
My final chart flips the breakdown around to compare the gender ratios for each type of verdict. This helps to give a different perspective on the gender differences.
Here it is again, but I’ve added a guideline showing the average male:female ratio (which is 72.6% to 27.4%). These two charts emphasise that compared to men and boys, women and girls are a) least likely to die in an accident and b) most likely to be victims of homicide – with nearly 40% of homicide verdicts, this is the one category that’s approaching gender parity. Considering that lethal violence was largely a masculine domain (both as offenders and victims), this is interesting – though we shouldn’t forget that homicides account for a small proportion of inquests.
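The guideline works as a baseline: each verdict’s female share is read against the overall female share. A minimal sketch of that comparison in Python follows. The verdict counts are invented for illustration (they are not the inquest figures, so the baseline here is not the 27.4% quoted above), and `female_share` is a hypothetical helper.

```python
# Hypothetical counts per verdict, invented for illustration only.
# These are NOT the real inquest figures.
verdict_counts = {
    "accident": {"F": 20, "M": 80},
    "homicide": {"F": 8, "M": 12},
    "suicide": {"F": 12, "M": 28},
}


def female_share(counts):
    """Female percentage of a single F/M pair of counts."""
    return 100 * counts["F"] / (counts["F"] + counts["M"])


# Overall female share across all verdicts: this is the guideline.
overall = {
    "F": sum(c["F"] for c in verdict_counts.values()),
    "M": sum(c["M"] for c in verdict_counts.values()),
}
baseline = female_share(overall)

# Percentage-point gap between each verdict and the baseline.
gap = {v: female_share(c) - baseline for v, c in verdict_counts.items()}

for verdict, points in gap.items():
    side = "above" if points > 0 else "below"
    print(f"{verdict}: {abs(points):.1f} points {side} the baseline")
```

Reading each bar against the overall ratio, not against 50%, is what makes a category like homicide stand out even when men are still the majority within it.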
So, based on these explorations I have a number of lines of enquiry to start investigating in more depth, including:
* why are men so much more likely to die in circumstances that lead to an inquest?
* the gendering of accidental deaths (will it turn out to be the case, as you might speculate, that men’s accidental deaths are more likely to be outside the home and in the context of waged work while women’s are more domestic?)
* the differences between children and adults (is there generally a less gendered profile for children?)
* I’d also look more closely at the annual fluctuations, and probably at seasonal variations as well