Sustainability, Web2.0 and a bibliography

The news about the RHS Bibliography of British and Irish history has unsurprisingly provoked considerable discussion and criticism. I want to follow up my last post with a few comments.

As some have already pointed out, the basic reason this is happening is that the funding structure for online resources in the UK (I don’t know about anywhere else) does not take into account the resources needed to maintain online material in the long term. Even if you never update or add to your resource after publication, you have to pay for hosting. Servers fall over from time to time and need human intervention to get them restarted. Databases can get mysteriously corrupted and need rescuing (you have to keep backups as well). You have to keep your database secure from the legions of spammers and vandals and their bots (may they rot in hell).

A bibliography, however, does also need to be regularly updated. And that’s only one problem.

Yes, technically you can scrape the RHS bibliography, extract all its data and re-publish it somewhere. (Bill Turkel has already provided instructions; it’s a doddle.) If you do that you’ll be breaking the Terms & Conditions and infringing copyright. You can try it if you want, but the new owners aren’t going to like it, and they’ll have more money than you for lawsuits. Do you want to take them on?

And I’m going to say this flat-out, without equivocation: there is no way that you could build an equivalent source from scratch using Web2.0 methods. I’m extremely doubtful that you could even keep it properly updated that way. Because we’re running right up against the limitations and weaknesses of Web2.0 and crowdsourcing here.

A major part of the value of the RHS bibliography is that it aims, however imperfectly, (a) to be comprehensive and (b) to use structured, systematic classifications. It’s not just a keyword search.

Now, my own recent experience with wikis is that people are pretty good at providing content but largely terrible at doing structure and order. And those are vital for an online bibliography.

Bibliographies are very complicated structurally. (This is why there aren’t that many web bibliography applications out there…) There are so many different types of publication you have to take into account: even the most basic (books, whether authored or edited, journal articles and book chapters) necessitate a pretty complex database structure. Take a look at the array of BibTeX entry types.
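To make that concrete, here is a minimal sketch in Python of what just those three basic types demand. The required-field lists loosely follow the standard BibTeX entry types as I understand them; the function name `missing_fields` and the sample records are made up purely for illustration, and a real bibliography would need many more types and fields than this.

```python
# Required fields per entry type, loosely following standard BibTeX conventions.
# Each inner tuple is a group of alternatives: at least one must be present
# (a book, for instance, needs an author OR an editor).
REQUIRED_FIELDS = {
    "book": [("author", "editor"), ("title",), ("publisher",), ("year",)],
    "article": [("author",), ("title",), ("journal",), ("year",)],
    "incollection": [("author",), ("title",), ("booktitle",), ("publisher",), ("year",)],
}


def missing_fields(entry_type, fields):
    """Return the required field groups that an entry fails to supply."""
    if entry_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown entry type: {entry_type}")
    missing = []
    for group in REQUIRED_FIELDS[entry_type]:
        # A group is satisfied if any one of its alternatives is non-empty.
        if not any(fields.get(name) for name in group):
            missing.append(" or ".join(group))
    return missing


# Hypothetical sample records, invented for this example.
edited_book = {
    "editor": "A. Editor",
    "title": "An Edited Collection",
    "publisher": "Example Press",
    "year": "2009",
}
chapter = {"author": "A. Author", "title": "A Chapter", "year": "2009"}

print(missing_fields("book", edited_book))     # nothing missing
print(missing_fields("incollection", chapter)) # lacks the host volume details
```

Even at this toy scale, a chapter can't be described without pointing at the book that contains it, and a book may have editors instead of authors. Multiply that by theses, reviews, series and reprints and the database structure gets complicated very quickly.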

(I’ve created online bibliographies using specialised bibliography tools [link dead] and customised mediawiki plugins [link very dead]. It’s not easy. Actually, it’s time-consuming and bloody hard work. I enjoy it, but I’m weird that way.)

Web2.0, crowdsourcing and folksonomic tagging can do a lot of things. But it’s all kind of haphazard and serendipitous. Dan Cohen and Roy Rosenzweig warned us about this in the context of collecting primary sources online, but it also applies here:

Collections created on the web through the submissions of scattered (and occasionally anonymous) contributors do have a very different character from traditional archives, for which provenance and selection criteria assume a greater role. Online collections tend to be less organized and more capricious in what they cover.

A capricious, disorganised bibliography is not very useful to scholars.

* * *

Well, that’s the pessimistic post. I’ll try to do a slightly more constructive practical one later with some ideas and resources…

8 thoughts on “Sustainability, Web2.0 and a bibliography”

  1. Don’t, dear all, say “What about Zotero?”

    I don’t know (yet) the answer. You could scope it out yourselves and think about how it would work, and let the rest of us know. Otherwise you’ll have to wait until I’ve got time to do it.

  2. Ta for this, Sharon. I see your points entirely. For me, the interesting thing is: can we combine a Web 2.0 and a File Card approach? I could for example:

    (1) Set up a Pledgebank bet ‘I will put all my refs online if 20 other historians who’ve been in the game for 10+ years also promise to.’
    (2) Librarything my thousand or so references (AKA – comprehensive bibliography from stuff that I have actually published) online
    (3) Dump, with all the other registered participants (.ac.uk; .edu only, in order to make quality control easier) the list onto a virtual uber-list somewhere.
    (4) [The difficult bit] we need to tag the data with the RHS standard pull-downs. I’ve done a bit of this (for the Urban History Bibliography) and it isn’t too time-consuming. Sometimes you need the text but you can do most of it from the title. To do this requires setting up an institution of a kind. Of the kind that historians donate hundreds of hours of their time to every year. Luckily, this institution would never need to meet in person and could make use of expertise from every corner of the world.

    Then there’s the question of what to do with the back catalogue. The RHS own (have sold) the information, so we would have to work out some way of building it up again from scratch.

    Here, academics are the main providers and the (main though not sole) users. We are also the filters. We don’t need these references or their tags to be anyone’s intellectual property. There is a cost to maintaining this, but it ought not to be prohibitive.

    Tim – I do think that the AHRC needs to consider supporting this resource. It’s giving a lot more value-added to historical research than almost anything else that costs the same.

  3. It seems to me that bibliography databases would be an excellent candidate for micropay subscriptions. I don’t want to pay dozens or hundreds of pounds (well, dollars, but I’m trying to get into the spirit) for access to something that I’ll actually only use a dozen times or so. Make it ten search queries for a pound, though, and very, very few people who need it won’t have access.

  4. Is there a toolkit for micropay which can be set up with less than 40 hours work and maintained with 1 hour a week?

  5. Dear Chris, on the RHS, the difficulty is that the cost of updating the bibliography runs into six figures, and continuing to bear that cost would effectively close the Society, which besides the bibliography also has commitments to providing grants (it is the largest provider of small research grants to postgraduate historians, and for conferences) and as a publisher (not to mention its role as a public advocate for history). I don’t want to pour cold water on anything, but I am with Sharon on this at the minute.

    I suspect that if we were starting from scratch the idea of pooling collected references might have legs, but the labour of ensuring consistency would be huge.

  6. Hi Tim,
    As far as I’m concerned, the big problem here is that payment for a resource is being ‘de-pooled’: the taxpayer will still be picking up most of the tab for this resource, but in an inefficient way. Extra resources will be needed to police the licenses, and this role is likely to soak up any income from non-taxpayer sources. And while the British Council is trying to spread British culture globally by making it more accessible, the AHRC are putting up a wall around it.

    I’m aware that there are massive cuts in all aspects of UK public services right now (20% round my way) – but is this really the best thing to cut? Me, I’d chop a research project a year instead.

    I would imagine that in order to minimise the negotiation costs we will end up doing a subscription deal via JISC in any case: so the main losers will be the public, who are, one way or another the ultimate funders.

  7. Dear Chris, I think the issue of ‘de-pooling’ is a real one, but that there is no good model out there at the minute. In some ways, the net promised cost-free pooling, but then let nation states pay for the delivery of the goods (which they did for the first ten years). This is one of those instances in which the basic contradiction in the assumptions of all the actors involved has come home to roost.

    Some states (Ireland, Sweden, Canada) are doing more to keep up the illusion of a public pool (perhaps a municipal Lido), but even these funding streams are looking ever more shaky, and everywhere the systems for delivery of public online resources are looking far from robust. Google and advertising models create the illusion of a ‘free’ resource, but it is an illusion.

    In the case of the RHS Bibliography (and several other similar resources) the issue is further complicated by the Society’s charitable trust status. In other words, the choice for the bibliography and the Society is not between state-funded and commercial, but between funding through a charitable resource (with the implications that has for other activities) and delivery of the same end product to 95% of its users through a commercial agreement.

    Is there a broader issue and crisis? Yes. Is this one way through a very difficult situation? Yes. Is it comfortable? No.
    Tim
