There is an interesting discussion going on over at the SHARP e-mail list about the differences between reading on paper and reading on a screen. The conclusion of most posters is that while we may not need a new word to describe reading on a screen – viewing? screening? diging? – there is nevertheless a difference between the two. Defining that difference, on the other hand, is a bit harder and is something many scholars are still thinking about.
I blogged about this a while ago, in the context of Early English Books Online (EEBO) and whether reading seventeenth-century pamphlets on screen can change how you read them. Prompted by the SHARP discussion, I’ve been doing some more thinking about this. It occurred to me that this has been an interpretative issue since before the creation of EEBO and other digital reproductions of sources. Microfilm versions of pamphlets also carry with them some of the same issues.
In the case of the Thomason Tracts, for example, a microfilm edition by University Microfilms International (UMI) has existed since 1977. This is probably how most scholars read them from then until a few years ago. Although you can print out pamphlets from both EEBO and from microfilm, both methods of access are primarily through a screen. So what are the similarities and differences between reading a pamphlet in its original format, via a microfilm reader, or through your computer monitor? And do the differences make any practical impact on how you absorb and understand the text?
My own answer is that I’m not completely sure, but I feel instinctively that there must be differences, which in turn must impact on the experience of reading. But I was worried that this instinct is more to do with the book historians I’ve been reading – for whom the importance of the reader is a pre-requisite – than anything that could be demonstrated empirically. So here are a few thoughts about how those differences might actually have a practical impact on reception.
One is colour. A bit obvious, perhaps, but microfilm often only reproduces texts in black and white. This is certainly the case with the Thomason Tracts, and in turn EEBO reproduces the microfilm edition of them so retains this monochrome reproduction. This can potentially blur the subtleties of early modern printing. Here for example are two images of the title page of John Milton’s Eikonoklastes (unfortunately I couldn’t find two versions of the same copy, although they are the same edition):


The notes page on EEBO does say that the title page is in red and black, and if you look closely you can distinguish in places where it must have been red. But it’s still very unclear. Why does this matter? One reason is in helping to distinguish between the impact of author and printer on the finished text. Was it the printer Matthew Simmons, or the author Milton, who decided to use red ink – which would have complicated the printing process significantly? Another reason is in thinking about the impact the text had on its readers. How would they have read the title page? Does it matter that the Greek letters are printed in a different colour, given that many readers would not have understood them? Does it matter that “Published by Authority” is in red, given that the severe Licensing Act the Rump Parliament had passed the month before publication had re-introduced pre-publication censorship? To answer these questions properly, you really need to look at the original edition.
Another is environment. The original Thomason Tracts have to be read in the British Library. Typically the microfilm version would also have to be read in a university library, unless you could persuade the librarian to run off copies. This imposes certain physical conditions, such as near-silence, the presence of other scholars, and the absence of other distractions. You can read EEBO at home in your dressing gown. I certainly work differently in libraries when I know I’m probably going to be there for most of the day, compared to at home where I might be snatching half an hour to have a look at something. Looking at EEBO, you also have the rest of the internet to distract you. You can imagine spotting things in one state that you might not in the other. One silly example of mine is searching late at night for something and forgetting that EEBO’s search engine doesn’t automatically include AND for strings of words. Two weeks later when I tried again at a more sensible hour I found what I was looking for. On the other hand, being able to read EEBO outside library hours does increase the time you have available to work on it. For time-limited projects like dissertations, this can make a big difference to the number of texts you are able to read or the amount of analysis you are able to devote to a text.
A third is searchability. Apart from wider short-title catalogues, the Thomason Tracts have been catalogued at least three times: once by Thomason himself, secondly by G.K. Fortescue in a two volume edition published in 1908, and thirdly by the UMI microfilm edition. Before EEBO, you were reliant on these indexes, compiled by someone else with limited search variables, to find what you were looking for. Now you can search not just for author and title but also for subjects and keywords. Fortescue also altered Thomason’s cataloguing order and sometimes gives his own dates. In turn Thomason’s dates are more idiosyncratic than used to be thought, and don’t necessarily mean the day the pamphlet was actually published. The UMI catalogue then restored Thomason’s cataloguing. Using EEBO lets you search by Thomason’s ordering, but also by your own. Inevitably this gives you much more freedom to navigate the collection and find new things. Particularly powerful is the gradual conversion to free text that EEBO are making of early modern pamphlets. This in particular is still a greatly untapped feature when it comes to identifying links between texts, making authorial attributions, and so on. But while such freedom has its benefits – making connections that would perhaps not have been possible otherwise – it can also have its drawbacks in terms of making mistaken connections, as the story about William Lilly in the latest edition of Early Modern Literary Studies makes clear.
There is also the fact that pamphlets are three-dimensional objects made of particular materials. Again it is almost banal to point it out, but microfilm and EEBO reproduce these objects in two dimensions. Here is a title page from the royalist newsbook Mercurius Elencticus, singled out by Jason McElligott in his study of the later royalist newsbooks as an example of one printed on particularly thin paper:

You can partly deduce this from the digital version by the fact that print from the other side of the page has leached through, but you can’t get any real sense of comparison with other issues or other titles. Again, why does this matter? Partly because paper quality can tell us something about the cost of the title – how much the printer was prepared to invest in it, how much it sold for – and something about the audience – who could afford it. But in the royalist newsbooks’ case it also relates to the fact that they were produced underground in opposition to a strident Parliamentarian censorship regime, with limited access to raw materials, and printers had to make do with what they could.
Then there is the issue of resolution. All three types of media are ultimately viewed with the naked eye, but there are various ways they are mediated before we see them. Original pamphlets can be zoomed in on with a magnifying glass. Microfilm and EEBO versions can be zoomed in on mechanically or digitally. The resolution at which EEBO reproduces pamphlets could be an issue here – they can get slightly pixellated if you are looking at them at a particularly high level of zoom. On the other hand, it’s much easier to zoom on a computer than it is by hand. A practical example of this is a pamphlet called The Perfect Politician about Oliver Cromwell, by a pseudonymous author. In his 1990 essay on Cromwell’s contemporaries, John Morrill identifies this as being by L.S.

It certainly does look like L.S. When you zoom in, though, it seems clear that it is probably by I.S. and that L.S. is a misreading because of the full stop merging into the I.

The pamphlet is probably by John [Iohn] Streater, a radical and veteran of the New Model Army. Knowing this puts the pamphlet in a very different context. So the ease with which type can be examined through EEBO – despite issues with resolution – may well have an important role in bibliographic analysis of texts that have otherwise been well-examined.
These are some initial thoughts about the differences between original sources, microfilm and digital reproductions. I’m sure you’ll have more – what do you think? But in closing it occurs to me that all three have an important similarity. One thing that original pamphlet, microfilm and EEBO all have in common is a relatively static bibliographical apparatus. They all still draw on Wing’s Short-Title Catalogue of Books Printed in England, Scotland, Ireland, Wales and British America and of English Books Printed in Other Countries 1641-1700. Some of the attributions in Wing can be dubious. The Perfect Politician is a good example of this. Here is what the information page in EEBO says:
Attributed to Henry Fletcher by Wing.
Sometimes attributed to William Raybould.
A quick look at the title page makes it obvious that Fletcher and Raybould are the booksellers, not the authors.

This misattribution is fairly easily sorted out. However there are others where it’s not so clear, or where recent scholarship has moved beyond Wing but EEBO doesn’t reference this. For me a great improvement to EEBO would be to give users the ability to set up an account with a real-life identity and let them annotate texts. You would know which scholars were working on something of interest to you; you would be able to flag where you disagreed with an attribution, giving reasons; and you could contact the person who’d made an annotation to ask them about any attributions you were unsure of. Until bibliographical catalogues go properly digital, there will remain this odd juxtaposition between digital texts and analogue descriptions.
A few bloggers have recently been posting their thoughts about two works on digital history:
- Susan Schreibman and Ray Siemens (eds.), A Companion to Digital Literary Studies, Oxford: Blackwell (2008).
- ‘Interchange: The Promise of Digital History’, Journal of American History, 95, 2 (2008).
Two points stood out in particular when I read them. The first was on how digital sources shape an audience’s experience of them. As Patrick Gallagher put it:
Oral and video histories, understood as artifacts, have become very important for bringing visitors closer to the reality of a story. (JAH, 109)
The discussion in the JAH was mostly in relation to the wider public accessing historical sources. But can digital sources also alter the reality we as scholars reconstruct from a source? This is not something really considered by the JAH discussion. By contrast, contributors to the Companion were very alive to this, with Bertrand Gervais asking:
Does a literary text retain the same status once it has become virtual? What is the status of any text in today’s era of hypertexts and linked computers? What type of materiality are we dealing with? What forms of reading, what forms of knowledge? (Companion, ch. 9)
The second was the training graduate historians will need to thrive in a world where digital history is commonplace. As Steven Mintz put it:
Many search committees are favorably impressed by graduate students who have developed online resources or an electronic portfolio. We have a responsibility to give our grad students the training support they need to meet these rising expectations. (JAH, 216).
Both points have made me think about how they apply to the digital source that, without a doubt, I use the most: Early English Books Online.
EEBO is a tremendous resource. It preserves sources that are fragile and which risk deterioration in the coming years. It greatly broadens the accessibility of early English printed works. It makes it far quicker to find and read texts. The power of its search engine makes it possible to carry out in minutes analysis that would previously have taken days – particularly with the increasing number of e-text transcriptions being produced by EEBO-TCP. As a part-time student, I would find it difficult to do my Masters without it.
But for anyone studying written communication in early modern England, EEBO brings with it its own historiographical and epistemological challenges.
First, the sheer convenience of EEBO might risk distorting our perception of early modern written communication. The last ten to fifteen years have seen a huge expansion in interest in print culture, particularly in cheap print. But work by a number of literary critics and historians – synthesised by Harold Love in his Scribal Publication in Seventeenth-Century England – has also reminded us of the importance that manuscript retained in English culture during the seventeenth century. So one question that the rise of digital history throws up is what impact might the relative availability of sources have on future critical and historiographical trends.
Secondly, digital reproduction of a text inevitably changes the way that we approach it. Texts cannot be fully understood without reference to those who wrote them, those who produced them, those who read them, and to the form that the texts took. As Joad Raymond has put it:
The meaning of a text is the transitory product of a particular relationship between a reader or group of readers within specific circumstances, who encounter not texts but books. In this creative encounter the material construction of a book, its typography, binding, the feel of the paper, the situation in which it is read, whether silent or out loud, in a library, a crowd or a secluded room; in youth or in age; patiently or urgently; in a cloistered or revolutionary world; all these play upon the meanings which a reader and a text can produce between them. (Raymond, The Invention of the Newspaper, pp. 2-3).
In reading physical copies of early modern pamphlets, we are already many steps removed from the experience of contemporaries reading them. We can perceive the range of meanings they might have carried only through a glass darkly. But does removing the physical, material interaction with a text further distance us from the ability to reconstruct those meanings? The quality of paper, the size of sheet used, the colour of the ink – all of these are factors which can influence how a text is read or perceived. Reading them on a screen today is inevitably a different experience to reading actual copies.
None of this is to diminish the importance of EEBO and other pioneers of digital early modern history. But it does make me wonder how best to assess the impact of digital history on early modern studies. It is likely that it will push historians in some directions rather than others. If so, it will be important that today’s generation of grad students are equipped not only with the right programming skills, but also with the right skills to engage with the implications of digital history for historical and critical theory.
I posted previously about being inspired by Digital Scholarship in the Humanities to mess about with word clouds. The same post also gave me the idea to try some text comparison tools.
TAPoR’s Comparator tool allows you to type in the URLs for two different pieces of text. It then compares the two, producing a word list showing whether words appear in both.
I tried it out with two texts in the 1641 pamphlet battle between John Taylor and Henry Walker that I’ve been looking at recently. Late in the summer of 1641, a text called The Irish Footman’s Poetry appeared by a third author – one George Richardson. The text referenced various previous pamphlets in the dispute. Although it appeared when Taylor was on a journey down to the south-west of England, it is often attributed to him. (No real George Richardson appears to have existed).
I ran Richardson’s text through the tool alongside one of Taylor’s pamphlets from the dispute. I had a hazy idea in my head that this could just possibly be a magic tool that could tell me the real author of a pseudonymous text.
Unfortunately it didn’t tell me very much. What it gives you is a list of words that occur in both texts, and the ratio with which they occur in both. In some cases I can imagine this being very useful – for example to trace the transmission of texts in cases where later works reference or draw upon previous works. In my case, though, the only words that emerged in common were everyday verbs like “do”.
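For anyone who wants to see roughly what this kind of comparison involves, here is a minimal Python sketch. It is not TAPoR’s code, and I am only assuming the general approach: count the words in each text, keep the ones that appear in both, and report how often each occurs relative to the length of its text. The function names and the two sample snippets are invented for illustration.

```python
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count each run of letters as one word."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def shared_words(text_a, text_b):
    """Map each word found in both texts to its relative frequency in each."""
    counts_a, counts_b = word_counts(text_a), word_counts(text_b)
    total_a, total_b = sum(counts_a.values()), sum(counts_b.values())
    return {
        word: (counts_a[word] / total_a, counts_b[word] / total_b)
        for word in counts_a.keys() & counts_b.keys()
    }


# Two invented snippets standing in for the pamphlets (not real quotations)
taylor = "I do protest I do not know the man"
richardson = "What I do is plain to see"

for word, (freq_a, freq_b) in sorted(shared_words(taylor, richardson).items()):
    print(word, round(freq_a, 3), round(freq_b, 3))
```

One caveat with seventeenth-century texts is that spelling is not standardised, so a simple match like this will treat variant spellings of the same word as different words unless you normalise them first.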
Then I tried doing two separate sets of more detailed analysis using the HyperPo tool. Here are the results for Taylor:
- Total words (tokens): 1813
- Unique words (types): 785
- Highest word frequency: 91
- Average word frequency: 2.31
- Standard Deviation of word frequencies: 5.07
- Average word length: 4.29
- Standard Deviation of word lengths: 2.11
- Number of sentences: 44
- Average words per sentence: 41.2
- Number of paragraphs: 17
- Average words per paragraph: 106.6
Here is the same analysis for Richardson:
- Total words (tokens): 1841
- Unique words (types): 726
- Highest word frequency: 86
- Average word frequency: 2.54
- Standard Deviation of word frequencies: 5.29
- Average word length: 4.35
- Standard Deviation of word lengths: 2.22
- Number of sentences: 95
- Average words per sentence: 19.4
- Number of paragraphs: 38
- Average words per paragraph: 48.4
Again not much stands out – in any case trying to look for similarities this way could be distorted if, for instance, the same author was deploying different literary styles in each text.
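The figures above look like simple ratios: 1813 tokens divided by 785 types gives the 2.31 average word frequency for Taylor, and 1813 words over 44 sentences gives 41.2 words per sentence. Here is a rough Python sketch of how similar summary statistics could be computed. It is not HyperPo’s own method: how it splits words and sentences, and whether it uses a population or sample standard deviation, are things I am assuming, so don’t expect identical numbers.

```python
import re
from collections import Counter
from statistics import mean, pstdev


def text_stats(text):
    """Summary statistics for a text, using deliberately crude tokenisation."""
    # Assumption: a word is any run of letters or apostrophes
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Assumption: a sentence ends at a full stop, exclamation or question mark
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    frequencies = list(counts.values())
    lengths = [len(word) for word in words]
    return {
        "tokens": len(words),
        "types": len(counts),
        "highest_word_frequency": max(frequencies),
        "average_word_frequency": len(words) / len(counts),
        "sd_word_frequency": pstdev(frequencies),  # population SD (assumed)
        "average_word_length": mean(lengths),
        "sd_word_length": pstdev(lengths),
        "sentences": len(sentences),
        "average_words_per_sentence": len(words) / len(sentences),
    }


# An invented sample sentence pair, just to show the shape of the output
sample = "The cat sat. The cat ran!"
for name, value in text_stats(sample).items():
    print(name, value)
```

Running the same function over two texts gives you two dictionaries to compare side by side, in the same way as the two lists above.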
So, TAPoR’s tools were fun to try out, but not much help in this particular case – a far better way to establish who the real George Richardson might have been is through a detailed contextual, bibliographic and stylistic analysis of the text. That said, I’d still recommend having a play about with TAPoR’s wide range of tools since you may well find something of use.
A very useful post the other day from Lisa Spiro at Digital Scholarship in the Humanities, covering two things:
- Using word clouds
- Text comparison tools
I’ve been messing around with both over the last couple of days. Below are some thoughts on uses of word clouds.
Word clouds are a useful visual representation of the frequency with which a word appears – the bigger the word in the cloud, the more it appears in the text. They’re often used for blogs to represent tags the blogger has used. I’ve got two in the sidebar on the right, one for the categories I sort my posts into and one for the tags I’ve used.
Word clouds aren’t horribly difficult things to learn how to program. I’ve been following Bill Turkel’s wiki on how to become a programming historian and have managed to make my own using Python. But if you want to cheat, Wordle offers you a much easier way. Just cut and paste your text into the website and it automatically generates a cloud for you. You can then customise it within a range of styles.
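To give a flavour of the counting step behind a word cloud, here is a minimal Python sketch. It is not the script I wrote from the wiki, nor how Wordle works internally; it simply assumes that each word’s font size is scaled in a straight line between a minimum and maximum according to how often it appears. The stopword list, the size limits and the sample text are all illustrative.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real one would be much longer
STOPWORDS = {"the", "and", "of", "a", "to", "in", "he", "his", "was"}


def cloud_sizes(text, top_n=10, min_size=10, max_size=48):
    """Map the most frequent words to font sizes scaled by their frequency."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    top = Counter(words).most_common(top_n)
    if not top:
        return {}
    highest, lowest = top[0][1], top[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return {
        word: min_size + (count - lowest) * (max_size - min_size) / span
        for word, count in top
    }


# An invented sample, not a quotation from any biography
sample = "Walker sold iron. Walker wrote news. Walker printed news."
for word, size in cloud_sizes(sample).items():
    print(word, size)
```

Actually drawing the cloud, with words packed together at those sizes, is the harder part, which is where a ready-made tool earns its keep.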
How is this useful for historians? Well, I’m in the early stages of planning my dissertation and one use I’ve found has been to refine my topic. There are two extremes in choosing a thesis: you can start with a small topic and work your way up to finding the overall themes it will address, or start with a big theme and work your way down. If you’re choosing the former, word clouds can be a very quick and helpful way of distilling out key concepts.
As an example, I’ve cut and pasted the text for Henry Walker – one of the Civil War journalists and pamphleteers I’m hoping to study in my dissertation – from the Dictionary of National Biography.
What can you glean from this? “Perfect” and “Occurences” occur quite a lot, naturally enough given Perfect Occurences was a newsbook he edited. But what about other titles he edited? They’re less prominent. Is this something significant about Walker’s legacy, or does it also tell us something about his biographer’s priorities? “Trade” and “apprenticeship” also spring out – again, significant given that Walker started life as an ironmonger and did not spend his whole career as a parliamentary hack. This is a context sometimes ignored in his life. “Hebrew” also comes out quite strongly. Walker was fluent in it, but what significance should we read into this – is it of importance for understanding his writing?
Let’s compare this text to the biography of Walker in the early 20th century Cambridge Companion to English Literature.
Perfect Occurences is nowhere to be seen. “Cromwell” and “Charles” loom much larger in the cloud. “Drogheda” also looks quite strong, something that doesn’t emerge in the DNB’s cloud.
These are just a few of the questions that occurred to me when I generated this cloud. They’ve all given me leads to follow up or do more thinking about, both in relation to Walker and the historiography surrounding him, and I was able to do it instantly without a detailed trawl through the text. Now in Walker’s case his biography is very short, and naturally you would go through it in detail anyway – but for much longer texts, I can see Wordle having even more potential. With the set of key words it generates, you can then go trawling through other resources such as JSTOR and the RHS bibliography, looking for additional relevant secondary works. It’s not a substitute for reading and analysing a text yourself in detail. But it does provide a very useful supplement, particularly if you are trying to summarise a text.
Next time I will give some details about the uses I’ve made of text comparison tools.
Bill Turkel and Alan MacEachern’s new book – The Programming Historian, in the form of a wiki – is now officially up. Bill was kind enough to let me be involved in peer reviewing it, and while I’m a programming novice I’ve found it very easy to pick up. Thoroughly recommended, do check it out. I’ll be blogging more about my experience of learning the basics of Python via the book’s tutorial once I have a bit more time and have finished moving house…





