Manuscripts from the sixteenth and seventeenth centuries are going digital with added features for users! The launch of a beta website for Early Modern Manuscripts Online next month will provide encoded transcriptions to accompany manuscript images and metadata. The number of transcriptions will be limited at first (a few hundred letters), but the EMMO corpus online will grow over time into a broad resource for research on a variety of manuscripts.
As with the vetting process for transcriptions, which I wrote about in a post over the summer, there are many steps in preparing transcriptions for the web and building the interface through which users will explore these texts. Nothing can replace the feeling of examining one of these centuries-old documents in your hands, but the website will have some definite advantages over the more traditional tactile experience.
What sort of advantages? Several, actually.
For an obvious yet important one, the website is accessible from almost anywhere (via internet browser), and at any time. Visitors need not make travel arrangements to the library in Washington DC nor come during normal operating hours. Capitol Hill is a picturesque neighborhood in which to wander, of course, and the Folger’s reading room is a singular space with an abundance of knowledgeable colleagues, but still, expediency may sometimes outweigh these factors.
Perhaps an even greater advantage lies in the enhanced readability features of the materials online. Images of rare manuscripts from the Folger’s collection have been available for some time in LUNA, our digital image repository, but the images by themselves have limitations. Many interested users face a significant hurdle since these documents were written in older scripts such as secretary hand that can be difficult to comprehend today unless one has specialized training. See a brief example below of an image with writing that may or may not be instantly clear.
After our hartie comendacions./ Whereas we are geven to vnderstand
that in your proceedinge with the Recusantes of that Countie of Surrey
Knowing the ropes of paleography will certainly deepen one’s experience of looking at these images, but not everyone has had an opportunity to pursue this subject. EMMO provides a way around such barriers even as it promotes the study of paleography. Like the transcription provided above, the interface will display the text of the manuscripts in a familiar, modern font. Furthermore, the transcriptions follow established semi-diplomatic conventions to help twenty-first-century readers read the words more easily. This means that a few small changes have been made between the text that appears on the manuscript and the transcription, such as expanding abbreviations that were common then but are less common now (e.g., wth becomes with) and replacing archaic brevigraphs with modern equivalents (e.g., the graph at the end of Recusantes). These minor expansions and changes will be displayed on the site in italics and have been encoded directly in the underlying XML (extensible markup language).
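For readers curious how encoded expansions might end up as italics on a page, here is a minimal sketch in Python. It rests on assumptions: the post does not show EMMO's actual markup for expansions, so the sketch uses the TEI P5 `<ex>` element (editorial expansion), and both the sample line and the `to_html` function are invented for illustration rather than taken from the project's code.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical markup: <ex> wraps the letters an editor supplied when
# expanding an abbreviation or brevigraph. EMMO's real encoding may differ.
SAMPLE = "<l>proceedinge w<ex>i</ex>th the Recusant<ex>es</ex></l>"


def to_html(elem):
    """Render an element's text as HTML, italicizing every <ex> expansion."""
    out = [escape(elem.text or "")]
    for child in elem:
        inner = to_html(child)
        if child.tag == "ex":
            inner = f"<i>{inner}</i>"
        out.append(inner)
        # Text that follows the child element belongs to the parent.
        out.append(escape(child.tail or ""))
    return "".join(out)


print(to_html(ET.fromstring(SAMPLE)))
```

The point of the design is that the transcription records what the editor supplied, and the display layer decides how to show it; italics are just one possible rendering.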
Another major advantage of the EMMO site is that the full text of the semi-diplomatic transcriptions will be searchable. As someone who has read and/or transcribed many, many manuscript pages in the Folger’s collection, I can attest that the content of these documents is frequently fascinating and can often lead to new insights or unexpected discoveries. However, when trying to find a particular reference quickly, one may appreciate keenly the ability to comb through a mountain of material in seconds.
Electronic searching methods are not perfect, however, and some difficulties remain. With our particular texts for EMMO, spelling presents a challenge. Semi-diplomatic transcription conventions, as mentioned earlier, make only minor changes to the text. The original spelling of words is preserved. To give a few common examples, this means that the semi-diplomatic transcription may have ‘loue’ instead of ‘love,’ ‘iudge’ instead of ‘judge,’ ‘yf’ instead of ‘if’ and ‘preeste’ instead of ‘priest.’ Let the repercussions of that sink in for a minute (and if you want to hear more about the idiosyncrasies of early modern spelling, see this post that appeared on our partner project’s site, Shakespeare’s World). Original spelling retains much of the character of these manuscripts, which is wonderful, but the lack of standardization could lead to problems for those entering search terms: a search for ‘love’ will not, by itself, find ‘loue.’
To mitigate the spelling issue, EMMO transcriptions will include a regularized version of many words for searches. Did we have to go through word by word for that? Sort of, but we had a little help. The VARD program is designed to deal with spelling variations in EModE texts.
The example above shows how the process works; variants are highlighted by the program with options based on a likelihood percentage. In the underlying XML of the transcription text (following TEI-P5 guidelines), it would appear something like this:
<choice> <reg>tower</reg> <orig>towre</orig> </choice>
So, if a user were to search the site’s online corpus for ‘tower of London’, the letter in the example above (L.b.669) would appear in the results listing, despite the non-standard spelling of ‘towre’. Caitlin Rizzo, encoding specialist for the project, has taken point on ‘VARDing’ EMMO transcriptions and creating options from VARD results in the XML. She says it takes about ten minutes to check through a page of transcription text. EMMO is happy to have her on board this year, working with Mike Poston, the Folger’s all-around Digital Merlin, as we fortify text for the project to give users an additional edge when searching. More features are planned in the coming months; for example, names and places that are spelled in unfamiliar ways will be regularized with an authority file, but that feature will be implemented on the website at a later stage.
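To make the ‘tower of London’ search concrete, here is a rough sketch in Python of how a search could consult both the regularized and the original reading of each word. It is an illustration under stated assumptions, not the EMMO site's implementation: only the `<choice>` for ‘towre’ comes from this post, the surrounding sentence is invented, the TEI namespace is left out for brevity, and `flatten`, `normalize` and `matches` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Invented fragment; only the <choice> element is taken from the post.
LETTER = (
    "<p>sent to the "
    "<choice> <reg>tower</reg> <orig>towre</orig> </choice>"
    " of London</p>"
)


def flatten(elem, prefer):
    """Return elem's text, keeping one reading ('reg' or 'orig') per <choice>."""
    other = "orig" if prefer == "reg" else "reg"
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            chosen = child.find(prefer)
            if chosen is None:
                # Fall back to the other reading if the preferred one is absent.
                chosen = child.find(other)
            if chosen is not None:
                parts.append(flatten(chosen, prefer))
        else:
            parts.append(flatten(child, prefer))
        parts.append(child.tail or "")
    return "".join(parts)


def normalize(text):
    """Lowercase and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def matches(query, xml_text):
    """True if the query occurs in the regularized or the original text."""
    root = ET.fromstring(xml_text)
    q = normalize(query)
    return any(q in normalize(flatten(root, layer)) for layer in ("reg", "orig"))


print(flatten(ET.fromstring(LETTER), "orig"))
print(matches("tower of London", LETTER))
```

Because both readings sit side by side in the XML, the original spelling is never lost; the regularized form simply gives the search a second way in.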
Indeed, with terms like VARD (VARiant Detector) and EModE (Early Modern English), not to mention EMMO, XML, and TEI (Text Encoding Initiative), one wonders whether humanists of the future will need help comprehending twenty-first-century spelling and culture.
Some of the other added features for the new site include permalinks to specific transcription files for easy sharing and navigation, plus the use of an IIIF (International Image Interoperability Framework) viewer to display images.
If you are interested in taking part in the beta test of the EMMO website, please contact me. The official launch of the EMMO website is planned for the spring of 2017. Watch for further announcements here and on other Folger channels.