The Folger is thrilled to share the news that we are the recipient of a generous three-year National Leadership Grant from the Institute of Museum and Library Services (IMLS) to create Early Modern Manuscripts Online (EMMO), an online searchable database of encoded semi-diplomatic transcriptions of all Folger manuscripts from the period 1500-1700.[1]

That’s the final product, anyway. Getting there is going to be quite an adventure for us, one that we plan to share with you on The Collation at regular intervals once we get up and running next year. We hope that EMMO will expand the textual landscape of early modern England, providing a corpus to explore on its own and to compare to other corpora of print works such as EEBO-TCP.

The most important goal of the project, since we are a library, is access. The transcription and encoding of manuscripts are as crucial to access as cataloging and digitization are. They might even be more so, since manuscripts are slippery and wide-ranging things that tend to defy simple categorization and that are often written in impenetrable hands.

An example of a difficult hand in a very interesting manuscript: a page from Richard Stonley’s diary, for June 18, 1581, in which he describes seeing a dwarf, an extremely tall person, and a baby with a huge head, all on the same day, at the Lord Mayor’s and at the Royal Exchange (Folger MS V.a.459, fol. 3v).

This is a problematic barrier. Most people simply can’t read secretary hand efficiently and accurately enough to be able to include unedited manuscript sources in their research. In contrast, printed sources are much easier to access and read online, and so researchers tend to rely more heavily on them, thereby obscuring the complexities of early modern England’s dual-text environment and hindering a full understanding of the period. We’d like to change that, by opening up our manuscripts to anyone who wants to read them, not just the folks who are trained in paleography.

That being said, paleographic training is more important than ever. EMMO will include an interactive paleography tutorial that will allow users to test themselves against any manuscript in the database, and the Folger Institute will continue to offer paleography training in a variety of forms. In fact, there’s one such course coming up this summer.

So how are the transcriptions going to become part of EMMO? Anyone who takes paleography at the Folger for the foreseeable future will contribute transcriptions to the database as a matter of course, and transcriptions will be gathered in a variety of other ways as well, including crowd-sourcing, transcribathons, two full-time grant-funded paleographers, and interns. We will start by transcribing (and vetting) the 22,000+ images of manuscripts that are already in our image database, which means lots and lots of family papers and letters, as well as diaries, poetical miscellanies, literary works, and commonplace books. As new images are added, the corpus will grow.

And what will people do with all of these transcriptions? Our texts will be both human-readable and machine-actionable, encoded in TEI P5 and searchable in both normalized and original spelling—the research possibilities are really quite endless. Deeper analysis will be available via an application programming interface (API), and the corpus will be expandable by other institutions who want to use our software and be part of a federated search.
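To make "searchable in both normalized and original spelling" a little more concrete, here is a minimal sketch of how paired spellings can be stored and searched. It assumes the TEI P5 `<choice>`, `<orig>`, and `<reg>` elements are used for the pairing; the sample sentence and the helper names `render` and `matches` are invented for illustration and are not taken from a Folger manuscript or from EMMO's actual design.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Invented sample: each <choice> pairs an original spelling with a regularized one.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">I '
    "<choice><orig>haue</orig><reg>have</reg></choice> "
    "<choice><orig>receaued</orig><reg>received</reg></choice> your "
    "<choice><orig>lettre</orig><reg>letter</reg></choice></p>"
)


def render(elem, keep):
    """Flatten elem to plain text, keeping only <orig> or <reg> inside each <choice>."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == f"{{{TEI_NS}}}choice":
            picked = child.find(f"tei:{keep}", NS)
            if picked is not None:
                parts.append(render(picked, keep))
        else:
            parts.append(render(child, keep))
        # Text that follows the child element belongs to the parent's text flow.
        parts.append(child.tail or "")
    return "".join(parts)


def matches(elem, query):
    """Hypothetical search helper: hit if the query occurs in either spelling layer."""
    q = query.lower()
    return q in render(elem, "orig").lower() or q in render(elem, "reg").lower()


root = ET.fromstring(SAMPLE)
original_text = render(root, "orig")
normalized_text = render(root, "reg")
```

With this arrangement, a search for "received" and a search for "receaued" both find the same passage, while the display can show whichever layer the reader prefers.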

The grant officially begins on December 1. We hope to hire an EMMO project manager as soon as possible (a job posting will appear here and elsewhere in the next few days; update 11/27: the job advertisement is now up), and midway through next year, we’ll be hiring two project paleographers. We are developing a survey to find out what users and contributors would most like to see in EMMO, and will be putting out a call for beta-testers next year. Look for mini-updates on the Folger Research Twitter feed, @FolgerResearch, with the hashtag #folgeremmo.


  1. We won’t finish transcribing all of them in three years, but we plan to transcribe as many as we possibly can, and are fully committed to continuing the work as long as it takes.

Author: Heather Wolfe

HEATHER WOLFE is Curator of Manuscripts at the Folger Shakespeare Library, and teaches early modern English paleography for the Folger Institute and Rare Book School.


  1. What a wonderful development! I do hope the mini-updates will be available elsewhere than just on Twitter. Not everyone loves Twitter. For those of us who teach paleography, access to the images as well as to the transcriptions will be of key importance.

    • We will certainly be including updates from the EMMO team here on The Collation as well as on Twitter! (And, for what it’s worth, our tweets are also visible on the sidebar of The Collation’s website.)

  2. This is so exciting! Congratulations on winning the grant for this important work, and thank you on behalf of all future EM English archive researchers. (Also, love the example you chose!)

  3. This is a very important project but I do have one concern. As you know, the transcription of handwriting, especially Secretary hand, is never absolutely cut-and-dried and so the same passage may be rendered slightly differently by two different transcribers. Since, as you say, a large number of the users of this site will not be very expert at reading early hands and will, therefore, have to rely heavily on the transcriptions, my fear is that the transcriptions will become THE rendering of passages simply because they are in type and not hand. I think this is a problem. Has any thought been given to how to deal with it? I’m not sure it can be dealt with, and so your transcribers bear an even heavier responsibility than they might initially have imagined.

    • This is a big concern for us as well. A couple of points: All transcriptions will be reviewed and approved (or flagged as “not reviewed” if that is the case), and will always be viewable alongside digital images of the original manuscripts. If people notice mistakes, there will be a mechanism for contacting us and suggesting emendations. Of course there will be occasional slight differences in interpretation since transcription is an art, not a science. Also, we don’t want the transcriptions to replace consulting the original, or to replace the creation of scholarly editions.

  4. That’s awesome news!! I will wait patiently for the next three years. This kind of resource will be so important for my research. Congrats on the grant!

  5. I share Prof Williams’s concern. I’m dealing even today with a proper name that can be read in either of two ways. A transcription should accommodate such ambiguities, and not become (as Prof Williams says) THE transcription.

    • Yes, it is not our intention to smooth over ambiguities, since ambiguities are interesting and important. We aim to create highly accurate, uniformly consistent, and trustworthy semi-diplomatic transcriptions, while realizing that transcription can be a highly personal and subjective process, and that everyone tends to transcribe a little differently.

      Names and most words will be normalized for searching purposes, so regardless of how the name is spelled in the manuscript or appears in the transcription, you should be able to find it using the modern, authorized spelling, at which point you can decide for yourself how it is spelled in the manuscript if the transcription seems off in your opinion.

  6. Just wanted to chime in to say that one of the great things about TEI encoding is that it’s possible to indicate uncertainty. A word or phrase that can be read more than one way can be flagged.

    I don’t know how it will end up looking in EMMO, but theoretically, you could press a button and have all uncertain words turn purple. Alternative readings can be indicated, as can degree of certainty, and who said so (again, not sure how far EMMO would go with this, but in theory, you could press a button and see only words that Transcriber X was less than 70% sure about turn purple; the danger, of course, is that Transcriber X could spend so much time encoding every little uncertainty that it takes weeks just to do one page).
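As a rough sketch of the flagging idea in the last comment: TEI P5 provides an `<unclear>` element whose `cert` attribute can record how confident the transcriber is and whose `resp` attribute can point to who made the call. The sample line, the transcriber identifiers, and the helper name `uncertain_readings` below are all invented for illustration, and nothing here reflects how EMMO will actually encode or display uncertainty.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented sample: two doubtful readings, each with a confidence level and a
# pointer to the (hypothetical) transcriber responsible for it.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">paid to Mr '
    '<unclear cert="low" resp="#transcriberX">Harman</unclear> for '
    '<unclear cert="high" resp="#transcriberY">wyne</unclear></p>'
)


def uncertain_readings(elem, levels=("low", "medium")):
    """Return (text, cert, resp) for each <unclear> whose cert is in levels."""
    found = []
    for unclear in elem.iter(f"{{{TEI_NS}}}unclear"):
        if unclear.get("cert") in levels:
            text = "".join(unclear.itertext())
            found.append((text, unclear.get("cert"), unclear.get("resp")))
    return found


root = ET.fromstring(SAMPLE)
doubtful = uncertain_readings(root)
confident = uncertain_readings(root, levels=("high",))
```

A display layer could then colour only the readings returned for the chosen confidence levels, which is the "press a button and turn them purple" scenario described above.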

