The Collation

Research and Exploration at the Folger

Folger Tooltips: Getting raw Hamnet data

Non-librarians out there, have you ever clicked the “MARC View” or “Staff view” link in an online catalog record? In Hamnet, the Folger’s online catalog, it’s the third choice at the top of each record.

Image of MARC View button

I vividly remember the first time I did. It was back when I was building a relational database to manage my dissertation research (and back when I thought I wanted to be an Art History professor). I’d been carefully recording information about early modern printers’ names, printing locations, dates, and subject matter in different fields in my database, and was incensed to discover that the information had already been broken down in the library catalog, but “they” were hiding it from me! What displayed as:

London : Printed for Robt. Sayer, 1753.

was, underneath:

‡a London : ‡b Printed for Robt. Sayer, ‡c 1753.

Three tidy chunks of information. Or so I thought. I soon learned that transcription rules made the imprint field tricky, but that still left plenty of controlled vocabulary, standardized codes, and normalized information elsewhere in the record, if you knew where to look. The Library of Congress’s Understanding MARC Bibliographic provides a useful, if somewhat dated, overview (for instance, the “recently approved changes” mentioned in Part III are about 15 years old now).

To actually do anything with the MARC data, of course, you’ll need to get it out of Hamnet. This can be done using the “Print/Save” button at the bottom of your search results. For example, if you want all the Hamnet records for material formerly owned by 18th-century bibliographer William Herbert:

  • Do a search that brings up all the records you want (in this case, I did a name browse on “Herbert, William” and clicked the link that has him as “former owner”)
  • If your search brings up more than 50 records (the default), you’ll need to force the system to display all results on one page.

Search results: displaying 1 through 50 of 208 entries

  • To display all results on one page, go to the address window in your browser and navigate through the long string of characters. Somewhere in there, you’ll find “CNT=50” (which is what limits the results count to 50 per page). Change “50” to a number greater than or equal to the number of results. Press “Enter” and wait for the screen to redraw (speed depends on how many entries it needs to pull in).

CNT=50 changed to CNT=500

  • At the bottom of the screen, select “All on page” and “Raw MARC” then click “Print/Save”

Save as Raw MARC dialog

  • Go to the saved file and change “.cgi” to “.mrc” (not strictly necessary, since it already is a MARC file, but “.mrc” is the file extension expected by MarcEdit, one of the most popular free tools for working with MARC records).
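If you’d rather not hand-edit the address bar, the “CNT” change and the file rename can both be scripted. What follows is only a minimal Python sketch: the URL is made up for illustration (the only piece taken from Hamnet is “CNT=50”), and `download.cgi` is a hypothetical name for the saved file.

```python
import re
from pathlib import Path


def set_result_count(url: str, count: int) -> str:
    """Swap the number after CNT= and leave the rest of the URL untouched."""
    return re.sub(r"\bCNT=\d+", f"CNT={count}", url)


def as_mrc(path) -> Path:
    """Return the same file name with its extension changed to .mrc."""
    return Path(path).with_suffix(".mrc")


# Hypothetical search URL; a real one will be longer and look different.
example_url = "https://catalog.example.org/search?Search_Arg=herbert&CNT=50&HIST=1"
print(set_result_count(example_url, 500))

# Hypothetical saved file name. To rename a file that exists on disk:
#   saved = Path("download.cgi"); saved.rename(as_mrc(saved))
print(as_mrc("download.cgi"))
```

The regular expression only touches the digits after “CNT=”, so whatever else is in that long string of characters passes through unchanged.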

Congratulations! You now have your own file of MARC records. Unfortunately, raw MARC isn’t particularly useful.

Image of block of undifferentiated numbers and letters

You need to use a program like Terry Reese’s free MarcEdit to convert it into something else, such as slightly more eye-readable text:

Image of slightly differentiated numbers and words

Or machine-actionable XML:

Image of sample MARC XML

Or a tab-delimited file containing just the field(s) you want to target, which you can then import into a spreadsheet, like this table of publication dates for Folger titles owned by Herbert. [1]

[UPDATE 19 July 2015: I now know it would have been much easier to export the data from the 008 as one string, then use Excel’s text-to-column fixed width conversion tool. - Erin]

Image of Excel data showing 16th-century dates

I’ll go into using MarcEdit in a follow-up Tooltip. Meanwhile, for an explanation of what the different MARC codes mean, I’d recommend MARC 21 Format for Bibliographic Data (Library of Congress Network Development and MARC Standards Office) and Bibliographic Formats and Standards (OCLC Support & Training).
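For the curious, here is roughly what any tool has to do to turn raw MARC into something eye-readable. This is a minimal Python sketch using only the standard library, not a substitute for MarcEdit. The record layout (a 24-character leader, 12-character directory entries, and the three control characters) follows the MARC 21 record structure; the function names are my own, the sample record is invented around the Sayer imprint above, and the code assumes UTF-8 text, which may not hold for every export.

```python
FIELD_END = b"\x1e"   # ends the directory and every field
RECORD_END = b"\x1d"  # ends every record
SUBFIELD = b"\x1f"    # precedes each subfield code (displayed as ‡)


def _text(raw):
    # Assumption: UTF-8. Records in another encoding will show replacement marks.
    return raw.decode("utf-8", errors="replace")


def split_records(data):
    """Yield each raw record found in a file's bytes, terminator included."""
    for chunk in data.split(RECORD_END):
        if chunk.strip():
            yield chunk + RECORD_END


def parse_record(raw):
    """Return a list of (tag, field_bytes) pairs for one raw record."""
    base = int(raw[12:17].decode("ascii"))  # leader 12-16: start of field data
    directory = raw[24:base - 1]            # 12-character entries after the leader
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        # The stored length counts the field terminator, so drop one byte.
        fields.append((tag, raw[base + start:base + start + length - 1]))
    return fields


def readable(tag, body):
    """Format one field as a line of text, with ‡ before each subfield code."""
    if tag < "010":  # control fields have no indicators or subfields
        return f"{tag} {_text(body)}"
    indicators = _text(body[:2]).replace(" ", "#")  # show blanks as #
    parts = [f"‡{_text(s[:1])} {_text(s[1:])}" for s in body[2:].split(SUBFIELD)[1:]]
    return f"{tag} {indicators} " + " ".join(parts)


def date1(fields):
    """Return 008 positions 07-10 (Date 1) as one string, or None if no 008."""
    for tag, body in fields:
        if tag == "008":
            return _text(body)[7:11]
    return None


def build_record(fields):
    """Assemble a raw record from (tag, field_bytes) pairs, for demonstration."""
    directory, data = b"", b""
    for tag, body in fields:
        body += FIELD_END
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    base = 24 + len(directory) + 1
    total = base + len(data) + 1
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FIELD_END + data + RECORD_END


# Invented sample: a 40-character 008 and the imprint field from above.
sample = build_record([
    ("008", ("150721" + "s" + "1753" + " " * 4 + "enk" + " " * 17 + "eng" + " d").encode("ascii")),
    ("260", b"  " + SUBFIELD + b"aLondon :" + SUBFIELD + b"bPrinted for Robt. Sayer,"
            + SUBFIELD + b"c1753."),
])

for raw in split_records(sample):
    parsed = parse_record(raw)
    for tag, body in parsed:
        print(readable(tag, body))
    print("Date 1:", date1(parsed))
```

To try it on a real download, read the file in binary mode, for example `Path("download.mrc").read_bytes()` (a hypothetical file name), and pass the result to `split_records`. Slicing the 008 as one string is the same fixed-width idea as the spreadsheet approach mentioned in the update above.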

  1. The four characters making up each year still need to be concatenated in Excel; this is just the straight “import” from tab-separated values.
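The concatenation can also be done before the data ever reaches Excel. A minimal Python sketch, assuming (since the table itself isn’t reproduced here) that each row of the tab-separated export holds the four characters of a year in its first four columns; the function name and sample rows are made up:

```python
import csv
import io


def join_year_columns(tsv_text):
    """Join the first four tab-separated columns of each row into one year string."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return ["".join(row[:4]) for row in reader if row]


# Invented rows; "u" marks an unknown digit in a MARC date.
print(join_year_columns("1\t5\t6\t8\n1\t5\tu\tu\n"))
```

The years are kept as text rather than converted to numbers, so dates with unknown digits survive intact.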
