Session Proposal: Introduction to Text Mining with the HathiTrust Research Center

I’m willing to lead a session that will introduce attendees to the text mining tools and services of the HathiTrust Research Center (HTRC). The HTRC is the research arm of the HathiTrust Digital Library, a nonprofit consortium whose collection currently contains digital scans of nearly 14 million books.

At the HTRC, based jointly at the University of Illinois at Urbana-Champaign and Indiana University Bloomington, we seek to make this unprecedented collection accessible to scholars performing large-scale textual research. We do this by supporting the HathiTrust Digital Library with a suite of computational tools for creating and working with custom, user-created sub-collections.

This session will provide an overview of the HTRC Portal’s features and show how to create a sub-collection in the Portal and run algorithms against it. If we have time, we’ll also take a look at the HTRC + Bookworm tool for discovering lexical trends across a large corpus.

If you’re interested in joining this workshop, sign up for an HTRC account in advance here: analytics.hathitrust.org/

UPDATE: The slides from the HTRC text mining session are now available!