Text Encoding & Analysis

quantifying words

We provide consultations about text analysis, a computational approach to studying texts. Sometimes referred to as text mining or distant reading, it relies on a suite of tools to quantify textual elements in order to identify meaningful patterns. Some areas of text analysis include:

  • Word frequency: Examining prominence of terms across one or several texts
  • Parts of speech: Quantifying grammatical distinctions
  • Named-entity recognition: Detection of proper nouns
  • Sentiment analysis: Classifying texts as positive or negative in tone
  • Topic modeling: Identifying themes across a body of texts

We can assist you in conducting text analysis by developing testable hypotheses, formatting text for machine analysis, and getting started with text analysis tools.
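
As a starting point, word frequency, the first area listed above, can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended workflow: the function name word_frequencies, the simple regular-expression tokenizer, and the sample sentence are all assumptions made for this example, and a real project would likely need more careful tokenization and stop-word handling.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=5):
    """Return the top_n most common lowercase word tokens in text.

    Hypothetical helper for illustration only.
    """
    # Simple tokenizer (an assumption): runs of letters, allowing
    # internal apostrophes so that "don't" stays a single token.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    # Counter tallies each token; most_common sorts by count, descending.
    return Counter(tokens).most_common(top_n)


# Invented sample text, not drawn from any of the projects described here.
sample = "The letters mention people and places. The letters also mention works."
print(word_frequencies(sample, top_n=3))
```

Running the same function over each text in a collection, and comparing the resulting counts, is one simple way to examine the prominence of terms across several texts.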

Text Encoding & Analysis Projects

Southern Changes

Southern Changes is a journal published by the Southern Regional Council between 1978 and 2003. ECDS maintains a digital archive of the TEI-encoded journal; the archive includes a thematic browsing page based on topic modeling, a form of quantitative text analysis.

The Complete Prose of T.S. Eliot: The Critical Edition

The Complete Prose of T.S. Eliot: The Critical Edition is an eight-volume annotated publication featuring all extant published and unpublished prose of the poet T.S. Eliot. The edition was edited by a team of scholars led by Emory professor emeritus Dr. Ronald Schuchard, and ECDS encoded its text for web publication.

The Interactive Index to the Letters of Samuel Beckett

The Interactive Index to the Letters of Samuel Beckett is a collaboration between ECDS and The Letters of Samuel Beckett project, a decades-long effort led by Dr. Lois Overbeck that seeks to collect, consult and transcribe all the extant letters of the Irish writer Samuel Beckett. ECDS assisted with building a dataset and website that allow users to explore an index of the people, places, events and works referenced in the letters.