The Fish and the Painting

Table of Contents

(First Draft)

What is this and who is it for?

About me

What other resources are there already out there?

Preface: You could, but why would you?

Research Design

Model Thinking (i.e. Think Small)

Tools and Techniques

Data and code for this book

Working with metadata (hello Tables!)

Working with text data: The big picture

Data Preparation

Feature Selection and Construction

Understanding Data

  • Clustering and other unsupervised learning approaches
  • Visualizing your data (learning ggplot)

  • Corpus comparison and hypothesis testing
  • Comparing a single feature across multiple groups
  • Multivariate regression

  • Predictive modeling and machine learning

How to think like a data-driven humanist

  • Genre, or all the ways documents fall into meaningful groups
  • Bias and the Social Construction of Everything
  • Prestige and Social Capital
  • Narrative, the most important technology ever invented
  • People, or why we should call them entities
  • Space, where am I?
  • Readers, what are they doing?
  • Careers, or the physiology of writing and creating
  • Institutions, the things we often overlook but that are actually incredibly powerful shapers of our everyday reality and also supremely good at maintaining unfairness, among other things