Table of Contents
(First Draft)
What is this and who is it for?
About me
What other resources are there already out there?
Preface: You could, but why would you?
Research Design
Model Thinking (i.e. Think Small)
- Overview
- Defining your question
- Conceptualizing your problem
- Selecting your data (thinking about sample bias)
- Implementing your model
- Validating your model
- Discuss, discuss, discuss
Tools and Techniques
Data and code for this book
Working with metadata (hello Tables!)
Working with text data: The big picture
Data Preparation
- Reading in your data using the tm library
- Reading in your data one-by-one using custom functions
- Reading in your data using BookNLP
Feature Selection and Construction
- Introducing the idea of a feature space
- Zipf’s Law and stopping to think about stopwords
- Dictionary-based approaches, including sentiment analysis
- Topic modeling
  - What is topic modeling?
  - What are some of the problems of topic modeling?
  - Chunking texts
  - Preparing your data
  - Setting parameters of the model
  - Inspecting your model: Overview
  - Inspecting your model: Words
  - Inspecting your model: Topics
  - Assessing individual topics: Some initial diagnostics
  - Assessing topic stability
- Working with BookNLP data
- Part-of-speech analysis
- Named entities (People and Places)
- Difficulty and other readability metrics
- Social network analysis
- Advanced feature detection using machine learning
Understanding Data
Exploring
- Clustering and other unsupervised learning approaches
- Visualizing your data (learning ggplot)
Explaining
- Corpus comparison and hypothesis testing
- Comparing a single feature across multiple groups
- Multivariate regression
Predicting
- Predictive modeling and machine learning
Applications
How to think like a data-driven humanist
- Genre, or all the ways documents fall into meaningful groups
- Bias and the Social Construction of Everything
- Prestige and Social Capital
- Narrative, the most important technology ever invented
- People, or why we should call them entities
- Space, where am I?
- Readers, what are they doing?
- Careers, or the physiology of writing and creating
- Institutions, the things we often overlook that are actually incredibly powerful shapers of our everyday reality and also supremely good at maintaining unfairness, among other things