---
title: Lecture 18 Treatment Effects
author: Nick Huntington-Klein
date: "`r Sys.Date()`"
output:
  revealjs::revealjs_presentation:
    theme: solarized
    transition: slide
    self_contained: true
    smart: true
    fig_caption: true
    reveal_options:
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning=FALSE, message=FALSE)
library(tidyverse)
library(dagitty)
library(ggdag)
library(gganimate)
library(ggthemes)
library(Cairo)
library(ggpubr)
library(modelsummary)
library(rdrobust)
theme_set(theme_gray(base_size = 15))
```
## Recap
- We've gone over all sorts of ways to estimate a causal effect
- And how to tell when one is identified
- But... uh... what did we just estimate exactly?
- What even is *the* causal effect?
## Treatment Effects
- For any given treatment, there are likely to be *many* treatment effects
- Different individuals will respond to different degrees (or even directions!)
- This is called *heterogeneous treatment effects*
## Treatment Effects
- When we identify a treatment effect, what we're *estimating* is some mixture of all those individual treatment effects
- But what kind of mixture? Is it an average of all of them? An average of some of them? A weighted average? Not an average at all?
- What we get depends on *the research design itself* as well as *the estimator we use to perform that design*
## Individual Treatment Effects
- While we can't always estimate it directly, the true regression model becomes something like
$$ Y = \beta_0 + \beta_iX + \varepsilon $$
- $\beta_i$ follows its own distribution across individuals
- (and remember, this is theoretical - we'd still have those individual $\beta_i$s even with one observation per individual and no way to estimate them separately)
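To make the idea concrete, here is a minimal sketch with simulated data. The names `sim` and `beta_i`, and the choice of a normal distribution with mean 2 and SD 1, are assumptions made for this example and are not part of the lecture's data.

```r
# Hypothetical example: every individual has their own slope beta_i
set.seed(123)
n <- 1000
sim <- data.frame(beta_i = rnorm(n, mean = 2, sd = 1),  # individual effects (assumed distribution)
                  X = rnorm(n))
sim$Y <- 1 + sim$beta_i * sim$X + rnorm(n)

# With one row per person we can only estimate a single slope
slope <- coef(lm(Y ~ X, data = sim))["X"]
slope

# Only in a simulation can we look at the true distribution of effects
mean(sim$beta_i)
sd(sim$beta_i)
```

In real data the `beta_i` column does not exist, which is why the question of what the single slope summarizes matters.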
## Summarizing Effects
- There are methods that try to give us the whole distribution of effects (and we'll talk about some of them next time)
- But often we only get a single effect, $\hat{\beta}_1$.
- This $\hat{\beta}_1$ is some summary statistic of the $\beta_i$ distribution. But *what* summary statistic?
## Summarizing Effects
- Average treatment effect: the mean of $\beta_i$
- Conditional average treatment effect (CATE): the mean of $\beta_i$ *conditional on some value* (say, "just for men", i.e. conditional on being a man)
- Weighted average treatment effect (WTE): the weighted mean of $\beta_i$, with weights $w_i$
The latter two come in *many* flavors
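Written out for a sample of $N$ individuals, the three summaries could look like this. The group label $G_i$ and the value $g$ are notation introduced here for illustration; $w_i$ are the weights named above.

$$ \text{ATE} = \frac{1}{N}\sum_{i=1}^{N} \beta_i $$

$$ \text{CATE}(g) = \frac{\sum_{i} 1(G_i = g)\,\beta_i}{\sum_{i} 1(G_i = g)} $$

$$ \text{Weighted average} = \frac{\sum_{i} w_i\beta_i}{\sum_{i} w_i} $$

Setting every $w_i$ equal gives back the ATE, and setting $w_i = 1(G_i = g)$ gives back the CATE for group $g$, so the weighted average is the general case.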
## Common Conditional Average Treatment Effects
- The ATE among some demographic group
- The ATE among some specific group (conditional average treatment effect)
- The ATE just among people who were actually treated (ATT)
- The ATE just among people who were NOT actually treated (ATUT)
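A small simulation shows how these can differ. This is a sketch under an assumed selection rule (people with bigger effects are more likely to get treated); the variable names and numbers are hypothetical.

```r
# Hypothetical example: people with larger effects are more likely to be treated
set.seed(456)
n <- 10000
beta_i <- rnorm(n, mean = 3, sd = 1)    # individual effects (assumed distribution)
treated <- beta_i + rnorm(n) > 3        # assumed rule: selection on the size of the effect

ate  <- mean(beta_i)                    # everyone
att  <- mean(beta_i[treated])           # only those actually treated
atut <- mean(beta_i[!treated])          # only those not treated
c(ATE = ate, ATT = att, ATUT = atut)
```

Because treatment is more common among people with large effects, the ATT comes out above the ATE and the ATUT below it.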
## Common Weighted Average Treatment Effects
- The ATE weighted by how responsive you are to an instrument/treatment assignment (local average treatment effect)
- The ATE weighted by how much variation in treatment you have after all back doors are closed (variance-weighted)
- The ATE weighted by how commonly-represented your mix of control variables is (distribution-weighted)
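To see what variance weighting means, take a simplified case (an assumption for illustration, not a general result): groups $g$ with average effects $\beta_g$, population shares $p_g$, and treatment variances $Var_g(X)$, where $X$ has mean zero in every group and is unrelated to $\beta_i$ within each group. A regression of $Y$ on $X$ then estimates approximately

$$ \hat{\beta}_1 \approx \frac{\sum_g p_g\,Var_g(X)\,\beta_g}{\sum_g p_g\,Var_g(X)} $$

Groups with more variation in treatment count for more. Once controls are added, the variance that matters is the variation in $X$ left over after the controls.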
## Are They Good?
- Which average you'd *want* depends on what you'd want to do with it
- Want to know how effective a treatment *was* when it was applied? Average Treatment on Treated
- Want to know how effective a treatment would be if applied to everyone/at random? Average Treatment Effect
- Want to know how effective a treatment would be if applied *just a little more broadly?* **Marginal Treatment Effect** (literally, the effect for the next person who would be treated), or, sometimes, Local Average Treatment Effect
## Are They Good?
- Different treatment effect averages aren't *wrong* but we need to pay attention to which one we're getting, or else we may apply the result incorrectly
- We don't want that!
- A result could end up representing a different group than you're really interested in
- There are technical ways of figuring out what average you get, and also intuitive ways
## Heterogeneous Effects in Action
- Let's simulate some data and see what different methods give us.
- We'll start with some basic data where the effect is already identified
- And see what we get!
## Heterogeneous Effects in Action
- The effect varies according to a normal distribution, which has mean 5 for group A and mean 7 for group B (mean = 6 overall)
- No back doors, this is basically random assignment / an experimental setting
```{r, echo = FALSE}
set.seed(1000)
```
```{r, echo = TRUE}
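# Assign each observation to group A or B at random (assumed 50/50 split)
tb <- tibble(group = sample(c('A', 'B'), 5000, replace = TRUE)) %>%
  # Individual effects: mean 5 in group A, mean 7 in group B
  mutate(beta1 = case_when(
    group == 'A' ~ rnorm(5000, mean = 5, sd = 2),
    group == 'B' ~ rnorm(5000, mean = 7, sd = 2))) %>%
  # Treatment is unrelated to anything else, as in an experiment
  mutate(X = rnorm(5000)) %>%
  mutate(Y = beta1*X + rnorm(5000))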
```
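Since this is a simulation, we can check the true averages directly. The sketch below re-creates the setup in base R so it runs on its own; the random 50/50 group split is an assumption, and the exact numbers depend on the seed.

```r
# Hypothetical re-creation of the simulation; the 50/50 group split is an assumption
set.seed(1000)
n <- 5000
group <- sample(c("A", "B"), n, replace = TRUE)
beta1 <- ifelse(group == "A",
                rnorm(n, mean = 5, sd = 2),
                rnorm(n, mean = 7, sd = 2))
X <- rnorm(n)
Y <- beta1 * X + rnorm(n)

true_ate <- mean(beta1)                  # should be near 6
true_cate <- tapply(beta1, group, mean)  # should be near 5 (A) and 7 (B)
est <- coef(lm(Y ~ X))["X"]              # what the regression recovers
true_ate
true_cate
est
```

Here $X$ has the same spread in both groups, so the regression coefficient should land close to the plain average of the individual effects.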
## Heterogeneous Effects in Action
- We're already identified, no adjustment necessary, so let's just regress $Y$ on $X$
```{r, echo = FALSE}
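# Regress Y on X using the simulated data (assumed form of the model call)
m <- lm(Y ~ X, data = tb)
msummary(m, stars = TRUE)

# New simulation where the variation in X differs by group
# (assumed setup: random 50/50 groups; W has SD sqrt(8), per the comment below)
tb <- tibble(group = sample(c('A', 'B'), 5000, replace = TRUE),
             W = rnorm(5000, mean = 0, sd = sqrt(8))) %>%
  mutate(beta1 = case_when(
    group == 'A' ~ rnorm(5000, mean = 5, sd = 2),
    group == 'B' ~ rnorm(5000, mean = 7, sd = 2))) %>%
  mutate(X = case_when(
    group == 'A' ~ W + rnorm(5000, mean = 0, sd = 1), # SD = sqrt(sqrt(8)^2 + 1^2) = 3
    group == 'B' ~ rnorm(5000, mean = 0, sd = 5))) %>%
  mutate(Y = beta1*X + rnorm(5000))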
```
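When $X$ has SD 3 in group A and SD 5 in group B, a quick back-of-the-envelope calculation suggests what a regression of $Y$ on $X$ should return. This assumes equal-sized groups and that the regression weights each group's average effect by its variance in $X$:

$$ \frac{3^2 \times 5 + 5^2 \times 7}{3^2 + 5^2} = \frac{45 + 175}{34} = \frac{220}{34} \approx 6.47 $$

That is closer to group B's mean of 7 than to the simple average of 6, because group B has more variation in treatment.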
## Heterogeneous Effects in Action
- We are already identified, so let's see what we get from a basic linear regression
```{r, echo = FALSE}
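# Basic regression of Y on X (assumed form of the model call)
m <- lm(Y ~ X, data = tb)
msummary(m, stars = TRUE)

# Variation in X by group, before and after controlling for W
tb %>%
  group_by(group) %>%
  mutate(Xvar = var(X),
         Xcontrolvar = var(resid(lm(X~W))))

# Difference-in-differences simulation: treatment begins after time 10
# (assumed setup: 1000 observations, random groups, time periods 1 to 20)
tb <- tibble(group = sample(c('Treated', 'Untreated'), 1000, replace = TRUE),
             time = sample(1:20, 1000, replace = TRUE)) %>%
  mutate(beta1 = case_when(
    group == 'Treated' ~ 5,
    group == 'Untreated' ~ 7
  )) %>%
  mutate(Treatment = (group=='Treated')*(time>10)) %>%
  mutate(Y = 3 + time + 3*(group == 'Treated') + beta1*Treatment + rnorm(1000))

# Regression discontinuity simulation: cutoff at Run = .5,
# with a smaller effect near the cutoff than far from it
# (assumed setup: running variable uniform on [0, 1])
tb <- tibble(Run = runif(1000)) %>%
  mutate(beta1 = case_when(
    abs(Run-.5) < .2 ~ 1,
    abs(Run-.5) >= .2 ~ 5
  )) %>%
  mutate(Y = Run + beta1*(Run>.5) + rnorm(1000))

# Instrumental variables simulation: groups differ in how strongly
# they respond to the instrument Z (gamma1) and in their effect (beta1)
# (assumed setup: random groups A/B/C, standard normal Z and W)
tb <- tibble(group = sample(c('A', 'B', 'C'), 1000, replace = TRUE),
             Z = rnorm(1000),
             W = rnorm(1000)) %>%
  mutate(gamma1 = case_when(
    group == 'A' ~ 0,
    group == 'B' ~ 1,
    group == 'C' ~ 3
  )) %>%
  mutate(X = gamma1*Z + W + rnorm(1000)) %>%
  mutate(beta1 = case_when(
    group == 'A' ~ 10,
    group == 'B' ~ 5,
    group == 'C' ~ 1
  )) %>%
  mutate(Y = beta1*X + W + rnorm(1000))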
m