Lecture_19_Causal_Midterm_Review.Rmd · 贾凯威/CausalitySlides

title

author

date

output

Lecture 19 Treatment Effect Methods and Review

Nick Huntington-Klein

`r Sys.Date()`

revealjs::revealjs_presentation

theme

transition

self_contained

smart

fig_caption

reveal_options

solarized

slide

true

slideNumber
true

```{r setup, include=FALSE} knitr::opts_chunk$set(echo = FALSE, warning=FALSE, message=FALSE) library(tidyverse) library(dagitty) library(ggdag) library(gganimate) library(ggthemes) library(ggpubr) library(modelsummary) library(fixest) library(Cairo) theme_set(theme_gray(base_size = 15)) ``` ## Some Pointers - Last time we talked about heterogeneous treatment effects and how our methods produce different averages of those effects - But we don't need to be limited to that! - There are plenty of methods - many of them new - that let us estimate a *distribution* of treatment effects - We won't be going super far into detail with them, but I'll mostly just be letting you know they exist and some pointers for looking further - I'll favor pointers to packages over papers, but if you look in the help files you'll generally find paper citations ## Sorted Effects - The *sorted effects* method uses covariates to look at variation in the treatment effect, and produces a distribution of treatment effects - It also lets you see *who* is at each part of the distribution ```{r, echo = TRUE, results='hide'} library(SortedEffects) # Data on being denied for a mortgage. data(mortgage) # Save the formula to reuse later fm % as_tibble() %>% mutate(Group = row.names(summary(classify)), Ratio = Most/Least) %>% select(Group, Most, Least, Ratio) %>% arrange(Ratio) ``` ## Sorted Effects - Those who were denied for insurance (`denpmi`) had smallest effects of `black`, those who were `single` had the biggest ```{r, echo = FALSE} knitr::kable(results) ``` ## Bayesian Hierarchical Modeling - A very old method! But it works. An extension of random effects - Instead of just letting the *constant* vary, let *any* coefficient vary, and give each its own function to vary over controls! Those controls can let the effect vary $$ Y = \beta_0 + \beta_1X + \varepsilon $$ $$ \beta_0 = \gamma_{00} + \nu_{00} $$ $$ \beta_1 = \gamma_{10} + \gamma_{11}W + \nu_{01} $$ - Terminology difference: "fixed effects" means "coefficients that don't vary" ## Bayesian Hierarchical Modeling ```{r, echo = TRUE} library(lme4) # The whole thing would be super slow so for now let's just do a few effects m % mutate(holdout = runif(n()) > .5) holdout % filter(holdout) training % filter(!holdout) W = training %>% pull(black) %>% as.matrix() X = training %>% select(p_irat, hse_inc, ccred, mcred, pubrec, denpmi, selfemp, single, hischl, ltv_med, ltv_high) %>% as.matrix() Y = training %>% pull(deny) %>% as.matrix() m % select(p_irat, hse_inc, ccred, mcred, pubrec, denpmi, selfemp, single, hischl, ltv_med, ltv_high) %>% as.matrix() indiv_effects % mutate(effect = indiv_effects$predictions) ``` ## Causal Forest ```{r, echo = FALSE} ggplot(as_tibble(holdout), aes(x = effect)) + geom_density() + theme_pubr() + labs(x = 'Individual Effect of Black on Denial Probability', y = 'Density') ``` ## Causal Forest - Who is affected? Let's do a similar test to what **SortedEffects** did (although we could look at it plenty of other ways) ```{r} holdout %>% mutate(Range = case_when( effect <= quantile(effect, .05) ~ 'Bottom', effect >= quantile(effect, .95) ~ 'Top', TRUE ~ NA_character_ )) %>% filter(!is.na(Range)) %>% group_by(Range) %>% select(Range, p_irat, black, hse_inc, ccred, mcred, pubrec, denpmi, selfemp, single, hischl, ltv_med, ltv_high) %>% summarize(across(.fns = mean)) %>% pivot_longer(cols = 2:13) %>% pivot_wider(id_cols = name, values_from = value, names_from = Range) %>% mutate(Ratio = Top/Bottom) %>% arrange(Ratio) %>% knitr::kable() ``` ## Treatment Effect Methods - Anyway, there's some stuff for you to check out! - Obviously there are zillions of causal-inference methods we don't have time to cover - Bartik instruments, matrix completion, causal discovery, and so on and so on and so on - Consider this a good starting place ## Exam Review - Just a reminder of some stuff we've covered ## Fixed Effects - If we have data where we observe the same people over and over, we can implement *fixed effects* by controlling for *individual* - This accounts for everything that's constant within individual. If, for example, "individual" was city, that would include geography, state, founding year, etc. - Doesn't account for things that vary within individual over time, like `Laws` ## Difference-in-Difference - Difference-in-Difference applies when you have a group that you can observe both before and after the policy - You worry that `time` is a confounder, but you can't control for it - Unless you add a control group that DIDN'T get the policy - We must be careful to check that parallel trends holds ## Difference-in-Difference ```{r, dev='CairoPNG', echo=FALSE, fig.width=7, fig.height=5.5} dag % tidy_dagitty() ggdag_classic(dag,node_size=20) + theme_dag_blank() ``` ## Difference-in-Difference - Get the before-after difference for both groups - Then subtract out the difference for the control ```{r, echo=TRUE} diddata % mutate(Treated = (Group == "T") & Time == "After") %>% mutate(Y = 2*(Group == "T") + 1.5*(Time == "After") + 3*Treated + rnorm(5000)) did % group_by(Group,Time) %>% summarize(Y = mean(Y)) before.after.control % tidy_dagitty() ggdag_classic(dag,node_size=20) + theme_dag_blank() ``` ## Regression Discontinuity - Estimate by fitting a line that jumps at the cutoff and estimating the jump - Use local regression and bandwidths to avoid being affected by far-away observations - "Fuzzy" designs where treatment only jumps partially scale the effect using IV ## Regression Discontinuity - Expressed well in graphs! Treatment should jump at cutoff. If not perfectly from 0% to 100%, use IV too ```{r, dev='CairoPNG', echo=FALSE, fig.width=7, fig.height=5} rdddata % mutate(run = runif(10000)+.03*W) %>% mutate(treated = run >= .6) %>% mutate(Y = 2+.01*run+.5*treated+W+rnorm(10000)) ggplot(rdddata,aes(x=run,y=Y,color=treated)) + geom_point()+ geom_vline(aes(xintercept=.6))+ geom_smooth(aes(group = treated), size = 2, color = 'black') + labs(x='Running Variable', y='Outcome') + theme_pubr() + guides(color = FALSE) ``` ## Regression Discontinuity - Variables other than `Y` and treatment shouldn't jump at cutoff - they should be balanced ```{r, dev='CairoPNG', echo=FALSE, fig.width=7, fig.height=5} ggplot(rdddata,aes(x=run,y=W,color=treated)) + geom_point()+ geom_vline(aes(xintercept=.6))+ geom_smooth(aes(group = treated), size = 2, color = 'black') + labs(x='Running Variable', y='W') + theme_pubr() + guides(color = FALSE) ``` ## Instrumental Variables - An instrumental variable affects treatment (relevant) but has no back doors itself or paths to $Y$ except through $X$ (valid) - We move the no-open-back-doors assumption to the IV rather than the treatment - We isolate JUST the variation that comes from `Z`. No back doors in that variation! We have a causal effect - Can conceptually think of it as (or literally apply it to) an experiment where randomization doesn't work perfectly ## Instrumental Variables ```{r, dev='CairoPNG', echo=FALSE, fig.width=7, fig.height=5.5} dag % tidy_dagitty() ggdag_classic(dag,node_size=20) + theme_dag_blank() ``` ## Treatment Effects - There isn't *a* treatment effect. They vary across time, space, individual - Our methods give us averages - ATE (experiment), ATT (DID), LATE (IV, RDD), variance-weighted (regression w/ controls), etc. - We must pay close attention to what our design *and estimator* gives us ## That's it! - In a very condensed way, that's the material we covered! - I recommend looking back over slides, notes, homeworks

贾凯威/CausalitySlides

简介

发行版

贡献者

近期动态

贾凯威/CausalitySlides .gitee-modal { width: 500px !important; }

简介

发行版

贡献者

近期动态

搜索帮助

贾凯威/CausalitySlides