Lecture_10_Difference_in_Differences_Estimation.Rmd · 贾凯威/CausalitySlides

---
title: "Lecture 10 Estimation of Difference in Differences"
author: "Nick Huntington-Klein"
date: "`r Sys.Date()`"
output:
  revealjs::revealjs_presentation:
    theme: solarized
    transition: slide
    self_contained: true
    smart: true
    fig_caption: true
    reveal_options:
      slideNumber: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning=FALSE, message=FALSE)
library(tidyverse)
library(dagitty)
library(ggdag)
library(gganimate)
library(ggthemes)
library(Cairo)
library(fixest)
library(modelsummary)
theme_set(theme_gray(base_size = 15))
```

## Difference-in-Differences

- The basic idea is to take fixed effects *and then compare the within variation across groups*
- We have a treated group that we observe *both before and after they're treated*
- And we have an untreated group
- The treated and control groups probably aren't identical - there are back doors! So... we *control for group* like with fixed effects

## Today

- Last time we compared means by hand
- How can we get standard errors? How can we add controls? What if there are more than two groups?
- We'll be going over how to implement DID in regression *and other methods*
- Remember, regression is just a tool

## Two-Way Fixed Effects

- We want an estimate that can take *within variation* for groups
- also adjusting for time effects
- and then compare that within variation across treated vs. control groups
- Sounds like a job for fixed effects!

## Two-way Fixed Effects

- For standard DID where treatment goes into effect at a particular time, we can estimate DID with

$$ Y = \beta_i + \beta_t + \beta_1Treated + \varepsilon $$

- Where $\beta_i$ is group fixed effects, $\beta_t$ is time-period fixed effects, and $Treated$ is a binary indicator equal to 1 if you are currently being treated (in the treated group and after treatment)
- $Treated = TreatedGroup\times After$
- Typically run with standard errors clustered at the group level (why?)
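## Two-way Fixed Effects

As a rough illustration of the regression above, the sketch below simulates a small panel and estimates it with `feols()` from **fixest**. The data set `sim`, its dimensions (10 groups, 6 periods, 100 observations per cell), the seed, and the true effect of 5 are all assumptions made up for this example, not part of any real data.

```r
library(fixest)

set.seed(1000)  # assumed seed, only so the sketch is reproducible

# Hypothetical panel: 10 groups x 6 time periods x 100 observations per cell
sim <- expand.grid(obs = 1:100, time = 1:6, groups = 1:10)

# Assumption: groups 6-10 are the treated group, and treatment starts in period 4,
# so Treated = TreatedGroup x After
sim$Treated <- as.numeric(sim$groups > 5 & sim$time > 3)

# Assumed data-generating process: group effect + time effect + true effect of 5 + noise
sim$Y <- sim$groups + sim$time + 5 * sim$Treated + rnorm(nrow(sim))

# Group and time fixed effects go after the |
# Standard errors are clustered at the group level
twfe <- feols(Y ~ Treated | groups + time, data = sim, cluster = ~groups)

# Coefficient on Treated is the DID estimate
coef(twfe)["Treated"]
```

Under these assumptions the coefficient on `Treated` should land close to the simulated effect of 5, and `summary(twfe)` reports it with the group-clustered standard errors.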
## Two-way Fixed Effects

- Why this works is a bit easier to see if we limit it to a "2x2" DID (two groups, two time periods)

$$ Y = \beta_0 + \beta_1TreatedGroup + \beta_2After + \beta_3TreatedGroup\times After + \varepsilon $$

- $\beta_1$ is prior-period group diff, $\beta_2$ is shared time effect
- $\beta_3$ is *how much bigger the $TreatedGroup$ effect gets after treatment vs. before*, i.e. how much the gap grows
- Difference-in-differences!

## Two-way Fixed Effects

```{r, echo = TRUE}
tb <- [...] %>%
  # Groups 6-10 are treated, time periods 4-6 are treated
  mutate(Treated = I(groups>5)*I(time>3)) %>%
  # True effect 5
  mutate(Y = groups + time + Treated*5 + rnorm(6000))

m <- [...]

[...] %>%
  filter(ky == 1) %>% # Kentucky only
  mutate(Treated = afchnge*highearn)

m <- [...]

[...] %>%
  mutate(treated = children > 0) %>%
  filter(year <= 1994) # use only pre-treatment data (fudging a year here so I can do polynomial)

m <- [...]

[...] %>%
  mutate(Treatment = treated & year >= 1992))
m2 <- [...] %>%
  mutate(Treatment = treated & year >= 1993))

msummary(list(m1,m2), stars = TRUE, gof_omit = 'Lik|AIC|BIC|F|Pseudo|Adj')
```

## Prior Trends and Placebo

- Uh oh! Those are significant effects! (keeping in mind that I snuck 1994 in to make the code work better, even though it's actually post-treatment)
- For both placebo tests and, especially, prior trends, we're a little less concerned with significance than with the *meaningful size* of the violations
- After all, with enough sample size *anything* is significant
- And those treatment effects are fairly tiny

## Dynamic DID

- We've limited ourselves to "before" and "after" but this isn't all we have!
- But that averages out the treatment across the entire "after" period. What if an effect takes time to get going? Or fades out?
- We can also estimate a *dynamic effect* where we allow the effect to be different at different lengths of time since the treatment
- This also lets us do a sort of placebo test, since we can also get effects *before* treatment, which should be zero

## Dynamic DID

- Simply interact $TreatedGroup$ with binary indicators for time period, making sure that the last period before the treatment effect is expected to show up is the reference

$$ Y = \beta_0 + \beta_tTreatedGroup + \varepsilon $$

- Then, usually, plot it. **fixest** makes this easy with its `i()` interaction function

## Dynamic DID

```{r, echo = TRUE}
df <- [...] %>%
  mutate(treated = 1*(children > 0)) %>%
  mutate(year = factor(year))

m <- [...]
```
