---
title: "Lecture 13 Estimating Regression Discontinuity"
author: "Nick Huntington-Klein"
date: "`r Sys.Date()`"
output:
  revealjs::revealjs_presentation:
    theme: solarized
    transition: slide
    self_contained: true
    smart: true
    fig_caption: true
    reveal_options:
      slideNumber: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning = FALSE, message = FALSE)
library(tidyverse)
library(dagitty)
library(ggdag)
library(ggpubr)
library(ggthemes)
library(Cairo)
library(rdrobust)
library(modelsummary)
library(purrr)
library(AER)
theme_set(theme_gray(base_size = 15))
```

## Recap

- Regression discontinuity is a design that can be used when treatment is applied based on a *cutoff*
- Above the cutoff? Treated! Below the cutoff? Not treated! (or below/above)
- By comparing people right around the cutoff, we are effectively closing all back doors
- Isolating the treatment effect!

## Today

- Surely we aren't just comparing averages above and below!
- How can we actually implement regression discontinuity (presumably with regression?)
- What do we need to keep in mind?
- What about close variations? What if the cutoff doesn't assign treatment perfectly?

## Regression Discontinuity in Regression

- How can we make a model for RDD?
- We want to:
    - Look for a jump at a cutoff point
    - Get as good an idea as possible of what the outcome is just on either side of the cutoff
- So...

## Regression Discontinuity in Regression

Let's start with the simple linear version:

$$ Y = \beta_0 + \beta_1(X-Cutoff) + \beta_2Treated + $$
$$ \beta_3Treated\times(X-Cutoff)+\varepsilon $$

- This formulation basically allows there to be two lines: one to the left of the cutoff ( $\beta_0 + \beta_1(X-Cutoff)$ ), and one to the right ( $(\beta_0 + \beta_2) + (\beta_1 + \beta_3)(X-Cutoff)$ )
- The jump at the cutoff is given by $\beta_2$ - that's our RDD estimate
- We use $X$ *relative to the cutoff* so that we can easily locate the jump in the $\beta_2$ coefficient

## Regression Discontinuity in Regression

```{r}
set.seed(1000)
# Simulated data: a true jump of .5 at the cutoff X = .5
tb <- tibble(X = runif(100)) %>%
  mutate(Y = X + .5*(X > .5) + .2*rnorm(100))
ggplot(tb, aes(x = X, y = Y, group = X > .5)) + 
  geom_point() + 
  geom_smooth(method = 'lm', se = FALSE) + 
  geom_vline(aes(xintercept = .5)) + 
  ggpubr::theme_pubr()
```

## Choices!

- This is of course the simplest version!
- Things to consider:
    - Bandwidth
    - Functional form
    - Controls

## Bandwidth

- The idea of RDD is that people *just around the cutoff* are very much comparable
- Basically random whether your test score is 79 vs. 81 if the cutoff is 80, for example
- So people far away from the cutoff aren't too informative! At best they help determine the slope of the fitted lines
- So... drop 'em!

## Bandwidth

- RDD generally uses data only from the observations in a given range around the cutoff
- (Or at least weights them less the further away they are from the cutoff)
- How wide should the bandwidth be?
- There's a big literature on *optimal bandwidth selection*, which balances the bias from adding people far from the cutoff (who may have back doors) against the variance reduction from adding more people (which improves estimator precision)
- We won't be doing this by hand; we can often rely on an RDD command to do it for us, as in the sketch on the next slide
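
## Bandwidth

- For example, the `rdrobust` package (loaded above) will pick a bandwidth for us. A minimal sketch on the simulated `tb` data from a few slides back, where the cutoff was .5 - this is standard `rdrobust` usage, though there are many options we're leaving at their defaults:

```{r, echo = TRUE, eval = FALSE}
# rdbwselect() reports data-driven optimal bandwidths around the cutoff;
# rdrobust() picks one automatically and estimates the jump using it
rdbwselect(y = tb$Y, x = tb$X, c = .5) %>% summary()
rdrobust(y = tb$Y, x = tb$X, c = .5) %>% summary()
```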
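
## Regression Discontinuity in Regression

- And before we get any fancier, note that the simple linear version from earlier is just an interaction in `lm()`. A sketch on the same simulated data (the variable construction here is mine, for illustration):

```{r, echo = TRUE, eval = FALSE}
# Center the running variable at the cutoff and flag treatment;
# the coefficient on I(X > .5)TRUE is the jump at the cutoff
m <- lm(Y ~ I(X - .5)*I(X > .5), data = tb)
summary(m)
```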

## Functional Form

- Why fit a straight line on either side? If the true relationship is curvy, this will give us the wrong result!
- We can be much more flexible! As long as we fit some sort of line on either side, we can look for the jump
- One way to do this is with polynomials ( $\tilde{X} = X-Cutoff$, $T = Treated$ ):

$$ Y = \beta_0 + \beta_1\tilde{X} + \beta_2 \tilde{X}^2 + \beta_3T + \beta_4\tilde{X}T + \beta_5 \tilde{X}^2T+\varepsilon $$

## Functional Form

- (By the way, you can take this basic interaction-with-cutoff design idea and use it to look at how *anything* changes before and after the cutoff, not just the level of $Y$! You could look at how the *slope* changes ("regression kink"), or how some other identified effect changes, or just about anything! The beauty of flexible design)

## Functional Form

- The interpretation is the same as before - look for the jump!
- We do want to be careful with polynomials, though, and not add too many terms
- Remember, the more polynomial terms we add, the stranger the behavior of the line at *either end* of the range of data
- And the cutoff is at the far-right end of the pre-cutoff data and the far-left end of the post-cutoff data!
- So we can get illusory effects generated by having too many terms

## Functional Form

- A common approach is to use *non-parametric* regression or *local linear regression*
- This doesn't impose any particular shape! And it's easy to get a prediction on either side of the cutoff
- This allows for non-straight lines without dealing with the issues polynomials bring us

## Different Functional Forms

- Let's look at the same data with a few different functional forms
- Remember, the RDD effect is the jump at the cutoff. The TRUE effect here will be $.3$, and the TRUE model is an order-2 polynomial

```{r, echo = FALSE}
set.seed(500)
```

```{r, echo = TRUE}
# Simulated data: quadratic relationship plus a true jump of .3 at Running = .5
tb <- tibble(Running = runif(200)) %>%
  mutate(Y = 1.5*Running - .6*Running^2 + .3*(Running > .5) + rnorm(200, 0, .25)) %>%
  mutate(RC = Running - .5, Treated = Running > .5)
```

## Different Functional Forms

```{r, echo = FALSE}
# Crudest version: compare the average outcome above and below the cutoff
m <- lm(Y ~ Treated, data = tb)
jump <- coef(m)[2]
ggplot(tb, aes(x = RC, y = Y, group = Treated)) + 
  geom_point() + 
  geom_line(aes(y = tb %>% group_by(Treated) %>% mutate(YM = mean(Y)) %>% pull(YM)), color = 'blue') + 
  theme_pubr() + 
  labs(x = 'Running Variable Centered on Cutoff', y = 'Outcome',
       title = paste0('Simple Above/Below Average. Jump: ', scales::number(jump, accuracy = .001)))
```

## Different Functional Forms

```{r, echo = FALSE}
# Fit a straight line on each side of the cutoff and measure the jump there
m <- tb %>% arrange(RC)
m1 <- lm(Y ~ RC, data = m %>% filter(!Treated))
m2 <- lm(Y ~ RC, data = m %>% filter(Treated))
jump <- predict(m2, newdata = tibble(RC = 0)) - predict(m1, newdata = tibble(RC = 0))
ggplot(m, aes(x = RC, y = Y, group = Treated)) + 
  geom_point() + 
  geom_smooth(method = 'lm', se = FALSE, color = 'blue') + 
  theme_pubr() + 
  labs(x = 'Running Variable Centered on Cutoff', y = 'Outcome',
       title = paste0('Linear Fit. Jump: ', scales::number(jump, accuracy = .001)))
```

## Functional Form

- Prefer fewer terms: the true model here happens to include a squared term, but what's true here (squared!) doesn't always remain true)
- (And fewer terms makes more sense too once we apply a bandwidth and zoom in)
- Be very suspicious if your fit veers wildly off right around the cutoff
- Consider a nonparametric approach

## Controls

- Generally, you don't need control variables in an RDD
- If the design is valid, you've closed all back doors. That's sort of the whole point!
- Although maybe we want some if we have a wide bandwidth - this will remove some of the bias
- Still, we can get real value from having access to control variables. How?

## Controls

- Control variables allow us to perform *placebo tests* of our RDD model
- RDD should close all back doors... but what if it doesn't? What if we missed something?
- We can rerun our RDD model, but simply use a control variable as the outcome
- If we find an effect... uh oh, that shouldn't happen! (outside of the levels expected by normal sampling variation)
- You can run these for *every control variable you have!* (a sketch follows on the next slide)
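
## Controls

- As a sketch of what that looks like (here `W` stands in for whichever control variable you happen to have - it's hypothetical, not part of our simulated data):

```{r, echo = TRUE, eval = FALSE}
# Same RDD specification as before, but with the control variable W as
# the outcome; the coefficient on TreatedTRUE should be near zero,
# since W shouldn't jump at the cutoff if the design is valid
placebo <- lm(W ~ RC*Treated, data = tb)
summary(placebo)
```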
- This is called *fuzzy regression discontinuity*

## Fuzzy Regression Discontinuity

- We can start by making sure there's actually a jump in treatment at the cutoff, by running RDD with treatment as the outcome
- There has to at least be a jump (up or down) in treatment probability at the cutoff
- If there isn't (or if there is but it's tiny - we'll be dividing by this later and don't want to divide by something close to zero), that's a problem!
- It statistically won't work, and theoretically it implies we were wrong about our RDD design

## Fuzzy Regression Discontinuity

```{r, echo = FALSE}
set.seed(10000)
# Treatment probability rises with Running and jumps by .5 at the cutoff;
# the true treatment effect on Y is 2
fuzz <- tibble(Running = runif(150)) %>%
  mutate(Treat = (.1 + .5*Running + .5*(Running > .5)) %>%
           map_dbl(function(x) min(x, 1)) %>%
           map_dbl(function(x) sample(c(1, 0), 1, prob = c(x, 1 - x)))) %>%
  mutate(Y = 1 + Running + 2*Treat + rnorm(150)*.5) %>%
  mutate(Runbin = cut(Running, 0:10/10)) %>%
  group_by(Runbin) %>%
  mutate(av_treat = mean(Treat), av_out = mean(Y))
ggplot(fuzz, aes(x = Running, y = Treat)) + 
  geom_point() + 
  geom_point(data = fuzz %>% group_by(Runbin) %>% slice(1),
             aes(x = Running, y = av_treat), color = 'red', size = 2) + 
  geom_smooth(aes(group = Running > .5), method = 'lm', color = 'blue', se = FALSE) + 
  geom_vline(aes(xintercept = .5), linetype = 'dashed') + 
  ggpubr::theme_pubr() + 
  labs(x = 'Running Variable', y = 'Treated')
```

## Fuzzy Regression Discontinuity

- So what happens if we just do RDD as normal?
- The effect is understated, because some people after the cutoff are untreated and some before it are treated
- So with a positive effect, the pre-cutoff value goes up (because we mix some treatment effect in there) and the post-cutoff value goes down (since we mix some untreated in there), bringing them closer together and shrinking the effect estimate

## Fuzzy Regression Discontinuity

```{r, echo = FALSE}
ggplot(fuzz, aes(x = Running, y = Y)) + 
  geom_point() + 
  geom_point(data = fuzz %>% group_by(Runbin) %>% slice(1),
             aes(x = Running, y = av_out), color = 'red', size = 2) + 
  geom_smooth(aes(group = Running > .5), method = 'lm', color = 'blue', se = FALSE) + 
  geom_vline(aes(xintercept = .5), linetype = 'dashed') + 
  ggpubr::theme_pubr() + 
  labs(x = 'Running Variable', y = 'Outcome')
```

## Fuzzy Regression Discontinuity

- This is simulated data; the true effect is 2.

```{r, echo = FALSE}
# Naive (sharp) RDD regression on the fuzzy data; its jump estimate
# should fall short of the true effect of 2
fuzz <- fuzz %>% ungroup() %>% mutate(Above = Running >= .5)
mreg <- lm(Y ~ Above*I(Running - .5), data = fuzz)
msummary(mreg, stars = TRUE)
```
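
## Fuzzy Regression Discontinuity

- The usual fix is to scale the jump in the outcome by the jump in treatment - that's instrumental variables, with *being above the cutoff* as the instrument for treatment
- A minimal sketch with `AER::ivreg()` (loaded above); this is one standard way to do it, keeping the specification linear for simplicity:

```{r, echo = TRUE, eval = FALSE}
# Instrument Treat with Above; the centered running variable is
# exogenous, so it appears on both sides of the |
miv <- ivreg(Y ~ Treat + I(Running - .5) | Above + I(Running - .5),
             data = fuzz)
summary(miv)
```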
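
## Fuzzy Regression Discontinuity

- `rdrobust` can handle this for us too: pass the treatment variable to its `fuzzy` argument and it scales the outcome jump by the treatment jump (and still picks a bandwidth for us). A sketch:

```{r, echo = TRUE, eval = FALSE}
# Fuzzy RDD in one call: outcome, running variable, cutoff, treatment
rdrobust(y = fuzz$Y, x = fuzz$Running, c = .5, fuzzy = fuzz$Treat) %>%
  summary()
```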