Lecture_17_Case_Methods.Rmd · 贾凯威/CausalitySlides

---
title: "Lecture 17 Case Methods"
author: "Nick Huntington-Klein"
date: "`r Sys.Date()`"
output:
  revealjs::revealjs_presentation:
    theme: solarized
    transition: slide
    self_contained: true
    smart: true
    fig_caption: true
    reveal_options:
      slideNumber: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, warning=FALSE, message=FALSE)
library(tidyverse)
library(dagitty)
library(ggdag)
library(gganimate)
library(ggthemes)
library(Cairo)
library(ggpubr)
theme_set(theme_gray(base_size = 15))
```

## Recap

- So far, the methods we've used have relied on access to a bunch of treated individuals, so we can average over them to get an idea of the outcome conditional on treatment
- But many treatments we might be interested in are only applied to *one* group in *one case*
- What can we do?
- We can use *case methods*

## The Problem

- Why is this likely to be a special problem?
- For one, we have very little data! Only one treated group means that you only have a handful of pre- and post-treatment observations
- It's harder to believe we can abstract away "what's different" about that group when we can't average over a few groups
- Can't control for group-specific stuff
- Can aggregate multiple case effect estimates

## Approaches

- Event studies (the case alone)
- Synthetic control (the case vs. control groups)

## Event Studies

- In an event study, you are really asking the question "what changed when treatment went into effect?"
- At its core it's just a before/after comparison
- With some bells and whistles

## The Basic Problem

- Our diagram looks like difference-in-differences but without the control group
- If anything else is changing over time, we have a back door!

```{r, dev='CairoPNG', echo=FALSE, fig.width=6,fig.height=3.75}
# Assumed reconstruction: the original dagify() arguments were lost in extraction.
# The node names and arrows below are hypothetical; they encode the back door
# through Time described in the bullets above.
dag <- dagify(Y ~ Treatment + Time,
              Treatment ~ Time) %>% tidy_dagitty()
ggdag_classic(dag,node_size=20) +
  theme_dag_blank()
```

## Hmm...

- Our actual goal in identifying an effect using before/after data is to figure out *what after would have looked like* for the treated group if no treatment had occurred
- DID says "let's see how a different group changed and assume the treated group would have changed in the same way"
- Event studies say "let's see how the treated group was changing before and use that to predict how it would have continued to change"

## Another way to think about it

- Rather than thinking of event studies as being like DID but without a control group, we can think of them as an RDD but with time as the running variable, and a cutoff when treatment was introduced
- One simple form of event study estimation actually uses the same regression equation

$$ Y = \beta_0 + \beta_1Time + \beta_2After + \beta_3Time\times After $$

with $\beta_2$ as the event study estimate (as in RDD, this requires $Time$ to be centered so that it equals 0 at the cutoff)

## The Time Issue

- RDD works pretty well, so event studies seem pretty solid too, right? Ehhh...
- The assumption that nothing else changes at the cutoff is a bit harder to believe for time as a running variable. Things change over time!
- Some studies, especially in high-frequency finance data, make this more believable by looking at *really tiny time intervals*
- If you have second-to-second data, and your bandwidth is like 10 minutes on either side, then yeah, probably nothing else changed at the cutoff
- Of course this also requires that the effect of treatment kicks in real quick!

## Forecasting

- More often, in areas where event studies are used to identify causal effects (and not just, say, to check the plausibility of a DID design), researchers use *forecasting* tools
- We want to predict the counterfactual after treatment as if no treatment had occurred
- So... let's look at the trend before treatment and assume that continues!
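- (a) estimate a time-series model using pre-treatment data, (b) forecast post-treatment data, (c) compare the forecast to the actual outcomes

## Forecasting

- Plus, we know how to calculate confidence intervals for forecasts, so that's an easy way to see if the outcomes we observe are statistically significant
- Doing this properly requires effectively using forecasting tools in time series analysis, which is not something I'm going to delve into super deeply in this class
- So let's just see a simple example simulation and an example study

## Simulation

```{r, echo = TRUE}
set.seed(1000)
# Assumed: tibble(Time = 1:100) is inferred from rnorm(100) and the 2:100 loop
tb <- tibble(Time = 1:100) %>%
  mutate(After = Time >= 80) %>%
  # Increasing overall trend, plus treatment effect
  mutate(Y = .1*Time - .001*(Time-20)^2 + 4*After + rnorm(100))
# Add an AR(1) process to the data
for (i in 2:100) {
  # Assumed: the original AR(1) coefficient was lost in extraction; .5 is a placeholder
  tb$Y[i] <- tb$Y[i] + .5*tb$Y[i-1]
}

library(tsibble)
library(fable)
# Assumed: the tsibble conversion and the model object name m were lost in extraction
tbts <- as_tsibble(tb, index = Time)
# Estimate the model on pre-treatment data only
m <- tbts %>%
  filter(!After) %>%
  model(AR(Y ~ Time + I(Time^2) + order(1)))
# Forecast 20 periods past the end of the pre-treatment data
# Assumed call: only `h = 20` and a `filter(!After)` fragment of the original arguments survived
predictions <- forecast(m, h = 20)
# Average gap between observed and forecasted post-treatment outcomes
effect <- mean(tb %>% filter(After) %>% pull(Y)) - mean(predictions$.mean)
effect
```

## Simulation

- Clearly the time-series modeling could already use some work here but you get the idea!

```{r, echo = FALSE}
tbts
```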
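
## Simulation: Regression Form

For comparison with the forecasting approach, here is a minimal sketch of the regression form of the event study, $Y = \beta_0 + \beta_1Time + \beta_2After + \beta_3Time\times After$, on separately simulated data. Everything in it is an assumption made for illustration: the variable names (`sim`, `TimeC`, `event_time`), the linear trend, the true jump of 4, and the seed. Time is centered at the event date so that the coefficient on `After` measures the jump at the cutoff rather than at `Time = 0`.

```r
# Simulated data; every number below is made up for illustration
set.seed(123)
event_time <- 80
sim <- data.frame(Time = 1:100)
sim$After <- as.numeric(sim$Time >= event_time)
# Center time at the event so the After coefficient is the jump at the cutoff
sim$TimeC <- sim$Time - event_time
# Linear trend of 0.1 per period, a true jump of 4, and standard normal noise
sim$Y <- 0.1 * sim$Time + 4 * sim$After + rnorm(100)

# Fit Y = b0 + b1*TimeC + b2*After + b3*TimeC*After
m_reg <- lm(Y ~ TimeC * After, data = sim)

# b2 is the event study estimate; confint() gives its 95% interval
est <- coef(m_reg)["After"]
ci <- confint(m_reg)["After", ]
est
ci
```

Unlike the forecasting version, this sketch has no serial correlation in the errors, so the default `lm()` standard errors are only reasonable here because the simulated noise is independent across periods.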
