Course_IV.Rmd · 连享会/hdmetrics

---
title: "High-Dimension and Endogeneity"
header-includes:
  - \usepackage{bbm, lmodern,amsmath,amssymb,enumitem,listings,enumerate}
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

This document presents empirical applications of the linear instrumental variables (IV) model with many covariates $(p^x \gg n)$ and many instruments $(p^z \gg n)$, based on the estimators analysed in Belloni et al. (2012b) and Chernozhukov et al. (2015b). The main package is the hdm R package, available at https://cran.r-project.org/web/packages/hdm/index.html. In particular, we strongly encourage reading the vignette: https://cran.r-project.org/web/packages/hdm/vignettes/hdm.pdf.

## Simulation Study

These simulations illustrate two points:

* the naive post-selection estimator suffers from a large regularization bias;
* the cross-fitting estimator trades off a large bias for a smaller MSE compared to the immunized estimator that uses the whole sample.

```{r}
library("ggplot2")
library("gridExtra")
library("MASS")
library("mnormt")
library(hdm)
library(AER)
library(car)
library("Rcpp")
```

We reproduce the DGP of Chernozhukov, Hansen and Spindler (2015): i.i.d. observations $(Y_i, D_i, Z_i, X_i)_{i=1}^n$, where the number of controls is set to 200, the number of instruments to 150, and the number of observations to 202.

```{r}
### Simulation parameters
set.seed(135711)
p_x = 200 ## number of controls
p_z = 150 ## number of instruments
n = 202   ## total sample size
K = 2     ## number of folds
```

$$
\begin{aligned}
Y_i &= \tau_0 D_i + X_i' \beta_0 + 2 \varepsilon_i \\
D_i &= X_i' \gamma_0 + Z_i' \delta_0 + U_i \\
Z_i &= \Pi X_i + 0.125 \zeta_i,
\end{aligned}
$$

where

$$
\left(\begin{array}{c} \varepsilon_i \\ U_i \\ \zeta_i \\ X_i \end{array}\right)
\sim \mathcal{N}\left( 0, \left(\begin{array}{cccc}
1 & 0.6 & 0 & 0 \\
0.6 & 1 & 0 & 0 \\
0 & 0 & I_{p^{z}} & 0 \\
0 & 0 & 0 & \Sigma
\end{array}\right) \right)
$$

where:

* $\Sigma$ is a $p^{x} \times p^{x}$ matrix with $\Sigma_{kj} = (0.5)^{|j-k|}$ and $I_{p^{z}}$ is the $p^{z} \times p^{z}$ identity matrix.
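The stochastic part of the DGP above can be sketched in base R as follows. This is only a minimal illustration under stated assumptions: the object names (`Sigma`, `Omega`, `EU`, `X`, `eps`, `U`, `zeta`) are hypothetical, the draws use a Cholesky factor rather than any particular package routine, and the coefficients $\tau_0, \beta_0, \gamma_0, \delta_0$ and the matrix $\Pi$ are deliberately left out because they are not specified in the text above.

```r
# Sketch of the random components of the DGP (names are illustrative assumptions).
set.seed(135711)
p_x <- 200  # number of controls
p_z <- 150  # number of instruments
n   <- 202  # sample size

# Sigma_kj = 0.5^|j-k| is a Toeplitz matrix with first row 1, 0.5, 0.25, ...
Sigma <- toeplitz(0.5^(0:(p_x - 1)))

# chol() returns an upper-triangular R with t(R) %*% R = Sigma,
# so the rows of (standard normal matrix) %*% R have covariance Sigma.
X <- matrix(rnorm(n * p_x), n, p_x) %*% chol(Sigma)

# (eps_i, U_i): unit variances, correlation 0.6, independent of X and zeta.
Omega <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
EU  <- matrix(rnorm(n * 2), n, 2) %*% chol(Omega)
eps <- EU[, 1]
U   <- EU[, 2]

# zeta_i ~ N(0, I_{p_z}): independent standard normal columns.
zeta <- matrix(rnorm(n * p_z), n, p_z)
```

The correlation of 0.6 between `eps` and `U` is what makes $D_i$ endogenous in the outcome equation, which is why instruments are needed at all.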
```{r}
### GENERATE DATA
means [...] 10^(-6))*1 + (rY_x$coefficients[2:(dim(x)[2]+1)] > 10^(-6))*1
sel[sel == 2] [...] 10^(-6))*1

## Do TSLS
x_sel = x[, sel == 1]
z_sel = z[, sel_z == 1]
if (sum(sel) > 0 & sum(sel_z) > 0) {
  ivfit.lm = ivreg(y ~ d + x_sel | z_sel + x_sel)
} else if (sum(sel) == 0 & sum(sel_z) > 0) {
  ivfit.lm = ivreg(y ~ d | z_sel)
}
se = [...] -1)
Result[1,] = [...] -1)
Result[2,] = [...] -1)
Result[3,] = [...] -1)
Result[4,] = [...] -1)
Result[5,] = [...] -1)
Result[6,] [...]
```
