Consistent estimation of linear regression models using matched data

About:
Economists often use matched samples, especially when dealing with earnings data where a number of missing observations need to be imputed. In this paper, we demonstrate that the ordinary least squares estimator of the linear regression model using matched samples is inconsistent and has a non-standard convergence rate to its probability limit. If only a few variables are used to impute the missing data, then it is possible to correct for the bias. We propose two semiparametric bias-corrected estimators and explore their asymptotic properties. The estimators have an indirect-inference interpretation, and they attain the parametric convergence rate when the number of matching variables is no greater than four. Monte Carlo simulations confirm that the bias correction works very well in such cases.
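The inconsistency claim can be illustrated with a small simulation. The sketch below is a toy design assumed purely for illustration, not the paper's data-generating process, and it does not implement the proposed bias-corrected estimators. One sample observes the outcome y and a matching variable z, a second sample observes the regressor x and z, and x is imputed by nearest-neighbour matching on z. The function names, sample size, and unit variances are all hypothetical choices.

```python
import numpy as np


def nearest_neighbor_match(z_target, z_donor):
    """For each target value, return the index of the closest donor value (1-D)."""
    order = np.argsort(z_donor)
    z_sorted = z_donor[order]
    # Insertion position of each target among the sorted donors.
    pos = np.searchsorted(z_sorted, z_target)
    pos = np.clip(pos, 1, len(z_sorted) - 1)
    left = z_sorted[pos - 1]
    right = z_sorted[pos]
    # Pick whichever neighbour (left or right) is closer.
    choose_left = (z_target - left) <= (right - z_target)
    return order[np.where(choose_left, pos - 1, pos)]


def simulate_matched_ols(n=5000, beta=1.0, seed=0):
    """Compare OLS slopes using the true regressor and the matched regressor.

    Assumed toy model: x = z + e, y = beta * x + u, with z, e, u
    independent standard normal draws.
    """
    rng = np.random.default_rng(seed)

    # Sample 1 observes (y, z); x1 exists but is treated as unobserved.
    z1 = rng.normal(size=n)
    x1 = z1 + rng.normal(size=n)
    y1 = beta * x1 + rng.normal(size=n)

    # Sample 2 (the donor sample) observes (x, z).
    z2 = rng.normal(size=n)
    x2 = z2 + rng.normal(size=n)

    # Impute x in sample 1 from the nearest donor in terms of z.
    x_matched = x2[nearest_neighbor_match(z1, z2)]

    # Slope of a simple regression with intercept: cov(x, y) / var(x).
    slope_matched = np.cov(x_matched, y1)[0, 1] / np.var(x_matched, ddof=1)
    slope_infeasible = np.cov(x1, y1)[0, 1] / np.var(x1, ddof=1)
    return slope_matched, slope_infeasible


if __name__ == "__main__":
    matched, infeasible = simulate_matched_ols()
    print(f"OLS slope with matched x: {matched:.3f}")
    print(f"OLS slope with true x:    {infeasible:.3f}")
```

In this toy design the matched regressor shares only the z component with the true regressor, so the matched-sample slope settles near beta * var(z) / (var(z) + var(e)), which is beta / 2 here, while the infeasible regression on the true x recovers beta. Increasing n does not remove the gap.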
