DS-HECK: double-lasso estimation of Heckman selection model

Abstract: We extend the Heckman (1979) sample selection model by allowing for a large number of controls that are selected using lasso under a sparsity scenario. Standard lasso estimation is known to under-select, causing omitted variable bias in addition to the sample selection bias. We outline the adjustments needed to restore consistency of lasso-based estimation and inference for vector-valued parameters of interest in such models. The adjustments include double lasso for both the selection equation and the main equation, and a correction of the variance matrix. We also connect the estimator with results on redundancy of moment conditions. We demonstrate the effect of the adjustments using simulations, and we investigate the determinants of female labor market participation and earnings in the US using the new approach. The paper comes with dsheckman, a dedicated Stata command for estimating double-selection Heckman models.

