Working Papers
Behavior of Pooled and Joint Estimators in Probit Model with Random Coefficients and Serial Correlation (Primary Author, joint with Jeffrey Wooldridge and Ying Zhu)
We propose a pooled maximum likelihood estimator (PMLE) for handling potential individual-specific heterogeneity and serial correlation in a probit/logit mixture model with binary outcomes. We compare its performance with that of the joint (full) maximum likelihood estimator (JMLE), the dominant estimation method for mixture models in practice. The JMLE is more statistically efficient but computationally demanding, and implementation becomes even more difficult if one tries to model serial correlation over time and allow the individual-specific heterogeneity to be correlated with the covariates (e.g., participation in some treatment). The PMLE, by contrast, is computationally simple, robust to arbitrary forms of serial correlation, and allows the individual-specific heterogeneity to be correlated with the covariates. Our results suggest that the JMLE takes 60 times longer to compute than the PMLE when it converges at all; in many instances it simply fails to converge. We also find that, in actual implementation (as opposed to "in theory"), the parameter estimates from the JMLE tend to be highly biased, while the PMLE is consistent in the presence of serial correlation and individual-specific heterogeneity. However, if one focuses on a different statistic (the Average Treatment Effect or the Average Partial Effects) instead of the parameter estimates themselves, the JMLE can produce quite satisfactory estimates that are robust to serial correlation and individual-specific heterogeneity, even when the likelihood function is misspecified with regard to the time-series dependence and the correlation with the covariates. This result has important implications because, when evaluating policy interventions, the ultimate interest usually lies in the treatment effect.
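As a rough illustration of the pooled idea, the sketch below fits an ordinary pooled probit by Fisher scoring, computes a cluster-robust (sandwich) variance that tolerates arbitrary within-individual serial correlation, and reports an average partial effect. It is not the estimator from the paper: the data-generating process (a random intercept independent of the covariate, AR(1) errors with `rho = 0.5`) and all function names (`simulate_panel`, `pooled_probit`, `average_partial_effect`) are assumptions made for this example.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def norm_cdf(z):
    """Standard normal CDF, elementwise."""
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    """Standard normal density, elementwise."""
    return np.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)

def simulate_panel(n=1000, t=5, rho=0.5, seed=0):
    """Hypothetical DGP: y = 1[x + c_i + u_it > 0], with AR(1) errors u_it."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(n, t))
    c = rng.normal(size=(n, 1))            # individual heterogeneity, variance 1
    u = np.empty((n, t))
    u[:, 0] = rng.normal(size=n)           # stationary start, variance 1
    for s in range(1, t):
        u[:, s] = rho * u[:, s - 1] + math.sqrt(1 - rho**2) * rng.normal(size=n)
    y = (x + c + u > 0).astype(float)
    ids = np.repeat(np.arange(n), t)       # individual index for each row
    X = np.column_stack([np.ones(n * t), x.ravel()])
    return y.ravel(), X, ids

def _pieces(beta, y, X):
    """Generalized residuals and expected Hessian at beta."""
    xb = X @ beta
    p = np.clip(norm_cdf(xb), 1e-10, 1 - 1e-10)
    phi = norm_pdf(xb)
    lam = phi * (y - p) / (p * (1 - p))
    A = X.T @ ((phi**2 / (p * (1 - p)))[:, None] * X)
    return lam, A

def pooled_probit(y, X, ids, n_iter=50, tol=1e-10):
    """Pooled probit via Fisher scoring with a cluster-robust variance."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        lam, A = _pieces(beta, y, X)
        step = np.linalg.solve(A, X.T @ lam)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    lam, A = _pieces(beta, y, X)
    # Sum the scores within each individual before forming the outer product.
    _, inv = np.unique(ids, return_inverse=True)
    G = np.zeros((inv.max() + 1, X.shape[1]))
    np.add.at(G, inv, X * lam[:, None])
    A_inv = np.linalg.inv(A)
    return beta, A_inv @ (G.T @ G) @ A_inv

def average_partial_effect(beta, X, j):
    """APE of continuous covariate j: mean of phi(x'beta) times beta_j."""
    return float(np.mean(norm_pdf(X @ beta)) * beta[j])

if __name__ == "__main__":
    y, X, ids = simulate_panel()
    beta, V = pooled_probit(y, X, ids)
    print("coefficients:", beta)
    print("cluster-robust s.e.:", np.sqrt(np.diag(V)))
    print("APE of x:", average_partial_effect(beta, X, 1))
```

In this simulated design the composite error `c_i + u_it` has variance 2, so the pooled probit targets the scaled slope 1/√2 rather than the structural slope of 1. That scaling is one reason average partial effects, rather than raw coefficients, are the natural quantities to compare across estimators.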
Parametric Identification of Multiplicative Exponential Heteroskedasticity (pdf)
Harvey (1976) first proposed multiplicative exponential heteroskedasticity in the context of a linear regression. Today it is more commonly seen in nonlinear models, such as binary response models, where correctly modeling the heteroskedasticity is imperative for consistent parameter estimates. However, there does not appear to be a formal proof of point identification for the parameters of the model. This paper presents several examples showing that the conditions presumed throughout the literature are not sufficient for identification. Its major contribution is to provide additional conditions and to establish identification for this pervasive model.
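For readers unfamiliar with the model, a common textbook way of writing multiplicative exponential heteroskedasticity is sketched below. The notation (covariates x_i, variance covariates z_i without a constant, parameters β and γ) is an assumption made here and may differ from the paper's own parameterization.

```latex
% Linear regression: the error variance is an exponential function of z_i
y_i = x_i\beta + u_i, \qquad
\operatorname{Var}(u_i \mid x_i, z_i) = \sigma^2 \exp(2 z_i\gamma)

% Binary response (heteroskedastic probit): the index is rescaled by the
% conditional standard deviation, with the overall scale normalized to one
\Pr(y_i = 1 \mid x_i, z_i) = \Phi\!\left( \frac{x_i\beta}{\exp(z_i\gamma)} \right)
```

In the binary response case β and γ enter only through the ratio inside Φ, which is why identification is less obvious than in the linear model.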
Relaxing Conditional Independence in an Endogenous Semi-parametric Binary Response Model (pdf)
Expanding on the work of Rivers and Vuong (1988), Blundell and Powell (2004), and Rothe (2009), this paper presents a new, flexible conditional maximum likelihood estimator that addresses issues previously ignored in the literature. The estimator follows the standard two-step control function approach to address endogeneity of a continuous random variable, and it is semi-parametric in the standard sense of having a preliminary infinite-dimensional nuisance parameter. Because it relaxes the Conditional Independence assumption previously used for identification, the proposed estimator is more robust in certain respects. For instance, the estimation procedure allows for a parametric specification of heteroskedasticity, which the Blundell and Powell and Rothe estimators can address only in restricted forms. In addition, following the work of Kim and Petrin (working paper), the model allows for a more flexible (although parametrically specified) control function. Standard asymptotic results for the estimator are derived, including consistency, √n-asymptotic normality, and an estimator for the asymptotic variance. Simulation results on parameter estimates, Average Partial Effects estimates, and Average Structural Function estimates are provided for two different specifications. The data generating process for the simulations models the empirical data used in Blundell and Powell (2004) and Rothe (2009) to give some economic context to the results. The paper concludes that there is a trade-off between a flexible specification and a structural interpretation, and that the consequences of assuming Conditional Independence cannot be ignored.