Statistical analysis for the regression model f_β(y | x, z) with missing values in the covariate vector X requires modeling the covariate distribution g(x | z). Existing likelihood methods, including those of Ibrahim (1990), Chen (2004), and Zhao (2005), need either X or Z to be discrete. This article extends the likelihood methods to cases where both X and Z may be continuous. We propose modeling the covariate distribution g(x | z) with a piecewise nonparametric model; a maximum likelihood estimate (MLE) of β can then be computed following the maximum likelihood estimating procedure of Chen (2004) or Zhao (2005). The resulting estimation method is easy to implement, and the asymptotic properties of the MLE follow under certain conditions. Extensive simulation studies for different models indicate that the proposed method is acceptable for practical implementation. A real data example is used to illustrate the method.
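As a rough illustration of the kind of likelihood involved, the sketch below writes the standard observed-data likelihood for a covariate missing at random, followed by one possible reading of a piecewise nonparametric model for g(x | z). The missingness indicator R_i, the strata S_k, the support points x_{kj}, and the masses p_{kj} are notation assumed here for illustration and are not taken from the abstract. The sketch also assumes that X is either fully observed or fully missing for each subject, and the article's actual construction may differ.

```latex
% R_i = 1 if X_i is observed, 0 otherwise (assumed notation).
\[
L(\beta, g) \;=\; \prod_{i=1}^{n}
  \bigl[ f_\beta(y_i \mid x_i, z_i)\, g(x_i \mid z_i) \bigr]^{R_i}
  \Bigl[ \int f_\beta(y_i \mid x, z_i)\, g(x \mid z_i)\, dx \Bigr]^{1 - R_i}
\]
% Assumed piecewise form: partition the range of Z into strata S_1, ..., S_K and,
% within stratum S_k, place mass p_{kj} on the observed values x_{kj}.
\[
g(x \mid z) \;=\; \sum_{j} p_{kj}\, \mathbf{1}\{x = x_{kj}\}
  \quad \text{for } z \in S_k,
\qquad p_{kj} \ge 0, \quad \sum_{j} p_{kj} = 1 .
\]
% Under this form the integral for a subject with z_i in S_k reduces to a finite sum:
\[
\int f_\beta(y_i \mid x, z_i)\, g(x \mid z_i)\, dx
  \;=\; \sum_{j} f_\beta(y_i \mid x_{kj}, z_i)\, p_{kj} .
\]
```

Under these assumptions the likelihood has the same structure within each stratum as in the discrete-covariate setting, which is consistent with the abstract's statement that the procedures of Chen (2004) or Zhao (2005) can be followed.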