JOURNAL ARTICLE

Imputation and Variable Selection in Linear Regression Models with Missing Covariates

Xiaowei Yang, Thomas R. Belin, W. John Boscardin

Year: 2005   Journal: Biometrics   Vol: 61 (2)   Pages: 498-506   Publisher: Oxford University Press

Abstract

Across multiply imputed data sets, variable selection methods such as stepwise regression and other criterion‐based strategies that include or exclude particular variables typically result in models with different selected predictors, thus presenting a problem for combining the results from separate complete‐data analyses. Here, drawing on a Bayesian framework, we propose two alternative strategies to address the problem of choosing among linear regression models when there are missing covariates. One approach, which we call “impute, then select” (ITS), involves initially performing multiple imputation and then applying Bayesian variable selection to the multiply imputed data sets. A second strategy is to conduct Bayesian variable selection and missing data imputation simultaneously within one Gibbs sampling process, which we call “simultaneously impute and select” (SIAS). The methods are implemented and evaluated using the Bayesian procedure known as stochastic search variable selection for multivariate normal data sets, but both strategies offer general frameworks within which different Bayesian variable selection algorithms could be used for other types of data sets. A study of mental health services utilization among children in foster care programs is used to illustrate the techniques. Simulation studies show that both ITS and SIAS outperform complete‐case analysis with stepwise variable selection and that SIAS slightly outperforms ITS.
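
The "impute, then select" idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (ssvs, impute_once, impute_then_select), the prior settings (tau, c, p, a, b), the single-draw imputation from a normal model fitted to complete cases, and the pooling of inclusion frequencies by simple averaging are all assumptions made for illustration. The sketch also assumes the response is fully observed and only covariates are missing.

```python
import numpy as np

def ssvs(X, y, n_iter=2000, burn=500, tau=0.05, c=50.0, p=0.5, a=1.0, b=1.0, rng=None):
    """Stochastic search variable selection for centered y = X beta + noise.
    Returns the posterior inclusion frequency of each column of X."""
    rng = np.random.default_rng() if rng is None else rng
    n, k = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    gamma, sigma2, keep = np.ones(k, dtype=int), 1.0, np.zeros(k)
    for it in range(n_iter):
        # beta | gamma, sigma2, y: normal with precision X'X/sigma2 + D^{-1}
        v = np.where(gamma == 1, (c * tau) ** 2, tau ** 2)
        cov = np.linalg.inv(XtX / sigma2 + np.diag(1.0 / v))
        cov = (cov + cov.T) / 2  # guard against round-off asymmetry
        beta = rng.multivariate_normal(cov @ Xty / sigma2, cov)
        # sigma2 | beta, y: inverse gamma, drawn as 1 / Gamma(shape, scale)
        rss = np.sum((y - X @ beta) ** 2)
        sigma2 = 1.0 / rng.gamma(a + n / 2, 1.0 / (b + rss / 2))
        # gamma_j | beta_j: Bernoulli from the slab-versus-spike density ratio
        slab = p * np.exp(-0.5 * beta ** 2 / (c * tau) ** 2) / (c * tau)
        spike = (1 - p) * np.exp(-0.5 * beta ** 2 / tau ** 2) / tau
        gamma = (rng.random(k) < slab / (slab + spike)).astype(int)
        if it >= burn:
            keep += gamma
    return keep / (n_iter - burn)

def impute_once(Z, rng):
    """Fill NaNs with one draw from the conditional normal given the observed
    entries of each row. Mean and covariance come from complete cases, which
    ignores parameter uncertainty (a simplification of proper imputation)."""
    Z = Z.copy()
    cc = Z[~np.isnan(Z).any(axis=1)]
    mu, S = cc.mean(axis=0), np.cov(cc, rowvar=False)
    for i in np.flatnonzero(np.isnan(Z).any(axis=1)):
        m = np.isnan(Z[i])
        o = ~m
        B = S[np.ix_(m, o)] @ np.linalg.inv(S[np.ix_(o, o)])
        cond_mu = mu[m] + B @ (Z[i, o] - mu[o])
        cond_S = S[np.ix_(m, m)] - B @ S[np.ix_(o, m)]
        Z[i, m] = rng.multivariate_normal(cond_mu, (cond_S + cond_S.T) / 2)
    return Z

def impute_then_select(X, y, M=5, seed=0, **ssvs_kwargs):
    """Impute M times, run SSVS on each completed data set, average the results."""
    rng = np.random.default_rng(seed)
    Z = np.column_stack([y, X])
    probs = []
    for _ in range(M):
        Zi = impute_once(Z, rng)
        Zi = Zi - Zi.mean(axis=0)  # center so that no intercept is needed
        probs.append(ssvs(Zi[:, 1:], Zi[:, 0], rng=rng, **ssvs_kwargs))
    return np.mean(probs, axis=0)
```

Under the same assumptions, a SIAS-style sampler would move the imputation step inside the Gibbs loop, redrawing the missing covariates at every iteration given the current parameter draws, rather than fixing the imputations before selection begins.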

Keywords:
Covariate; Imputation (statistics); Missing data; Statistics; Linear regression; Feature selection; Regression; Regression analysis; Econometrics; Computer science; Mathematics; Artificial intelligence

Metrics

Cited By: 74
FWCI (Field Weighted Citation Impact): 2.79
Refs: 52
Citation Normalized Percentile: 0.90

Topics

Statistical Methods and Inference
Physical Sciences →  Mathematics →  Statistics and Probability
Bayesian Methods and Mixture Models
Physical Sciences →  Computer Science →  Artificial Intelligence
Statistical Methods and Bayesian Inference
Physical Sciences →  Mathematics →  Statistics and Probability