JOURNAL ARTICLE

VARIABLE SELECTION FOR REGRESSION MODELS WITH MISSING DATA

Abstract

We consider the variable selection problem for a class of statistical models with missing data, including missing covariate and/or response data. We investigate the smoothly clipped absolute deviation (SCAD) penalty and the adaptive LASSO penalty, and propose a unified model selection and estimation procedure for use in the presence of missing data. We develop a computationally attractive algorithm for simultaneously optimizing the penalized likelihood function and estimating the penalty parameters. In particular, we propose a model selection criterion, called the ICQ statistic, for selecting the penalty parameters. We show that the variable selection procedure based on ICQ automatically and consistently selects the important covariates and leads to efficient estimates with oracle properties. The methodology is very general and can be applied to numerous situations involving missing data, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Simulations are given to demonstrate the methodology and examine the finite sample performance of the variable selection procedures. Melanoma data from a cancer clinical trial are presented to illustrate the proposed methodology.
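
For readers unfamiliar with the two penalties named in the abstract, the following is a minimal sketch of their standard forms from the general literature. The abstract does not state the formulas, so the parametrization here (including the conventional default a = 3.7 and the weights 1/|initial estimate|^gamma) is an assumption, and the function names are hypothetical. The sketch covers only the penalty terms; it does not implement the missing-data algorithm or the ICQ criterion.

```python
def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (assumed parametrization).

    lam is the penalty parameter; a > 2 controls where the penalty flattens.
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(beta)
    if t <= lam:
        # Small coefficients: same linear penalty as the LASSO.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no further shrinkage.
    return lam * lam * (a + 1) / 2


def adaptive_lasso_penalty(beta, lam, beta_init, gamma=1.0):
    """Adaptive LASSO penalty: L1 penalty weighted by 1 / |beta_init|^gamma.

    beta_init is an initial (unpenalized) estimate with nonzero entries.
    """
    return lam * sum(abs(b) / abs(b0) ** gamma
                     for b, b0 in zip(beta, beta_init))
```

Both penalties shrink small coefficients toward zero while penalizing large coefficients comparatively lightly, which is the behavior usually associated with the oracle properties the abstract refers to.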

Keywords:
Missing data; Feature selection; Regression; Selection (genetic algorithm); Statistics; Variable (mathematics); Regression analysis; Computer science; Econometrics; Mathematics; Artificial intelligence
