Dietmar Zellner, Frieder Keller, Günter E. Zellner
Abstract: In statistical analysis, selecting the independent (predictor) variables in a regression model that might influence the outcome variable is an important task. To address the problems that selection procedures have in identifying these authentic variables, we compare the performance of stepwise selection procedures with a bagging method proposed by Sauerbrei [Sauerbrei, W. (1999). The use of resampling methods to simplify regression models in medical statistics. Appl. Statist. 48:313–329]. Furthermore, the bootstrap method with variable selection from the full logistic regression model was applied. Logistic regression models were used to compare the performance of these selection procedures. Within both the "simple" and the bagging approach, backward, forward, and stepwise selection gave similar results when the same entry/retention criterion was used. Our simulations show better results for small entry and/or retention criteria, in particular when the predictor variables were correlated. The bagging procedures were substantially better than the "simple" stepwise selection procedures. However, problems remain; for instance, the degree of correlation between the predictor variables affects the frequency with which authentic variables find their way into the final model.
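The idea of repeating a selection procedure on bootstrap samples and recording how often each predictor is retained can be illustrated with a minimal sketch. This is not the authors' procedure or Sauerbrei's exact method: it assumes backward elimination with Wald p-values, a retention criterion `alpha`, and simulated data in which the first two predictors are authentic. All function names (`fit_logistic`, `backward_select`, `inclusion_frequencies`) and parameter values are hypothetical.

```python
import math
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit a logistic model by Newton-Raphson; return coefficients and
    two-sided Wald p-values (intercept first). The tiny ridge term only
    guards against a singular Hessian; it is an assumption of this sketch."""
    Xd = np.column_stack([np.ones(len(y)), X])
    eye = ridge * np.eye(Xd.shape[1])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
        hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None]) + eye
        step = np.linalg.solve(hess, Xd.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    # Recompute the information matrix at the final estimate.
    p = 1.0 / (1.0 + np.exp(-np.clip(Xd @ beta, -30, 30)))
    hess = Xd.T @ (Xd * (p * (1.0 - p))[:, None]) + eye
    se = np.sqrt(np.diag(np.linalg.inv(hess)))
    pvals = np.array([math.erfc(abs(z) / math.sqrt(2)) for z in beta / se])
    return beta, pvals

def backward_select(X, y, alpha=0.05):
    """Drop the predictor with the largest p-value until all are <= alpha."""
    active = list(range(X.shape[1]))
    while active:
        _, pvals = fit_logistic(X[:, active], y)
        worst = int(np.argmax(pvals[1:]))  # skip the intercept
        if pvals[1:][worst] <= alpha:
            break
        active.pop(worst)
    return active

def inclusion_frequencies(X, y, n_boot=50, alpha=0.05, seed=0):
    """Share of bootstrap samples in which each predictor is retained."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    counts = np.zeros(k)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        for j in backward_select(X[idx], y[idx], alpha):
            counts[j] += 1
    return counts / n_boot

# Simulated example: predictors 0 and 1 are authentic, 2 to 4 are noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
true_beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0])
y = (rng.random(300) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(float)
freq = inclusion_frequencies(X, y)
print(freq)
```

A predictor could then be kept in the final model if its inclusion frequency exceeds a chosen cutoff; the cutoff, like `alpha`, is a tuning choice this sketch leaves open.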
Yiqing Tian, Howard D. Bondell, Alyson G. Wilson
Shangli Zhang, Lili Zhang, Kuan-Min Qiu, Ying Lü, Baigen Cai
Ming‐Hui Chen, Joseph G. Ibrahim, Constantin T. Yiannoutsos