Ming-Yen Cheng, Toshio Honda, Jialiang Li, Heng Peng
Ultra-high dimensional longitudinal data are increasingly common and the analysis is challenging both theoretically and methodologically. We offer a new automatic procedure for finding a sparse semivarying coefficient model, which is widely accepted for longitudinal data analysis. Our proposed method first reduces the number of covariates to a moderate order by employing a screening procedure, and then identifies both the varying and constant coefficients using a group SCAD estimator, which is subsequently refined by accounting for the within-subject correlation. The screening procedure is based on working independence and B-spline marginal models. Under weaker conditions than those in the literature, we show that with high probability only irrelevant variables will be screened out, and the number of selected variables can be bounded by a moderate order. This allows the desirable sparsity and oracle properties of the subsequent structure identification step. Note that existing methods require some kind of iterative screening in order to achieve this, thus they demand heavy computational effort and consistency is not guaranteed. The refined semivarying coefficient model employs profile least squares, local linear smoothing and nonparametric covariance estimation, and is semiparametric efficient. We also suggest ways to implement the proposed methods, and to select the tuning parameters. An extensive simulation study is summarized to demonstrate its finite sample performance and the yeast cell cycle data is analyzed.
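To make the group SCAD step in the abstract concrete, here is a minimal sketch of the standard SCAD penalty applied to the Euclidean norm of each coefficient group. It is an illustration under stated assumptions, not the authors' estimator: the function names `scad_penalty` and `group_scad_penalty` are hypothetical, the default `a = 3.7` is the conventional choice rather than a value taken from the paper, and the paper's specific grouping of spline coefficients into varying and constant parts is not reproduced.

```python
import math


def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty evaluated at |t|.

    lam is the tuning parameter; a > 2 controls where the penalty levels off.
    a = 3.7 is a conventional default, assumed here for illustration.
    """
    if lam <= 0 or a <= 2:
        raise ValueError("require lam > 0 and a > 2")
    t = abs(t)
    if t <= lam:
        # Lasso-like linear region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large coefficients receive no additional penalty.
    return lam * lam * (a + 1) / 2


def group_scad_penalty(groups, lam, a=3.7):
    """Sum of SCAD penalties on the Euclidean norm of each coefficient group.

    groups is a list of coefficient lists, one list per covariate
    (a hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for coefs in groups:
        norm = math.sqrt(sum(c * c for c in coefs))
        total += scad_penalty(norm, lam, a)
    return total
```

Because the penalty is flat beyond `a * lam`, groups with large norms are not shrunk further, while groups with small norms are pushed toward exactly zero; this is the general mechanism behind SCAD-type sparsity and oracle properties.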
Yong Niu, Riquan Zhang, Jicai Liu, Huapeng Li
Shen Zhang, Peixin Zhao, Gaorong Li, Wangli Xu
Shucong Zhang, Jing Pan, Yong Zhou