Zhaoyang Li, Yahang Liu, Kecheng Wei, Yongfu Yu, Guoyou Qin, Zhongyi Zhu
Abstract

Motivation: A critical challenge in observational studies is the presence of hidden confounders in high-dimensional data. Causal effect estimates are then biased by two sources: the hidden confounding itself and the penalization used in high-dimensional estimation. Some classical deconfounding methods are inadequate in high-dimensional settings and typically require prior information on the hidden confounders. We propose a two-step deconfounded and debiased estimation procedure for high-dimensional linear regression with hidden confounding.

Results: First, we reduce hidden confounding via a spectral transformation, which requires no prior knowledge of the hidden confounders. Second, we correct the bias induced by the weighted ℓ1 penalty, commonly used in high-dimensional estimation, by inverting the Karush–Kuhn–Tucker conditions and solving convex optimization programs. Unlike recent work, this debiasing approach does not assume a sparse precision matrix, which makes it better suited to covariates with intrinsic correlations. Simulations show that the proposed method corrects both biases and provides more precise coefficient estimates than existing approaches. We also apply the proposed method to a deoxyribonucleic acid methylation dataset from the Alzheimer’s disease (AD) neuroimaging initiative database to investigate the association between cerebrospinal fluid tau protein levels and AD severity.

Availability and implementation: The code for the proposed method is available on GitHub (https://github.com/Li-Zhaoy/Dec-Deb.git) and archived on Zenodo (DOI: 10.5281/zenodo.15478745).
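The first step described above can be illustrated with a minimal sketch. The abstract does not say which spectral transformation is used, so this sketch assumes the "trim" transform, one common choice that caps the singular values of the design matrix at their median. The function names `trim_transform` and `lasso_ista`, the simulated data, and the penalty level `lam=0.1` are all hypothetical and are not taken from the authors' repository. A plain (unweighted) lasso solved by proximal gradient descent stands in for the weighted ℓ1 estimator, and the second (debiasing) step is not shown.

```python
import numpy as np


def trim_transform(X):
    """Return F such that F @ X has singular values min(d_i, median(d)).

    Assumption: the 'trim' spectral transformation; the abstract does not
    specify which transformation the authors use.
    """
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    tau = np.median(d)
    scale = np.ones_like(d)
    pos = d > 0
    scale[pos] = np.minimum(d[pos], tau) / d[pos]
    n = X.shape[0]
    # Directions orthogonal to the column space of X are left unchanged.
    return np.eye(n) - U @ np.diag(1.0 - scale) @ U.T


def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, p = X.shape
    beta = np.zeros(p)
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        z = beta - step * grad
        beta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-threshold
    return beta


def demo(seed=0):
    """Simulate hidden confounding and compare lasso fits with and without F."""
    rng = np.random.default_rng(seed)
    n, p, q = 100, 200, 3                      # samples, covariates, hidden confounders
    H = rng.normal(size=(n, q))                # hidden confounders (never observed)
    X = H @ rng.normal(size=(q, p)) + rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:5] = 1.0                             # sparse true coefficients
    y = X @ beta + H @ rng.normal(size=q) + 0.5 * rng.normal(size=n)

    F = trim_transform(X)
    naive = lasso_ista(X, y, lam=0.1)
    deconf = lasso_ista(F @ X, F @ y, lam=0.1)
    print("l1 error, plain lasso:      ", np.abs(naive - beta).sum())
    print("l1 error, transformed lasso:", np.abs(deconf - beta).sum())


if __name__ == "__main__":
    demo()
```

The transformation is applied to both the design matrix and the response before fitting, so no estimate of the confounders themselves is needed. The resulting penalized estimate is still shrunk toward zero, which is the bias the second step of the procedure is meant to correct.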
Zijian Guo, Domagoj Ćevid, Peter Bühlmann
Yan‐Yong Zhao, Yuchun Zhang, Yuan Liu, Noriszura Ismail
Alexis Bellot, Mihaela van der Schaar