Deconfounded and debiased estimation for high-dimensional linear regression under hidden confounding with application to omics data

Zhaoyang Li; Yahang Liu; Kecheng Wei; Yongfu Yu; Guoyou Qin; Zhongyi Zhu

doi:10.1093/bioinformatics/btaf400

JOURNAL ARTICLE

Year: 2025 Journal: Bioinformatics Vol: 41 (7) Publisher: Oxford University Press

Abstract

Motivation: A critical challenge in observational studies arises from the presence of hidden confounders in high-dimensional data. This leads to biases in causal effect estimation due to both hidden confounding and high-dimensional estimation. Some classical deconfounding methods are inadequate for high-dimensional scenarios and typically require prior information on hidden confounders. We propose a two-step deconfounded and debiased estimation for high-dimensional linear regression with hidden confounding.

Results: First, we reduce hidden confounding via spectral transformation. Second, we correct bias from the weighted ℓ1 penalty, commonly used in high-dimensional estimation, by inverting the Karush–Kuhn–Tucker conditions and solving convex optimization programs. This deconfounding technique by spectral transformation requires no prior knowledge of hidden confounders. This novel debiasing approach improves over recent work by not assuming a sparse precision matrix, making it more suitable for cases with intrinsic covariate correlations. Simulations show that the proposed method corrects both biases and provides more precise coefficient estimates than existing approaches. We also apply the proposed method to a deoxyribonucleic acid methylation dataset from the Alzheimer’s disease (AD) neuroimaging initiative database to investigate the association between cerebrospinal fluid tau protein levels and AD severity.

Availability and implementation: The code for the proposed method is available on GitHub (https://github.com/Li-Zhaoy/Dec-Deb.git) and archived on Zenodo (DOI: 10.5281/zenodo.15478745).
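The first step described in the abstract (a spectral transformation followed by an ℓ1-penalized fit) can be illustrated with a minimal sketch. This is not the authors' implementation, which is in the linked repository. The abstract does not say which spectral transformation is used, so the sketch assumes the trim transform, which caps the singular values of the design matrix at their median. It also substitutes a plain (unweighted) ℓ1 penalty solved by ISTA for the weighted penalty, and it omits the debiasing step entirely. The function names `trim_transform` and `lasso_ista`, the simulated data, and the value of `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def trim_transform(X):
    """Assumed spectral transform: cap the singular values of X at their median.

    Returns the n x n matrix F such that F @ X has singular values min(d_i, tau),
    where tau is the median singular value of X.
    """
    U, d, _ = np.linalg.svd(X, full_matrices=False)
    tau = np.median(d)
    # Guard against division by zero for (numerically) zero singular values.
    scale = np.minimum(d, tau) / np.maximum(d, 1e-12)
    # F = I - U diag(1 - scale) U^T acts as the identity outside the column space of U.
    return np.eye(X.shape[0]) - (U * (1.0 - scale)) @ U.T

def soft_threshold(z, t):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=300):
    """Plain lasso via ISTA: minimize (1/(2n))||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        b = soft_threshold(b - grad / L, lam / L)
    return b

# Simulated data with hidden confounders H (illustrative values only).
rng = np.random.default_rng(0)
n, p, q = 50, 100, 3
H = rng.normal(size=(n, q))                    # hidden confounders
X = H @ rng.normal(size=(q, p)) + rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:5] = 1.0                                 # sparse true coefficients
y = X @ beta + H @ rng.normal(size=q) + 0.5 * rng.normal(size=n)

# Step 1 of the sketch: transform both X and y, then fit the penalized regression.
F = trim_transform(X)
beta_hat = lasso_ista(F @ X, F @ y, lam=0.05)  # lam is an arbitrary choice
```

Because dense confounding tends to inflate the leading singular values of X, capping them is one way to damp the confounders' contribution before fitting. The resulting `beta_hat` would still carry the shrinkage bias of the ℓ1 penalty, which the second step of the paper is designed to correct.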

Keywords:

Confounding; Covariate; Computer science; Regression; Statistics; Artificial intelligence; Data mining; Algorithm; Mathematics; Machine learning

Topics

Statistical Methods and Inference

Physical Sciences → Mathematics → Statistics and Probability

Advanced Causal Inference Techniques

Physical Sciences → Mathematics → Statistics and Probability

Statistical Methods and Bayesian Inference

Physical Sciences → Mathematics → Statistics and Probability

Related Documents

Doubly debiased lasso: High-dimensional inference under hidden confounding

Distributed debiased estimation of high-dimensional partially linear models with jumps

Sparse and debiased lasso estimation and inference for high-dimensional composite quantile regression with distributed data

Two-Stage Online Debiased Lasso Estimation and Inference for High-Dimensional Quantile Regression with Streaming Data

Linear Deconfounded Score Method: Scoring DAGs With Dense Unobserved Confounding