Variable Selection for High-Dimensional Data with Interaction Effects

Li, Leiyue

doi:10.13023/etd.2024.284

DISSERTATION

Year: 2024
University: UKnowledge (University of Kentucky)
Publisher: University of Kentucky

Abstract

For high-dimensional data where the number of variables greatly exceeds the number of observations, selecting important variables while maintaining the required heredity conditions can be challenging. This dissertation is structured into three interconnected parts. In the first part, we propose a variable selection method by implementing a well-known optimization technique, the Genetic Algorithm. An R package was developed to simplify the implementation and usage of the proposed method. We then propose another variable selection method by extending the study from the Genetic Algorithm to a different but related optimization technique, Simulated Annealing. We consider three different hierarchical structures in both studies. We compare the performance and efficiency of the two proposed algorithms using multiple simulation studies. In the last part of the dissertation, a transfer learning-inspired algorithm with a specific focus on studying microbiome-metabolome interactions is proposed. We compare the proposed method with other existing methods in terms of mean squared error, type-I error, and power. An application of this method to real-world data reveals biologically significant interactions between gut microbes and various bile acids.
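
The heredity conditions mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name its three hierarchical structures, so the sketch assumes the common textbook conventions: strong heredity (an interaction may enter only if both parent main effects are selected), weak heredity (at least one parent is selected), and no heredity constraint. The function name `satisfies_heredity` and its arguments are hypothetical and are not taken from the dissertation or its R package.

```python
def satisfies_heredity(main_effects, interactions, kind="strong"):
    """Check whether a candidate model obeys a heredity condition.

    main_effects: iterable of selected variable indices.
    interactions: iterable of (j, k) pairs for selected two-way interactions.
    kind: "strong" (both parents selected), "weak" (at least one parent
          selected), or "none" (no constraint). These three labels are an
          assumption made for illustration.
    """
    required_parents = {"strong": 2, "weak": 1, "none": 0}
    if kind not in required_parents:
        raise ValueError(f"unknown heredity kind: {kind!r}")

    main = set(main_effects)
    for j, k in interactions:
        # Count how many parent main effects of the interaction are selected.
        parents_present = int(j in main) + int(k in main)
        if parents_present < required_parents[kind]:
            return False
    return True


# Candidate model: main effect x0 plus the interaction x0:x1.
candidate_main = {0}
candidate_interactions = {(0, 1)}
strong_ok = satisfies_heredity(candidate_main, candidate_interactions, "strong")
weak_ok = satisfies_heredity(candidate_main, candidate_interactions, "weak")
```

In a stochastic search such as a Genetic Algorithm or Simulated Annealing, a check of this kind could be used to reject or repair candidate models that violate the chosen structure after each mutation or proposal step. Whether the dissertation handles violations this way is not stated in the abstract.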

Keywords: Selection (genetic algorithm), Variable (mathematics), Genetic algorithm, Feature selection, Focus (optics), Variables

Topics

Gut microbiota and health

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Gene expression and cancer classification

Life Sciences → Biochemistry, Genetics and Molecular Biology → Molecular Biology

Statistical Methods and Inference

Physical Sciences → Mathematics → Statistics and Probability
