JOURNAL ARTICLE

Methods for Integrative Analysis of Multi-Omics Data

Abstract

Integrating multi-omics data is an important step toward a better understanding of disease. Currently, multi-omics data are typically analyzed with simple correlation-based methods. However, such methods are poorly suited to some data types, such as count data with overdispersion and excess zeros, unless samples or features are removed or transformations are applied that make interpretation difficult. In our first aim, we use model-based correlations to reveal underlying relationships between metabolites and microbial species and to compare correlation patterns between early childhood caries (ECC) disease groups in the ZOE 2.0 study. Based on simulations, the bivariate zero-inflated negative binomial (BZINB) model-based correlation is, on average, more accurate than Spearman’s rank correlation and Pearson correlation for positively correlated vector pairs typical of microbiome and metabolome data. We use this correlation to identify co-abundant modules in microbiome data through similarity-based clustering, and we identify biologically relevant metabolites and species that are differentially correlated between the two ECC disease groups. In our second aim, we develop a clustering method for multi-omics features across multiple layers (metagenome, metatranscriptome, and metabolome) that identifies co-abundance and co-regulation modules while assuming the directions of regulatory effects. To construct the within-layer affinity matrix, we use a rank-based similarity measure that accounts for zero-inflation. Where there is prior knowledge of regulatory effects, we predict one layer from another using penalized zero-inflated negative binomial regression to infer directed, sparse regulatory effects. Combined with the within-layer affinity matrix, these effects yield a single affinity matrix whose block structure represents the similarities within and between layers, and to which we apply an asymmetric normalized cut-based clustering algorithm.
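As an illustration of the block-structured affinity matrix described for the second aim, the following minimal Python sketch assembles one from two layers. It rests on several assumptions that are not stated in the abstract: the rank-based similarity is taken to be the absolute Spearman correlation with average ranks for ties (so that the many tied zeros share one rank), the directed effects are supplied as a hypothetical coefficient matrix rather than estimated by penalized zero-inflated negative binomial regression, and direction is encoded by leaving the downstream-to-upstream block at zero. The function names and example counts are hypothetical, and the clustering step itself is not shown.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values (e.g. many zeros) share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        avg = (start + end) / 2 + 1  # mean of the 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = avg
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of tie-aware ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def within_layer_affinity(features):
    """features: list of per-feature count vectors (one entry per sample)."""
    p = len(features)
    return [
        [1.0 if i == j else abs(spearman(features[i], features[j])) for j in range(p)]
        for i in range(p)
    ]


def block_affinity(w_up, w_down, effects):
    """Assemble [[w_up, |effects|], [0, w_down]] (assumed layout, asymmetric by design).

    effects[i][j] is a hypothetical directed effect of upstream feature i
    on downstream feature j.
    """
    p, q = len(w_up), len(w_down)
    top = [list(w_up[i]) + [abs(e) for e in effects[i]] for i in range(p)]
    bottom = [[0.0] * p + list(w_down[j]) for j in range(q)]
    return top + bottom


# Hypothetical zero-heavy counts: 2 upstream and 2 downstream features, 6 samples.
upstream = [[0, 0, 3, 5, 0, 8], [0, 1, 4, 6, 0, 9]]
downstream = [[0, 0, 2, 7, 0, 0], [5, 0, 0, 1, 0, 2]]
effects = [[0.8, 0.0], [0.0, 0.0]]  # hypothetical sparse directed coefficients

affinity = block_affinity(
    within_layer_affinity(upstream), within_layer_affinity(downstream), effects
)
for row in affinity:
    print([round(v, 3) for v in row])
```

The resulting matrix is asymmetric whenever any directed effect is nonzero, which is why a clustering step applied to it would need to handle asymmetric affinities.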
We evaluate the accuracy of the method against popular existing methods by simulating zero-inflated negative binomial and rounded lognormal count data that represent multiple omics layers. In our third aim, we predict the abundances of commensal microbial genera from DNA methylation in TCGA tissue samples. CpG methylation sites are selected using regularized regression methods. Further development of this framework could lead to multi-tissue prediction models for microbial taxa, which could then be used to test associations with clinical traits of interest as more data become available.
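To make the feature-selection idea in the third aim concrete, the sketch below shows how an L1-penalized (lasso) regression can select a sparse set of predictors. The abstract says only "regularized regression methods", so the choice of lasso, the coordinate-descent solver, the assumption of a centered continuous response with no intercept, and the toy data are all illustrative assumptions rather than the authors' procedure.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X: list of n rows, each with p predictor values (assumed centered).
    y: list of n responses (assumed centered, so no intercept is fitted).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # an all-zero predictor can never be selected
            # Correlation of predictor j with the residual that excludes its own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Toy data (hypothetical): three orthogonal predictors standing in for CpG sites;
# the response depends strongly on the first and only weakly on the second.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [2.05, -1.95, 1.95, -2.05]

beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0]
print(beta, selected)
```

The penalty sets weak coefficients exactly to zero, so the indices of the nonzero coefficients serve as the selected predictors. Larger values of `lam` select fewer.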

Keywords:
Correlation; Cluster analysis; Overdispersion; Rank correlation; Metabolome; Similarity (geometry); Rank (graph theory); Measure (data warehouse)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.25

Topics

Gut microbiota and health
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Epigenetics and DNA Methylation
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Bioinformatics and Genomic Networks
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology