JOURNAL ARTICLE

Bayesian inference of the gene expression states of single cells from scRNA-seq data

Breda, Jérémie; Zavolan, Mihaela; van Nimwegen, Erik

Year: 2019 Journal:   Zenodo (CERN European Organization for Nuclear Research)   Publisher: European Organization for Nuclear Research

Abstract

In spite of a large investment in the development of methodologies for analysis of single-cell RNA-seq data, there is still little agreement on how to best normalize such data, i.e. how to quantify gene expression states of single cells from such data. Starting from a few basic requirements, such as that inferred expression states should correct for both intrinsic biological fluctuations and measurement noise, and that changes in expression state should be measured in terms of fold-changes rather than changes in absolute levels, we derive a unique Bayesian procedure for normalizing single-cell RNA-seq data from first principles. Our implementation of this normalization procedure, called Sanity (SAmpling Noise corrected Inference of Transcription activitY), estimates log expression values and associated error bars directly from raw UMI counts without any tunable parameters.

Comparison of Sanity with other recent normalization methods on a selection of scRNA-seq datasets shows that Sanity outperforms other methods on basic downstream processing tasks such as finding the nearest neighbors of each cell, clustering cells into subtypes, and identifying differentially expressed genes. More importantly, we show that all other normalization methods present severely distorted pictures of the data. By failing to account for biological and technical Poisson noise, many methods systematically predict the lowest expressed genes to be most variable in expression, whereas in reality these genes provide the least evidence of true biological variability. In addition, by confounding noise removal with lower-dimensional representation of the data, many methods introduce strong spurious correlations of expression levels with the total UMI count of each cell, as well as spurious co-expression of genes.
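
The point about Poisson noise can be illustrated with a toy simulation. The sketch below is not the Sanity procedure; it only assumes that each gene has exactly the same true expression rate in every cell and that observed UMI counts are Poisson samples of that rate. The function name `apparent_cv`, the number of cells, and the chosen mean counts are illustrative assumptions. Even with zero true biological variability, the apparent coefficient of variation of the counts scales roughly as one over the square root of the mean, so the lowest expressed genes look the most variable.

```python
import numpy as np

def apparent_cv(mean_counts, n_cells=20000, seed=0):
    """Coefficient of variation of Poisson-sampled counts per gene.

    Assumption: every cell has the same true rate for a given gene,
    so all observed variation comes from sampling noise alone.
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(mean_counts, dtype=float)
    # One row per cell, one column per gene; lam broadcasts across rows.
    counts = rng.poisson(lam=means, size=(n_cells, means.size))
    return counts.std(axis=0) / counts.mean(axis=0)

# Hypothetical genes spanning low to high mean UMI counts per cell.
means = [0.1, 1.0, 10.0, 100.0]
cv = apparent_cv(means)

for m, c in zip(means, cv):
    expected = 1.0 / np.sqrt(m)  # CV of a Poisson variable with mean m
    print(f"mean UMI count {m:6.1f}: apparent CV {c:.2f}, Poisson expectation {expected:.2f}")
```

A normalization that reports this sampling-driven spread as expression variability would rank the weakly expressed genes as most variable, which is the distortion the abstract describes.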

Keywords:
Spurious relationship; Normalization (sociology); Bayesian probability; Expression (computer science); Inference; Pattern recognition (psychology); Cluster analysis; Bayes' theorem; Gene expression profiling

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 0
Citation Normalized Percentile: 0.34

Topics

Single-cell and spatial transcriptomics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Microfluidic and Bio-sensing Technologies
Physical Sciences →  Engineering →  Biomedical Engineering
Cell Image Analysis Techniques
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Biophysics

Related Documents

JOURNAL ARTICLE

Bayesian inference of the gene expression states of single cells from scRNA-seq data

Jérémie Breda, Mihaela Zavolan, Erik van Nimwegen

Journal:   Zenodo (CERN European Organization for Nuclear Research) Year: 2019
JOURNAL ARTICLE

Bayesian inference of the gene expression states of single cells from scRNA-seq data - Datasets

Breda, Jérémie; Zavolan, Mihaela; Nimwegen, Erik Van

Journal:   Zenodo (CERN European Organization for Nuclear Research) Year: 2020