JOURNAL ARTICLE

Computational Methods for Inferring Transcription Factor Binding Sites

Vyacheslav Morozov

Year: 2012 Journal: uO Research (University of Ottawa) Publisher: University of Ottawa

Abstract

Position weight matrices (PWMs) have become a tool of choice for identifying transcription factor binding sites in DNA sequences. PWMs are compiled from experimentally verified, aligned binding sequences and are then used to computationally discover novel putative binding sites for a given protein. DNA-binding proteins often show degeneracy in their binding requirements, and the overall binding specificity of many proteins is unknown and remains an active area of research. Although PWMs are more reliable predictors than consensus string matching, they generally produce a high number of false positive hits. A previous study introduced a novel method of PWM training that uses the known motifs to sample additional putative binding sites from the proximal promoter area. In this thesis, the core idea was further developed, implemented and tested in a large-scale application. Improved mono- and dinucleotide PWMs were computed for Drosophila melanogaster. The Matthews correlation coefficient was used as the optimization criterion in the PWM refinement algorithm. The new PWMs account for non-uniform background nucleotide distributions in the promoters and consider a larger number of new binding sites during the refinement steps. The optimization covered the PWM motif length, the position on the promoter, the threshold value and the binding site location. Predictions from the mono- and dinucleotide PWM versions were compared with those of the initial matrices and of conventional tools. The optimized PWMs predicted new binding sites with better accuracy than conventional PWMs.

Keywords:
Transcription factor; Computer science; Computational biology; DNA binding site; Biology; Genetics; Promoter; Gene; Gene expression

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 81
Citation Normalized Percentile: 0.27

Topics

Genomics and Chromatin Dynamics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
RNA and protein synthesis mechanisms
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
RNA Research and Splicing
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology

Related Documents

JOURNAL ARTICLE

Deep learning for inferring transcription factor binding sites

Peter K. Koo, Matt Ploenzke

Journal: Current Opinion in Systems Biology Year: 2020 Vol: 19 Pages: 16-23
JOURNAL ARTICLE

Inferring gene correlation networks from transcription factor binding sites

Ghasem Mahdevar, Abbas Nowzari-Dalini, Mehdi Sadeghi

Journal: Genes & Genetic Systems Year: 2013 Vol: 88 (5) Pages: 301-309
JOURNAL ARTICLE

Computational study of transcription factor binding sites

Romain Groux

Journal: Infoscience (Ecole Polytechnique Fédérale de Lausanne) Year: 2020