Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning

Joel Ricci-López; Sergio A. Águila; Michael K. Gilson; Carlos A. Brizuela

doi:10.1021/acs.jcim.1c00511

JOURNAL ARTICLE

Year: 2021; Journal: Journal of Chemical Information and Modeling; Vol: 61 (11); Pages: 5362-5376; Publisher: American Chemical Society

Abstract

One of the main challenges of structure-based virtual screening (SBVS) is the incorporation of the receptor's flexibility, as its explicit representation in every docking run implies a high computational cost. Therefore, a common alternative to include the receptor's flexibility is the approach known as ensemble docking. Ensemble docking consists of using a set of receptor conformations and performing the docking assays over each of them. However, there is still no agreement on how to combine the ensemble docking results to obtain the final ligand ranking. A common choice is to use consensus strategies to aggregate the ensemble docking scores, but these strategies exhibit slight improvement regarding the single-structure approach. Here, we claim that using machine learning (ML) methodologies over the ensemble docking results could improve the predictive power of SBVS. To test this hypothesis, four proteins were selected as study cases: CDK2, FXa, EGFR, and HSP90. Protein conformational ensembles were built from crystallographic structures, whereas the evaluated compound library comprised up to three benchmarking data sets (DUD, DEKOIS 2.0, and CSAR-2012) and cocrystallized molecules. Ensemble docking results were processed through 30 repetitions of 4-fold cross-validation to train and validate two ML classifiers: logistic regression and gradient boosting trees. Our results indicate that the ML classifiers significantly outperform traditional consensus strategies and even the best performance case achieved with single-structure docking. We provide statistical evidence that supports the effectiveness of ML to improve the ensemble docking performance.
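The evaluation protocol described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' code: the docking scores are synthetic, the function names are hypothetical, ROC-AUC is assumed as the metric, and only a hand-rolled logistic regression is shown (gradient boosting trees are omitted for brevity). It contrasts two consensus aggregations of per-conformation scores with a classifier evaluated by repeated stratified 4-fold cross-validation.

```python
import math
import random


def roc_auc(labels, scores):
    """Rank-based ROC-AUC; a higher score means 'more likely active'."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def consensus_scores(X, how="mean"):
    """Aggregate each ligand's scores over all conformations.
    Assumes 'lower docking score is better', so the sign is flipped."""
    agg = {"mean": lambda row: sum(row) / len(row), "best": min}[how]
    return [-agg(row) for row in X]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, lr=0.1, epochs=100, l2=0.01):
    """Batch gradient descent for L2-regularised logistic regression."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, row))) - target
            gb += err
            for j in range(d):
                gw[j] += err * row[j]
        w = [wi - lr * (g / n + l2 * wi) for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def column_stats(rows):
    """Per-column mean and standard deviation (std falls back to 1.0 if zero)."""
    cols = list(zip(*rows))
    mu = [sum(c) / len(c) for c in cols]
    sd = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
          for c, m in zip(cols, mu)]
    return mu, sd


def stratified_folds(y, n_splits, rng):
    """Shuffle each class separately and deal its indices round-robin into folds."""
    folds = [[] for _ in range(n_splits)]
    for cls in (0, 1):
        idx = [i for i, t in enumerate(y) if t == cls]
        rng.shuffle(idx)
        for k, i in enumerate(idx):
            folds[k % n_splits].append(i)
    return folds


def repeated_cv_auc(X, y, n_repeats=30, n_splits=4, seed=0):
    """Mean held-out ROC-AUC over n_repeats rounds of stratified k-fold CV."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(n_repeats):
        for test_idx in stratified_folds(y, n_splits, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            # Standardise with training-fold statistics only (no leakage).
            mu, sd = column_stats([X[i] for i in train_idx])
            scale = lambda row: [(v - m) / s for v, m, s in zip(row, mu, sd)]
            w, b = fit_logreg([scale(X[i]) for i in train_idx],
                              [y[i] for i in train_idx])
            preds = [b + sum(wi * xi for wi, xi in zip(w, scale(X[i])))
                     for i in test_idx]
            aucs.append(roc_auc([y[i] for i in test_idx], preds))
    return sum(aucs) / len(aucs)


def make_toy_scores(n_actives=30, n_decoys=90, n_conf=5, seed=0):
    """Synthetic ligands x conformations score matrix (toy assumption:
    actives score about one unit better, but only on even-indexed conformations)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, n in ((1, n_actives), (0, n_decoys)):
        for _ in range(n):
            shift = [-1.0 * label if j % 2 == 0 else 0.0 for j in range(n_conf)]
            X.append([rng.gauss(-7.0 + s, 1.0) for s in shift])
            y.append(label)
    return X, y
```

With `X, y = make_toy_scores()`, the consensus baselines are `roc_auc(y, consensus_scores(X, "mean"))` and `roc_auc(y, consensus_scores(X, "best"))`, while `repeated_cv_auc(X, y)` runs the 30 x 4 = 120 train/validate splits. On real data, each row of `X` would hold one ligand's docking scores against every receptor conformation in the ensemble.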

Keywords:

Docking (animal); Virtual screening; Protein–ligand docking; Computer science; Artificial intelligence; Ensemble learning; Machine learning; Computational biology; Drug discovery; Bioinformatics; Biology

Metrics

FWCI (Field Weighted Citation Impact): 7.20

Refs: 119

Citation Normalized Percentile: 0.97

Topics

Computational Drug Discovery Methods

Physical Sciences → Computer Science → Computational Theory and Mathematics

Machine Learning in Materials Science

Physical Sciences → Materials Science → Materials Chemistry

Microbial Natural Products and Biosynthesis

Health Sciences → Medicine → Pharmacology

Related Documents

Improving Structure-Based Virtual Screening with Ensemble Docking and Machine Learning

Docking Score ML: Target-Specific Machine Learning Models Improving Docking-Based Virtual Screening in 155 Targets

Ensemble Machine Learning Approaches in Molecular Fingerprint based Virtual screening

Boosting Docking-Based Virtual Screening with Deep Learning