JOURNAL ARTICLE

An updated dataset and a structure‐based prediction model for protein–RNA binding affinity

Xu Hong, Xiaoxue Tong, Juan Xie, Pinyu Liu, Xudong Liu, Qi Song, Sen Liu, Shiyong Liu

Year: 2023
Journal: Proteins: Structure, Function, and Bioinformatics
Vol: 91 (9)
Pages: 1245-1253
Publisher: Wiley

Abstract

Understanding the process of protein–RNA interaction is essential for structural biology, and the thermodynamics of binding is an important part of uncovering the interaction mechanism. The regulatory networks between proteins and RNA in organisms are dominated by binding and dissociation in cells. Therefore, determining the binding affinity of protein–RNA complexes can help us understand the regulation mechanism of protein–RNA interaction. Because determining the binding affinity of protein–RNA complexes experimentally is time‐consuming and labor‐intensive, computational methods for predicting it are urgently needed. To develop a binding affinity prediction model, we first update the protein–RNA binding affinity benchmark (PRBAB) dataset, which now includes 145 complexes. Second, we extract structural features from the complex structures, then analyze and select representative features to train the regression model. Third, we randomly select a subset of PRBAB2.0 to fit the experimentally determined protein–RNA binding affinity. Finally, we tested our model on the nonredundant PDBbind dataset, obtaining a Pearson correlation coefficient r = 0.57 and RMSE = 2.51 kcal/mol. The Pearson correlation coefficient reaches 0.7 after removing 5 complex structures with modified residues/nucleotides and metal ions. On ProNAB, 71.60% of the predictions achieve a Pearson correlation coefficient r = 0.61 and RMSE = 1.56 kcal/mol against experimental values.
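The two evaluation metrics reported in the abstract, the Pearson correlation coefficient r and the RMSE in kcal/mol, can be computed from paired predicted and experimental binding free energies. The following is a minimal sketch using only the Python standard library. It is not the authors' code, the function names are hypothetical, and the sample values are made up purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, observed):
    """Root-mean-square error between predictions and observations."""
    n = len(predicted)
    if n != len(observed) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = ((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(sum(squared_errors) / n)


if __name__ == "__main__":
    # Illustrative binding free energies in kcal/mol (invented values,
    # not taken from PRBAB, PDBbind, or ProNAB).
    experimental = [-9.0, -10.5, -8.2, -11.3, -7.6]
    predicted = [-8.5, -10.0, -9.0, -10.8, -8.1]
    print(f"Pearson r = {pearson_r(predicted, experimental):.2f}")
    print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

The two metrics are complementary: r measures how well predictions track the ranking and linear trend of the experimental values, while RMSE measures the absolute error, so a model can score well on one and poorly on the other.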

Keywords:
RNA, Computational biology, RNA-binding protein, Pearson product-moment correlation coefficient, Binding site, Correlation coefficient, Biological system, Chemistry, Linear regression, Computer science, Biochemistry, Mathematics, Biology, Machine learning, Statistics, Gene

Metrics

Cited By: 10
FWCI (Field Weighted Citation Impact): 1.86
Refs: 51
Citation Normalized Percentile: 0.84

Topics

RNA and protein synthesis mechanisms
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
RNA Research and Splicing
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology
Protein Structure and Dynamics
Life Sciences →  Biochemistry, Genetics and Molecular Biology →  Molecular Biology