JOURNAL ARTICLE

Inverse link prediction with graph convolutional networks for knowledge-preserving sparsification in cheminformatics

Elnaz Bangian Tabrizi, Mehrdad Jalali, Mahboobeh Houshmand

Year: 2025
Journal: Journal of Big Data
Vol: 12 (1)
Publisher: Springer Science+Business Media

Abstract

Large-scale cheminformatics datasets, such as those used in drug discovery and materials science, are often represented as dense similarity graphs; however, their complexity hinders scalable analysis and interpretability. We propose a novel Inverse Link Prediction (ILP) framework, powered by Graph Neural Networks (GNNs), for knowledge-preserving graph sparsification, using Metal–Organic Framework (MOF) datasets as a case study. The framework comprises four key components: (1) Graph Convolutional Networks (GCNs) to predict edge importance based on node features, (2) ILP to compute inverse weights identifying redundant edges, (3) dual-weight analysis to integrate initial similarity weights with GCN-derived weights, and (4) modularity optimization to prune edges while preserving community structures and domain knowledge. Validated on MOF similarity graphs, the sparsified graphs maintain structural integrity and support robust performance across both graph-based (GCN, GraphSAGE) and non-graph-based (Gradient Boosting Trees, Logistic Regression, Naïve Bayes, Deep Neural Networks) machine learning models for tasks such as pore limiting diameter prediction. This Inverse Link Prediction with Graph Convolutional Networks (ILP-GCN) framework offers a scalable and interpretable solution for cheminformatics, with broad applications in material discovery and beyond.
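The four components listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the linear dual-weight combination, the parameters `alpha`, `threshold`, and `tol`, and the greedy pruning loop are all assumptions, and the per-edge `importance` scores are hard-coded stand-ins for what a trained GCN would predict. The community assignment is likewise fixed by hand rather than detected.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def modularity(weights: Dict[Edge, float], community: Dict[str, int]) -> float:
    """Weighted Newman modularity of a fixed partition of an undirected graph."""
    m = sum(weights.values())
    if m == 0:
        return 0.0
    intra: Dict[int, float] = {}   # total edge weight inside each community
    degree: Dict[int, float] = {}  # summed weighted degree of each community
    for (u, v), w in weights.items():
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0.0) + w
        degree[cv] = degree.get(cv, 0.0) + w
        if cu == cv:
            intra[cu] = intra.get(cu, 0.0) + w
    return sum(intra.get(c, 0.0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


def redundancy(sim: float, importance: float, alpha: float = 0.5) -> float:
    """Hypothetical dual-weight score: high when an edge looks redundant.

    The inverse weight (1 - importance) plays the role of the ILP score; the
    linear mix with (1 - sim) is an assumed form, not taken from the paper.
    """
    inverse_weight = 1.0 - importance
    return alpha * inverse_weight + (1.0 - alpha) * (1.0 - sim)


def sparsify(
    sim: Dict[Edge, float],
    importance: Dict[Edge, float],
    community: Dict[str, int],
    alpha: float = 0.5,
    threshold: float = 0.5,
    tol: float = 0.0,
) -> Dict[Edge, float]:
    """Greedily drop the most redundant edges unless modularity would fall by more than tol."""
    kept = dict(sim)
    scores = {e: redundancy(sim[e], importance[e], alpha) for e in sim}
    for edge in sorted(scores, key=scores.get, reverse=True):
        if scores[edge] < threshold:
            break  # remaining edges are not redundant enough to consider
        baseline = modularity(kept, community)
        trial = {e: w for e, w in kept.items() if e != edge}
        if modularity(trial, community) >= baseline - tol:
            kept = trial
    return kept


# Toy similarity graph: two triangles joined by two weak cross-community edges.
similarity = {
    ("a", "b"): 0.90, ("b", "c"): 0.80, ("a", "c"): 0.85,
    ("d", "e"): 0.90, ("e", "f"): 0.80, ("d", "f"): 0.90,
    ("c", "d"): 0.30, ("a", "f"): 0.20,
}
# Stand-in for GCN-predicted edge importance (assumed values, not model output).
predicted_importance = {
    ("a", "b"): 0.90, ("b", "c"): 0.90, ("a", "c"): 0.80,
    ("d", "e"): 0.90, ("e", "f"): 0.85, ("d", "f"): 0.80,
    ("c", "d"): 0.10, ("a", "f"): 0.20,
}
communities = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

sparsified = sparsify(similarity, predicted_importance, communities)
print(sorted(sparsified))
print(modularity(similarity, communities), modularity(sparsified, communities))
```

On this toy graph the two cross-community edges score as redundant and are pruned, while the six intra-community edges fall below the threshold and are kept, so the modularity of the fixed partition does not decrease. A real pipeline would replace the hard-coded scores with predictions from a trained model and would likely re-detect communities rather than fix them in advance.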

Keywords:
Cheminformatics, Computer science, Scalability, Interpretability, Graph, Machine learning, Theoretical computer science, Artificial intelligence, Convolutional neural network, Inverse, Data mining, Mathematics, Bioinformatics

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 3.17
Refs: 41
Citation Normalized Percentile: 0.82

Topics

Machine Learning in Materials Science
Physical Sciences → Materials Science → Materials Chemistry
Metal-Organic Frameworks: Synthesis and Applications
Physical Sciences → Chemistry → Inorganic Chemistry
Advanced Graph Neural Networks
Physical Sciences → Computer Science → Artificial Intelligence