JOURNAL ARTICLE

Document Clustering Using Graph Based Fuzzy Association Rule Generation

P. Perumal

Year: 2022 Journal: Computer Systems Science and Engineering Vol: 43 (1) Pages: 203-218

Abstract

With the rapid growth of web-based documents, the need for automatic document clustering and text summarization has increased. Document summarization, which involves extracting the essential content, removing unnecessary data, and presenting the result in a cohesive and coherent manner, remains a challenging task. In this research, a novel intelligent model for document clustering is designed using a graph model and fuzzy-based association rule generation (gFAR). First, the graph model maps the relationships among multi-source data; document clusters are then established by generating association rules using fuzzy concepts. The method eliminates redundancy by mapping relevant documents with the graph model, and it reduces time consumption and improves accuracy through fuzzy association rule generation. The framework provides an interpretable approach to document clustering. It iteratively reduces the error rate during relationship mapping among the data (clusters) with the assistance of weighted document content. The model also represents the significance of data features with class discrimination, which helps in measuring feature significance during the clustering process. The simulation is carried out in the MATLAB 2016b environment and evaluated with empirical measures such as Relative Risk Patterns (RRP), ROUGE score, and Discrimination Information Measure (DMI). The DailyMail and DUC 2004 datasets are used to obtain the empirical results. The proposed gFAR model gives a better trade-off than various prevailing approaches.
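The two stages named in the abstract, a graph that relates documents and fuzzy association rules over their content, can be illustrated with a minimal sketch. This is not the paper's gFAR implementation (the abstract reports a MATLAB 2016b simulation and gives no formulas). The sketch assumes cosine similarity over raw term counts for the graph edges, a term's membership equal to its count divided by the largest count in the same document, and the minimum t-norm for fuzzy support. All function names and the 0.3 threshold are hypothetical choices made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def tf_vectors(docs):
    """Bag-of-words term counts for each document (whitespace tokenization)."""
    return [Counter(d.lower().split()) for d in docs]


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_graph(docs, threshold=0.3):
    """Weighted edges (i, j) -> similarity for document pairs at or above
    the threshold. The threshold value is an assumption, not from the paper."""
    vecs = tf_vectors(docs)
    edges = {}
    for i, j in combinations(range(len(docs)), 2):
        s = cosine(vecs[i], vecs[j])
        if s >= threshold:
            edges[(i, j)] = s
    return edges


def fuzzy_memberships(docs):
    """Assumed membership of a term in a document: its count divided by the
    largest count in that document, giving a degree in (0, 1]."""
    result = []
    for vec in tf_vectors(docs):
        top = max(vec.values()) if vec else 1
        result.append({t: c / top for t, c in vec.items()})
    return result


def fuzzy_support(memberships, items):
    """Fuzzy support of an itemset: mean over documents of the minimum
    membership among the items (minimum t-norm)."""
    total = sum(min(m.get(t, 0.0) for t in items) for m in memberships)
    return total / len(memberships)


def fuzzy_confidence(memberships, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent (both lists of terms)."""
    sup_a = fuzzy_support(memberships, antecedent)
    if sup_a == 0:
        return 0.0
    return fuzzy_support(memberships, antecedent + consequent) / sup_a
```

With three toy documents such as "graph fuzzy rule", "graph fuzzy cluster", and "stock market news", only the first two share terms, so `similarity_graph` links them alone, and the rule graph → fuzzy holds in every document that contains "graph". Rules whose support and confidence clear chosen cut-offs could then be used to label or merge connected groups of documents; how gFAR does this is not specified in the abstract.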

Keywords:
Automatic summarization; Computer science; Cluster analysis; Data mining; Fuzzy clustering; Association rule learning; Fuzzy logic; Graph; Redundancy (engineering); Rand index; Artificial intelligence; Machine learning; Theoretical computer science

Metrics

Cited By: 5
FWCI (Field Weighted Citation Impact): 0.98
Refs: 33
Citation Normalized Percentile: 0.74

Topics

Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Clustering Algorithms Research
Physical Sciences →  Computer Science →  Artificial Intelligence
Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems