JOURNAL ARTICLE

Solving document clustering problem through meta heuristic algorithm

Abstract

This paper proposes a soft computing approach to the document clustering problem. Document clustering is a specialized clustering problem in which textual documents are autonomously segregated into a number of identifiable, subject-homogeneous, smaller sub-collections (also called clusters). Identifying implicit textual patterns within the documents is challenging because there can be thousands of such textual features. Partitional clustering algorithms such as k-means are mainly used for this problem. The k-means algorithm has several drawbacks: (i) it depends on the initial seeds, and (ii) it can become trapped in a locally optimal solution. Even so, every k-means solution may contain some good partial arrangements for clustering. Meta-heuristic algorithms such as Black Hole (BH) use a trade-off between randomization and local search to find optimal or near-optimal solutions. Our motivation comes from the observation that meta-heuristic optimization can quickly produce a globally optimal solution starting from random k-means initial solutions. The contributions of this research are (i) an implementation of the black hole algorithm with k-means embedding, and (ii) the use of global search and local search behavior for parameter adjustment. A series of experiments is performed with the proposed method on standard text mining datasets, namely (i) NEWS20, (ii) Reuters, and (iii) WebKB, and the results are evaluated using Purity and the Silhouette Index. In comparison, the proposed method outperforms basic k-means and GA with k-means embedding, and converges quickly to a global or near-global optimal solution.
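The abstract does not spell out how k-means is embedded in the black hole algorithm, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each star is a set of k centroids, fitness is the sum of squared errors (SSE), each star is pulled toward the best star (the black hole) and then refined by one k-means iteration, and stars that cross the event horizon are replaced by fresh random k-means starts. All function names and parameters (`black_hole_kmeans`, `n_stars`, `n_iter`, and so on) are hypothetical. In practice the points would be document vectors such as TF-IDF rows.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centroids):
    """Fitness: sum of squared distances to the nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans_step(points, centroids):
    """One k-means iteration: assign points, then recompute each centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        j = min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        groups[j].append(p)
    # An empty cluster keeps its previous centroid.
    return [
        [sum(col) / len(g) for col in zip(*g)] if g else list(c)
        for g, c in zip(groups, centroids)
    ]

def random_star(points, k, rng):
    """A candidate solution: k random data points, refined by one k-means step."""
    return kmeans_step(points, rng.sample(points, k))

def best_index(fit):
    return min(range(len(fit)), key=fit.__getitem__)

def black_hole_kmeans(points, k, n_stars=10, n_iter=30, seed=0):
    """Hypothetical black hole search over centroid sets with k-means refinement."""
    rng = random.Random(seed)
    stars = [random_star(points, k, rng) for _ in range(n_stars)]
    fit = [sse(points, s) for s in stars]
    bh = best_index(fit)  # the black hole is the best star so far
    for _ in range(n_iter):
        for i in range(n_stars):
            if i == bh:
                continue
            # Global search: move the star a random fraction toward the black hole.
            # Simplification: centroids are matched by index, not by similarity.
            r = rng.random()
            moved = [[x + r * (b - x) for x, b in zip(c, cb)]
                     for c, cb in zip(stars[i], stars[bh])]
            # Local search: one k-means refinement of the moved star.
            stars[i] = kmeans_step(points, moved)
            fit[i] = sse(points, stars[i])
        bh = best_index(fit)
        # Event horizon radius: black hole fitness over total fitness.
        total = sum(fit)
        radius = fit[bh] / total if total > 0 else 0.0
        for i in range(n_stars):
            if i == bh:
                continue
            gap = sum(sq_dist(c, cb) for c, cb in zip(stars[i], stars[bh])) ** 0.5
            if gap < radius:
                # The star is absorbed; replace it with a fresh random start.
                stars[i] = random_star(points, k, rng)
                fit[i] = sse(points, stars[i])
        bh = best_index(fit)
    return stars[bh], fit[bh]
```

Because the black hole itself is never moved or replaced within an iteration, the best SSE in this sketch cannot get worse from one iteration to the next. A call such as `black_hole_kmeans(vectors, k=3)` returns the best centroid set and its SSE.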

Keywords:
Computer science, Cluster analysis, Meta heuristic, Heuristic, Algorithm, Artificial intelligence

Metrics

Cited By: 4
FWCI (Field Weighted Citation Impact): 0.45
Refs: 16
Citation Normalized Percentile: 0.69

Topics

Data Mining Algorithms and Applications
Physical Sciences →  Computer Science →  Information Systems
Topic Modeling
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence

Related Documents

JOURNAL ARTICLE

About heuristic algorithm for Correlation Clustering problem solving

Soldatenko, A.A.; Semenova, D.V.; Ibragimova, E.I.

Journal: Russian Agency for Digital Standardization, Year: 2023

JOURNAL ARTICLE

A meta-heuristic method for solving scheduling problem: crow search algorithm

Antono Adhi, Budi Santosa, Nurhadi Siswanto

Journal: IOP Conference Series Materials Science and Engineering, Year: 2018, Vol: 337, Pages: 012003-012003

JOURNAL ARTICLE

An efficient meta-heuristic algorithm for solving capacitated vehicle routing problem

Alfian Faiz, Subiyanto Subiyanto, Ulfah Mediaty Arief

Journal: International Journal of Advances in Intelligent Informatics, Year: 2018, Vol: 4 (3), Pages: 212-212

JOURNAL ARTICLE

Solving the vehicle routing problem by a hybrid meta-heuristic algorithm

Majid Yousefikhoshbakht, Esmaile Khorram

Journal: Journal of industrial engineering international, Year: 2012, Vol: 8 (1)