JOURNAL ARTICLE

Indian Language Text Representation and Categorization Using Supervised Learning Algorithm

Abstract

The Constitution of India allows each Indian state to choose its own official language for official communication at the state level. The amount of textual data available in electronic form in the various Indian regional languages is growing steadily, so classifying text documents by language is essential. The objective of this work is the representation and categorization of Indian language text documents using text mining techniques. Several text mining techniques, such as the naive Bayes classifier, the k-nearest-neighbor classifier, and decision trees, have been used for text categorization.
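
As an illustration of the kind of supervised categorization the abstract describes, the following is a minimal sketch of a naive Bayes text classifier. It is not the authors' implementation: the abstract does not say how documents are represented, so the character-bigram features, the add-one smoothing, the class and function names, and the three toy training strings are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper class for illustration only.
    """

    def fit(self, documents, labels):
        self.class_doc_counts = Counter(labels)      # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocabulary = set()
        for text, label in zip(documents, labels):
            grams = char_ngrams(text)
            self.feature_counts[label].update(grams)
            self.vocabulary.update(grams)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        grams = char_ngrams(text)
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_doc_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(doc_count / self.total_docs)
            total = sum(self.feature_counts[label].values())
            for gram in grams:
                count = self.feature_counts[label][gram]
                score += math.log((count + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (assumed, not taken from the paper): one short
# greeting-style phrase per language.
train_docs = ["नमस्ते भारत", "ನಮಸ್ಕಾರ ಕನ್ನಡ", "வணக்கம் தமிழ்"]
train_labels = ["Hindi", "Kannada", "Tamil"]

classifier = NaiveBayesTextClassifier().fit(train_docs, train_labels)
print(classifier.predict("भारत"))    # Hindi
print(classifier.predict("ಕನ್ನಡ"))   # Kannada
print(classifier.predict("தமிழ்"))   # Tamil
```

Because these three languages use different scripts, even this tiny model separates them. Languages that share a script would need far more training text and likely richer features, and a k-nearest-neighbor or decision tree classifier could be trained on the same bigram counts.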

Keywords:
Categorization; Computer science; Artificial intelligence; Naive Bayes classifier; Natural language processing; Text categorization; Classifier (UML); Decision tree; Decision tree learning; Representation (politics); Support vector machine

Metrics

Cited By: 15
FWCI (Field Weighted Citation Impact): 1.45
Refs: 11
Citation Normalized Percentile: 0.85

Topics

Text and Document Classification Technologies (Physical Sciences → Computer Science → Artificial Intelligence)
Algorithms and Data Compression (Physical Sciences → Computer Science → Artificial Intelligence)
Web Data Mining and Analysis (Physical Sciences → Computer Science → Information Systems)