JOURNAL ARTICLE

Multi-Label Classification of Microblogging Texts Using Convolution Neural Network

Md. Aslam Parwez, Muhammad Abulaish, Jahiruddin

Year: 2019
Journal: IEEE Access
Vol: 7
Pages: 68678-68691
Publisher: Institute of Electrical and Electronics Engineers

Abstract

Microblogging sites contain a huge amount of textual data, and classifying it is an important task in many applications, such as information filtering, user profiling, topical analysis, and content tagging. Traditional machine learning approaches mainly use bag-of-words or n-gram techniques to generate feature vectors as text representations for training classifiers, and they perform considerably well on many text processing tasks. However, because short texts such as tweets contain very few words, these approaches suffer from data sparsity and the curse of dimensionality when features are represented using bag-of-words or n-grams. Recently, feeding feature vectors such as word embeddings into neural networks has yielded remarkable performance gains in text classification and clustering. In this paper, we present several neural network models for multi-label classification of microblogging data. The proposed models are based on convolutional neural network (CNN) architectures that utilize word embeddings pre-trained on generic and domain-specific textual data sources. The word embeddings are used individually and in various combinations through different CNN channels to predict class labels. We also present a comparative analysis of the proposed CNN models against traditional machine learning models and one existing CNN architecture. The proposed models are evaluated on a real Twitter dataset, and the experimental results establish their efficacy in classifying microblogging texts with improved accuracy compared with the traditional machine learning approaches and the existing CNN architecture.
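The multi-channel idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract does not give filter sizes, embedding dimensions, or training details, so everything below is an assumption made for illustration. The embedding tables are random stand-ins for pre-trained generic and domain-specific embeddings, the weights are untrained, and the vocabulary, label names, and helper names (make_embeddings, conv_max_pool, predict) are hypothetical. The sketch shows only the data flow: each channel convolves over its own embedding matrix, max-over-time pooling yields one feature per filter, the channel features are concatenated, and an independent sigmoid per label gives a multi-label output.

```python
import math
import random


def make_embeddings(vocab, dim, seed):
    """Random stand-in for a pre-trained embedding table (assumption, not real vectors)."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for word in vocab}


def make_filters(count, width, dim, rng):
    """Create `count` untrained filters, each spanning `width` words of size `dim`."""
    return [([rng.uniform(-0.5, 0.5) for _ in range(width * dim)], 0.0)
            for _ in range(count)]


def conv_max_pool(matrix, filters, width):
    """Slide each filter over word windows, apply ReLU, keep the maximum activation."""
    pooled = []
    for weights, bias in filters:
        best = 0.0  # ReLU floor: negative activations never exceed this
        for start in range(len(matrix) - width + 1):
            window = [x for row in matrix[start:start + width] for x in row]
            activation = sum(w * x for w, x in zip(weights, window)) + bias
            best = max(best, activation)
        pooled.append(best)
    return pooled


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(tokens, channels, out_weights, out_bias, threshold=0.5):
    """Concatenate pooled features from all channels; one sigmoid per label."""
    features = []
    for embeddings, dim, filters, width in channels:
        # Unknown words map to zero vectors; short texts are zero-padded.
        matrix = [embeddings.get(tok, [0.0] * dim) for tok in tokens]
        matrix += [[0.0] * dim] * max(0, width - len(matrix))
        features.extend(conv_max_pool(matrix, filters, width))
    probs = [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
             for row, b in zip(out_weights, out_bias)]
    return probs, [p >= threshold for p in probs]


# Hypothetical toy setup: two channels with different embedding sources.
rng = random.Random(0)
vocab = ["vaccine", "trial", "election", "results", "match", "tonight"]
label_names = ["health", "politics", "sports"]
generic = make_embeddings(vocab, dim=4, seed=1)  # "generic" channel
domain = make_embeddings(vocab, dim=3, seed=2)   # "domain-specific" channel
channels = [
    (generic, 4, make_filters(5, 2, 4, rng), 2),  # 5 filters over 2-word windows
    (domain, 3, make_filters(5, 3, 3, rng), 3),   # 5 filters over 3-word windows
]
n_features = 10  # 5 pooled features per channel
out_weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)]
               for _ in label_names]
out_bias = [0.0] * len(label_names)

probs, labels = predict(["vaccine", "trial", "results", "tonight"],
                        channels, out_weights, out_bias)
for name, p, on in zip(label_names, probs, labels):
    print(f"{name}: p={p:.3f} predicted={on}")
```

Because each label has its own sigmoid rather than sharing a softmax, any number of labels can be switched on for the same text, which is what distinguishes multi-label from multi-class prediction. With untrained weights the printed probabilities carry no meaning; only the shapes and the flow of data are relevant here.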

Keywords:
Computer science, Microblogging, Artificial intelligence, Convolutional neural network, Word2vec, Machine learning, Social media, Feature (linguistics), Curse of dimensionality, Cluster analysis, Artificial neural network, Word (group theory), Feature learning, Natural language processing

Metrics

Cited By: 55
FWCI (Field Weighted Citation Impact): 5.07
Refs: 43
Citation Normalized Percentile: 0.96

Topics

Text and Document Classification Technologies
Physical Sciences → Computer Science → Artificial Intelligence
Sentiment Analysis and Opinion Mining
Physical Sciences → Computer Science → Artificial Intelligence
Spam and Phishing Detection
Physical Sciences → Computer Science → Information Systems