JOURNAL ARTICLE

Sentiment Polarity Classification using Minimal Feature Vectors and Machine Learning Algorithms

Abstract

Social media users commonly post text comments to express their opinions. These texts can be analyzed and classified as expressing either a positive or a negative attitude. Feature vectors for representing the texts must be designed and prepared before a classifier is built. Generally, a text is represented by a vector of weights or frequencies of the terms that appear in it. The length of this feature vector equals the number of terms in the dictionary derived from the words occurring in all texts. A large dictionary therefore yields high-dimensional text vectors, which in turn lengthens the time needed to train and test text classification models. This paper proposes low-dimensional vectors, called V8D, for representing the texts. These vectors are built from sets of positive and negative words, together with negation words, all of which carry significant meaning. Four machine learning algorithms for the classification problem, namely k-Nearest Neighbors, the Naïve Bayes classifier, Artificial Neural Networks, and the Support Vector Machine, were applied to classify the opinion texts. In experiments on eight data sets from various domains, the proposed V8D vectors were compared with the traditional TF-IDF vector in terms of predictive correctness. The experimental results show that representing text as V8D vectors for opinion text classification provides the best efficiency in both space usage and processing time.
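
The contrast between a dictionary-length vector and a short fixed-length one can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy texts, the three small word lists, and the three-count layout of `lexicon_vector` are hypothetical, and the abstract does not specify how the eight dimensions of V8D are defined, so this is not the V8D construction itself. The TF-IDF weighting used here is the plain `tf * log(N / df)` form.

```python
import math
from collections import Counter

# Toy corpus and word lists: hypothetical, chosen only for illustration.
TEXTS = [
    "the phone is good and the screen is great",
    "the battery is not good",
    "bad service and terrible food",
]
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "terrible"}
NEGATION = {"not", "no", "never"}


def build_dictionary(texts):
    """Return the sorted list of every distinct term in the corpus."""
    return sorted({term for text in texts for term in text.split()})


def tfidf_vector(text, texts, dictionary):
    """Return one weight per dictionary term: tf * log(N / df)."""
    counts = Counter(text.split())
    n_docs = len(texts)
    vector = []
    for term in dictionary:
        # Document frequency: number of texts containing the term.
        df = sum(1 for t in texts if term in t.split())
        vector.append(counts[term] * math.log(n_docs / df))
    return vector


def lexicon_vector(text):
    """Return a fixed-length vector of positive, negative, and negation counts.

    This three-count layout is an illustrative stand-in, not the paper's V8D.
    """
    tokens = text.split()
    return [
        sum(tok in POSITIVE for tok in tokens),
        sum(tok in NEGATIVE for tok in tokens),
        sum(tok in NEGATION for tok in tokens),
    ]


dictionary = build_dictionary(TEXTS)
tfidf = [tfidf_vector(t, TEXTS, dictionary) for t in TEXTS]
compact = [lexicon_vector(t) for t in TEXTS]

# The TF-IDF vector grows with the dictionary; the lexicon vector does not.
print(len(dictionary), len(tfidf[0]), len(compact[0]))
```

Adding more texts to `TEXTS` enlarges `dictionary` and every TF-IDF vector with it, while the lexicon-based vector keeps the same length regardless of corpus size.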

Keywords:
Artificial intelligence; Computer science; Correctness; Feature vector; Support vector machine; Classifier (UML); Naive Bayes classifier; Vector space model; Sentiment analysis; Machine learning; Artificial neural network; Negation; Algorithm; Natural language processing; Pattern recognition (psychology)

Metrics

Cited By: 2
FWCI (Field Weighted Citation Impact): 0.14
Refs: 18
Citation Normalized Percentile: 0.54

Topics

Sentiment Analysis and Opinion Mining
Physical Sciences →  Computer Science →  Artificial Intelligence
Text and Document Classification Technologies
Physical Sciences →  Computer Science →  Artificial Intelligence
Advanced Text Analysis Techniques
Physical Sciences →  Computer Science →  Artificial Intelligence