JOURNAL ARTICLE

Improving Performance Sentiment Movie Review Classification Using Hybrid Feature TFIDF, N-Gram, Information Gain and Support Vector Machine

Sutriawan Sutriawan, Muljono Muljono, Khairunnisa Khairunnisa, Zumhur Alamin, Teguh Ansyor Lorosae, Sahrul Ramadhan

Year: 2024
Journal: Mathematical Modelling and Engineering Problems
Vol: 11 (2)
Pages: 375-384
Publisher: International Information and Engineering Association

Abstract

The use of online movie streaming media has increased significantly, particularly among movie enthusiasts. However, fan comments are frequently informal, subjective, and shaped by contexts that reflect personal preferences. A significant challenge in sentiment analysis of movie reviews is therefore how to classify sentiment in text that is often unstructured and subjective. This study aims to improve the accuracy of sentiment classification in movie reviews by proposing several methods, including a hybrid TF-IDF+N-Gram model that can extract pertinent information from word and phrase sequences in reviews. Feature selection with Information Gain (IG) is then performed to identify the most informative features for sentiment classification. This strategy seeks to overcome informal language and noise and so improve comprehension of review context. The results showed a marked gain in sentiment classification accuracy: TF-IDF+Bigram+IG achieved 78% accuracy (up 8 percentage points from 70%), and TF-IDF+Trigram+IG achieved 66% accuracy (up 22 percentage points from 44%). With this hybrid model, the study improved sentiment classification accuracy and, in turn, the performance of the Support Vector Machine (SVM) classifier on complex movie reviews.
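As a rough illustration of the Information Gain step described in the abstract, the sketch below scores word bigrams by how much their presence in a review reduces uncertainty about the sentiment label. It is not the authors' implementation: the function names, the whitespace tokenizer, the use of binary presence features, and the toy reviews are all assumptions made for illustration, and the TF-IDF weighting and SVM training stages are omitted.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, n=2):
    """Score each n-gram by IG(class; n-gram present in the document)."""
    doc_terms = [set(ngrams(d, n)) for d in docs]
    base = entropy(labels)
    scores = {}
    for term in set().union(*doc_terms):
        present = [y for terms, y in zip(doc_terms, labels) if term in terms]
        absent = [y for terms, y in zip(doc_terms, labels) if term not in terms]
        p = len(present) / len(labels)
        # IG = H(class) - H(class | term present or absent)
        scores[term] = base - p * entropy(present) - (1 - p) * entropy(absent)
    return scores


# Hypothetical toy reviews, not data from the study.
reviews = [
    "really loved this movie",
    "really loved the cast",
    "did not like this movie",
    "did not like the plot",
]
sentiments = ["pos", "pos", "neg", "neg"]

scores = information_gain(reviews, sentiments, n=2)
# Rank bigrams by descending IG, breaking ties alphabetically.
top = sorted(scores, key=lambda t: (-scores[t], t))[:3]
```

In this toy set, bigrams that occur only in one class (such as "really loved") receive the maximum score, while "this movie", which appears equally in both classes, scores zero. A feature-selection step of this kind would keep only the highest-scoring n-grams before passing the weighted features to a classifier.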

Keywords:
tf–idf; Support vector machine; n-gram; Computer science; Feature (linguistics); Sentiment analysis; Information gain; Information retrieval; Artificial intelligence; Pattern recognition (psychology); Term (time); Physics

Metrics

Cited By: 3
FWCI (Field Weighted Citation Impact): 2.41
Refs: 39
Citation Normalized Percentile: 0.84

Topics

Customer churn and segmentation
Social Sciences →  Business, Management and Accounting →  Marketing
Stock Market Forecasting Methods
Social Sciences →  Decision Sciences →  Management Science and Operations Research
AI and Big Data Applications
Physical Sciences →  Computer Science →  Information Systems