JOURNAL ARTICLE

CHI-SQUARE AND INFORMATION GAIN FEATURE SELECTION FOR HOTEL REVIEW SENTIMENT ANALYSIS USING SUPPORT VECTOR MACHINE

Nathanael Karunia, Yonathan Purbo Santosa

Year: 2024 Journal: Proxies Jurnal Informatika Vol: 5 (2) Pages: 146-161 Publisher: Soegijapranata Catholic University

Abstract

Booking tickets online through websites and mobile applications has become common, whether for transportation such as flights, vacations such as tours, or lodging such as hotels. Choosing a good hotel depends on reviews from people who have booked it. Reviews written by visitors to a booking site or mobile application can be analyzed to produce useful output, and one applicable analytical approach is sentiment analysis. The purpose of this study is to find the best method for sentiment classification with respect to the data preprocessing stage, specifically the choice of feature selection. The classification algorithm is the Support Vector Machine, evaluated under three feature selection settings: no feature selection, chi-square feature selection, and information gain feature selection. The process consists of data collection, preprocessing, feature extraction, feature selection, classification, and accuracy calculation. Accuracy was calculated from the confusion matrix and used to identify the best of the three settings. Chi-square feature selection gave the highest average accuracy, 86.68%, followed by information gain feature selection at 85.78% and no feature selection at 85.24%. From these results, it can be concluded that chi-square feature selection is the best of the three methods for sentiment analysis of hotel reviews.
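The abstract does not give the scoring formulas, so the following is only a minimal sketch using the standard textbook definitions of chi-square and information gain for a single term against a binary sentiment label, plus accuracy from a confusion matrix. The function names and the review counts are hypothetical and are not taken from the paper; the Support Vector Machine training step itself is not shown.

```python
import math


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: positive reviews containing the term
    n10: negative reviews containing the term
    n01: positive reviews without the term
    n00: negative reviews without the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0  # a row or column is empty, so the term carries no signal
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(n11, n10, n01, n00):
    """Reduction in class entropy from knowing whether the term is present."""
    n = n11 + n10 + n01 + n00
    h_class = entropy([n11 + n01, n10 + n00])
    present, absent = n11 + n10, n01 + n00
    h_cond = 0.0
    if present:
        h_cond += present / n * entropy([n11, n10])
    if absent:
        h_cond += absent / n * entropy([n01, n00])
    return h_class - h_cond


def accuracy(tp, fp, fn, tn):
    """Accuracy from the four cells of a binary confusion matrix."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical counts (n11, n10, n01, n00) per term, for illustration only.
term_counts = {
    "clean": (40, 5, 10, 45),
    "hotel": (25, 25, 25, 25),
    "dirty": (3, 38, 47, 12),
}

# Rank terms by score; a selector would keep the top-k terms as features.
ranked_by_chi2 = sorted(
    term_counts, key=lambda t: chi_square(*term_counts[t]), reverse=True
)
ranked_by_ig = sorted(
    term_counts, key=lambda t: information_gain(*term_counts[t]), reverse=True
)
```

In this sketch a term that appears equally often in positive and negative reviews scores zero under both measures, so it would be the first to be dropped by either selector before the remaining features are passed to the classifier.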

Keywords:
Feature selection; Computer science; Preprocessor; Sentiment analysis; Confusion matrix; Data pre-processing; Data mining; Feature (linguistics); Support vector machine; Selection (genetic algorithm); Artificial intelligence; Feature extraction; Pattern recognition (psychology); Process (computing)

Metrics

Cited By: 0
FWCI (Field Weighted Citation Impact): 0.00
Refs: 9
Citation Normalized Percentile: 0.23

Topics

Multimedia Learning Systems
Physical Sciences →  Computer Science →  Information Systems
Data Mining and Machine Learning Applications
Physical Sciences →  Computer Science →  Information Systems
Information Retrieval and Data Mining
Physical Sciences →  Computer Science →  Information Systems

Related Documents

JOURNAL ARTICLE

Analysis Sentiment Lumajang Square Review using Support Vector Machine

Maysas Yafi Urrochman

Journal: Journal of Informatics Development Year: 2025 Vol: 3 (2) Pages: 47-57

JOURNAL ARTICLE

Sentiment Analysis of Hotel Reviews Using Support Vector Machine

Alexander Romian Simarmata, Muhammad Zakariyah

Journal: Indonesian Journal of Computer Science Year: 2023 Vol: 12 (5)

JOURNAL ARTICLE

SENTIMENT ANALYSIS USING SUPPORT VECTOR MACHINE BASED ON FEATURE SELECTION AND SEMANTIC ANALYSIS

Dr. Arivoli A, Sonali Pandey

Journal: International Research Journal of Computer Science Year: 2021 Vol: 8 (8) Pages: 209-214