Comparative Performance of Retrieval Augmented Generation Tourism Chatbots

Amar Al Farizi; Primandani Arsi; Pungkas Subarkah

doi:10.21070/ijins.v27i1.1836

JOURNAL ARTICLE

Year: 2026
Journal: Indonesian Journal of Innovation Studies
Vol: 27 (1)
Publisher: Universitas Muhammadiyah Sidoarjo


Abstract

General Background: The rapid adoption of artificial intelligence in smart tourism has increased the use of contextual chatbots to deliver destination information efficiently.

Specific Background: However, tourism chatbots based on Large Language Models frequently encounter information hallucination, reducing reliability when handling dynamic and local tourism data.

Knowledge Gap: Existing studies mainly focus on rule-based or single-model chatbot implementations and provide limited comparative evaluation of Retrieval Augmented Generation configurations combining embedding models and Large Language Models.

Aims: This study aims to comparatively evaluate multiple Retrieval Augmented Generation configurations to identify the most suitable combination for contextual tourism chatbots and to analyze differences between large multilingual and small monolingual embedding models using a local tourism dataset.

Results: Experimental evaluation using data from 49 tourist destinations in Banyumas Regency shows that the Multilingual-E5-Large embedding model consistently achieves perfect Precision, Recall, and F1-Score across all tested Large Language Models. The combination of Multilingual-E5-Large and GPT-4.1-Mini demonstrates the most balanced performance, achieving a BERTScore F1 of 0.7515 with an average response time of 1.555 seconds.

Novelty: This research provides a systematic comparative assessment of embedding capacity and Large Language Model selection within a unified Retrieval Augmented Generation framework for tourism chatbots.

Implications: The findings offer practical guidance for selecting model configurations that ensure accurate retrieval, high-quality responses, and efficient system performance in contextual tourism information services.
Highlights

• Multilingual embedding models deliver consistently higher retrieval accuracy across all tested configurations
• GPT-4.1-Mini produces the most balanced generative quality and response latency
• Embedding model selection plays a more decisive role than language model variation

Keywords: Retrieval Augmented Generation; Tourism Chatbot; Large Language Model; Embedding Model; Comparative Evaluation
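The abstract reports retrieval Precision, Recall, and F1-Score but does not spell out how they were computed. As an illustration only, the sketch below uses the common set-based definitions over retrieved and relevant document identifiers; the function name `retrieval_prf` and the destination identifiers are hypothetical and are not taken from the study.

```python
from typing import Iterable, Tuple


def retrieval_prf(retrieved: Iterable[str], relevant: Iterable[str]) -> Tuple[float, float, float]:
    """Set-based precision, recall, and F1 for a single query.

    Assumption: each retrieved or relevant item is identified by a unique ID.
    This is a generic definition, not necessarily the protocol used in the paper.
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant items that were retrieved

    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean
    return precision, recall, f1


# Hypothetical example: every retrieved chunk is relevant and nothing is missed,
# which corresponds to the "perfect" scores described in the abstract.
perfect = retrieval_prf(["dest_01", "dest_02"], ["dest_01", "dest_02"])

# Hypothetical example: one of two retrieved chunks is relevant.
partial = retrieval_prf(["dest_01", "dest_07"], ["dest_01", "dest_02"])
```

Under these definitions, a configuration scores perfectly only when the retrieved set matches the relevant set exactly for every query. Generation quality (BERTScore F1) and response time are measured separately and are not covered by this sketch.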

Keywords:

Tourism; Selection (genetic algorithm); Language model; Embedding; Focus (optics); Implementation; Chatbot; Quality (philosophy)

Metrics

FWCI (Field Weighted Citation Impact): 0.00

Citation Normalized Percentile: 0.83

Topics

AI in Service Interactions

Physical Sciences → Computer Science → Artificial Intelligence

Information Retrieval and Data Mining

Physical Sciences → Computer Science → Information Systems

Digital Marketing and Social Media

Social Sciences → Social Sciences → Sociology and Political Science
