JOURNAL ARTICLE

Dynamic Sign Language Recognition in Bahasa using MediaPipe, Long Short-Term Memory, and Convolutional Neural Network

Ivana Valentina Lemmuela, Mewati Ayub, Oscar Karnalim

Year: 2025
Journal: Journal of Information Systems Engineering and Business Intelligence
Vol: 11 (1), Pages: 17-29
Publisher: Airlangga University

Abstract

Background: Communication is important for everyone, including individuals with hearing and speech impairments. For this demographic, sign language is the primary medium of communication, both with others who share similar conditions and with hearing individuals who understand sign language. However, communication difficulties arise when individuals with these impairments interact with those who do not understand sign language.

Objective: This research aims to develop models capable of recognizing sign language movements in Bahasa and converting the detected gestures into corresponding words, with a focus on vocabulary related to religious activities. Specifically, the research examined dynamic sign language in Bahasa, which comprises gestures that require motion for proper demonstration.

Methods: In accordance with the research objective, the sign language recognition models were developed using a MediaPipe-assisted landmark extraction process. Recognition of dynamic sign language in Bahasa was achieved through the application of Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) methods.

Results: The model developed with a bidirectional LSTM showed the best result, with a testing accuracy of 100%, whereas the best result for the CNN alone was 86.67%. Integrating CNN with LSTM improved performance compared to CNN alone, with the best CNN-LSTM model achieving an accuracy of 95.24%.

Conclusion: The bidirectional LSTM model outperformed the unidirectional LSTM by capturing richer temporal information, considering both past and future time steps. CNN alone could not match the effectiveness of the bidirectional LSTM, but combining CNN with LSTM produced better results. Normalized landmark data was also found to significantly improve accuracy. Accuracy was further influenced by shot-type variability and the landmark coordinates used: the dataset of straight-shot videos yielded more accurate results with x and y coordinates alone, whereas datasets with shot variation typically required x, y, and z coordinates for optimal accuracy.

Keywords: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), MediaPipe, Sign Language
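The abstract reports that normalizing the MediaPipe landmark data significantly improved accuracy. The paper's exact normalization scheme is not given in this abstract, but a minimal sketch of one common approach (an assumption for illustration) is to make each frame's hand landmarks wrist-relative and scale-invariant before feeding them to the LSTM/CNN:

```python
import numpy as np

def normalize_landmarks(landmarks: np.ndarray) -> np.ndarray:
    """Normalize one frame of MediaPipe hand landmarks, shape (21, 2) for x, y.

    Translates the frame so the wrist (MediaPipe landmark index 0) sits at
    the origin, then divides by the largest absolute coordinate so all
    values lie in [-1, 1]. This removes variation in hand position and
    apparent size between frames, signers, and camera distances.
    """
    translated = landmarks - landmarks[0]   # wrist-relative coordinates
    scale = np.abs(translated).max()
    if scale == 0:                          # degenerate frame: all points coincide
        return translated
    return translated / scale

# Example with a synthetic frame; real input would come from MediaPipe Hands.
rng = np.random.default_rng(0)
frame = rng.random((21, 2)) * 0.2 + 0.4     # landmarks clustered near image center
frame[0] = [0.5, 0.5]                       # wrist position
norm = normalize_landmarks(frame)
print(norm[0])              # wrist maps to the origin
print(np.abs(norm).max())   # coordinates are bounded by 1
```

A sequence of such normalized frames, stacked into a (timesteps, features) array, would then form the input to the recurrent model; the same idea extends to (21, 3) arrays when the z coordinate is included, as the abstract suggests for shot-varied videos.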


Metrics

Cited By: 1
FWCI (Field Weighted Citation Impact): 6.16
Refs: 24
Citation Normalized Percentile: 0.85


Topics

Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Educational Technology Systems
Physical Sciences →  Computer Science →  Artificial Intelligence
English Language Learning and Teaching
Physical Sciences →  Computer Science →  Information Systems