JOURNAL ARTICLE

Locality-Aware Transformer for Video-Based Sign Language Translation

Zihui GuoYonghong HouChunping HouWenjie Yin

Year: 2023 Journal:   IEEE Signal Processing Letters Vol: 30 Pages: 364-368   Publisher: Institute of Electrical and Electronics Engineers

Abstract

Recently, the application of transformer makes significant progress in sign language translation. However, several characteristics of sign videos are neglected in existing transformer-based methods that hinder translation performance. Firstly, in sign videos, multiple consecutive frames represent a single sign gloss thus the local temporal relations are crucial. Secondly, the inconsistency between video and text demands the non-local and global context modeling ability of the model. To address these issues, a locality-aware transformer is proposed for sign language translation. Concretely, the multi-stride position encoding scheme assigns the same position index to adjacent frames with various strides to enhance the local dependency. Afterward, the adaptive temporal interaction module is utilized to capture non-local and flexible local frame correlation simultaneously. Moreover, a gloss counting task is designed to facilitate the holistic understanding of sign videos. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed framework.

Keywords:
Computer science Locality Sign language Transformer Artificial intelligence Computer vision Speech recognition Voltage

Metrics

14
Cited By
3.42
FWCI (Field Weighted Citation Impact)
31
Refs
0.90
Citation Normalized Percentile
Is in top 1%
Is in top 10%

Citation History

Topics

Hand Gesture Recognition Systems
Physical Sciences →  Computer Science →  Human-Computer Interaction
Human Pose and Action Recognition
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
© 2026 ScienceGate Book Chapters — All rights reserved.