JOURNAL ARTICLE

Global-Aware Attention Network for Multi-modal Sarcasm Detection

Abstract

Sarcasm detection is crucial for natural language processing in various applications, such as affective computing and opinion mining. Multi-modal sarcasm detection, which combines information from different modalities, has attracted increasing attention in recent years. However, many current methods concatenate image and text features directly, without simultaneously considering the contextual information from cross-modal alignment and from single-modal features. Motivated by this observation, we propose a novel Global-Aware Attention Network (GAAN) for multi-modal sarcasm detection. Specifically, we design a cross-modal multi-granularity alignment module that captures alignment context features through coarse-grained and fine-grained attention. More importantly, considering the complementary role of single-modal contextual information in sarcasm detection, we fuse the textual, visual, and alignment context features to obtain global context features. We conducted extensive experiments on public datasets, and the results show that our model outperforms the baselines and achieves state-of-the-art performance in multi-modal sarcasm detection.
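The abstract does not disclose implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: aligning text and image features with dot-product attention, then fusing the alignment context with pooled single-modal contexts into a global context vector. All function names, feature shapes, and the choice of mean pooling are illustrative assumptions, not the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query, key_value):
    """Fine-grained alignment: attend each query token over key/value tokens."""
    scores = query @ key_value.T / np.sqrt(query.shape[-1])  # (Tq, Tk)
    return softmax(scores, axis=-1) @ key_value              # (Tq, d)

rng = np.random.default_rng(0)
d = 64
text = rng.standard_normal((12, d))   # 12 word features (hypothetical shapes)
image = rng.standard_normal((49, d))  # 7x7 image region features (hypothetical)

# Alignment context: word-to-region attention, pooled over words.
align_ctx = cross_modal_attention(text, image).mean(axis=0)

# Single-modal contexts (mean pooling stands in for learned encoders).
text_ctx = text.mean(axis=0)
image_ctx = image.mean(axis=0)

# Global context: fuse alignment, textual, and visual contexts.
global_ctx = np.concatenate([align_ctx, text_ctx, image_ctx])  # shape (3*d,)
```

In the paper, a classifier head over such a fused representation would predict the sarcasm label; here concatenation is used only as the simplest fusion stand-in.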

Keywords:
Sarcasm, Computer science, Modal, Context, Granularity, Artificial intelligence, Modalities, Natural language processing, Linguistics

Metrics

Cited By: 6
FWCI (Field Weighted Citation Impact): 1.53
References: 25
Citation Normalized Percentile: 0.83


Topics

Sentiment Analysis and Opinion Mining (Physical Sciences → Computer Science → Artificial Intelligence)
Multimodal Machine Learning Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)