JOURNAL ARTICLE

Generating visual explanations with natural language

Abstract

We generate natural language explanations for a fine-grained visual recognition task. Our explanations fulfill two criteria. First, explanations are class discriminative, meaning they mention attributes in an image which are important to identify a class. Second, explanations are image relevant, meaning they reflect the actual content of an image. Our system, composed of an explanation sampler and phrase-critic model, generates class discriminative and image relevant explanations. In addition, we demonstrate that our explanations can help humans decide whether to accept or reject an AI decision.

Keywords:
Discriminative model; Phrase; Meaning (existential); Class (philosophy); Natural (archaeology); Task (project management); Natural language; Computer science; Artificial intelligence; Image (mathematics); Natural language processing; Linguistics; Psychology; Cognitive psychology; Philosophy; History

Metrics

Cited By: 13
FWCI (Field Weighted Citation Impact): 1.02
Refs: 53
Citation Normalized Percentile: 0.78

Topics

Multimodal Machine Learning Applications
Physical Sciences →  Computer Science →  Computer Vision and Pattern Recognition
Explainable Artificial Intelligence (XAI)
Physical Sciences →  Computer Science →  Artificial Intelligence
Domain Adaptation and Few-Shot Learning
Physical Sciences →  Computer Science →  Artificial Intelligence