JOURNAL ARTICLE

Evaluating Object Hallucination in Large Vision-Language Models

Abstract

Inspired by the superior language abilities of large language models (LLMs), large vision-language models (LVLMs) have recently been proposed, integrating powerful LLMs to improve performance on complex multimodal tasks. Despite the promising progress of LVLMs, we find that they suffer from object hallucination, i.e., they tend to generate descriptions containing objects that are inconsistent with the target images. To investigate this issue, this work presents the first systematic study of object hallucination in LVLMs. We conduct evaluation experiments on several representative LVLMs and show that most of them suffer from severe object hallucination. We further discuss how visual instructions may influence hallucination, and find that objects that frequently appear in the visual instructions, or that frequently co-occur with the objects in the image, are clearly more prone to be hallucinated by LVLMs. In addition, we design a polling-based query method called POPE for better evaluation of object hallucination. Experimental results show that POPE can evaluate object hallucination in a more stable and flexible way.
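To make the polling idea concrete, below is a minimal sketch of a POPE-style evaluation loop: the model is asked balanced yes/no questions about objects that are present in the image (positives) and objects that are absent (negatives), and a "yes" on an absent object counts as a hallucination. The `model.answer(image, question)` interface is a hypothetical stand-in for whatever LVLM is being evaluated, and the sketch uses only random negative sampling; the paper also considers popular and adversarial sampling strategies for choosing negatives.

```python
# Minimal sketch of a POPE-style polling evaluation.
# Assumptions beyond the abstract: `model.answer(image, question)` returns a
# free-form string (hypothetical LVLM interface), and negatives are drawn
# uniformly at random from objects absent in the image.
import random

def build_pope_queries(gt_objects, candidate_objects, n_negatives=3, seed=0):
    """Build yes/no polling questions: positives from the ground-truth
    objects, negatives sampled from objects absent in the image."""
    rng = random.Random(seed)
    absent = [o for o in candidate_objects if o not in set(gt_objects)]
    negatives = rng.sample(absent, min(n_negatives, len(absent)))
    queries = [(f"Is there a {o} in the image?", 1) for o in gt_objects]
    queries += [(f"Is there a {o} in the image?", 0) for o in negatives]
    return queries

def evaluate_pope(model, image, queries):
    """Score the model's yes/no answers with accuracy, precision, recall,
    and F1. A 'yes' on an absent object is a false positive, i.e., a
    hallucinated object."""
    tp = fp = tn = fn = 0
    for question, label in queries:
        answer = model.answer(image, question)  # hypothetical interface
        pred = 1 if "yes" in answer.lower() else 0
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1
        elif not pred and not label:
            tn += 1
        else:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(queries),
            "precision": precision, "recall": recall, "f1": f1}
```

Because each query has a single-token answer with a balanced yes/no prior, this polling format sidesteps the parsing instability of caption-based checks, which is what makes the evaluation more stable and flexible.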

Keywords:
Object Hallucination, Computer Science, Visual Hallucination, Artificial Intelligence, Polling, Computer Vision, Natural Language Processing, Cognitive Psychology, Psychology

Metrics

Cited by: 264
FWCI (Field-Weighted Citation Impact): 48.04
References: 37
Citation Normalized Percentile: 1.00 (top 1%)


Topics

Multimodal Machine Learning Applications (Physical Sciences → Computer Science → Computer Vision and Pattern Recognition)
Topic Modeling (Physical Sciences → Computer Science → Artificial Intelligence)