Leveraging NTPs for Efficient Hallucination Detection in VLMs

📅 2025-09-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Vision-language models (VLMs) frequently suffer from hallucination, i.e., generating text inconsistent with image content, which undermines their reliability. To address this, we propose a lightweight, real-time hallucination detection method grounded in next-token probabilities (NTPs). First, we systematically validate NTPs as an effective internal uncertainty proxy within VLMs. Second, we construct a multi-source feature discriminator by fusing linguistic NTP scores, image-guided VLM prediction confidence, and model-generated hallucination likelihood scores. Third, we train a traditional machine-learning classifier (e.g., XGBoost) on only 1,400 human-annotated samples, achieving detection accuracy comparable to strong VLM-based baselines; ensemble integration further improves performance. Our approach drastically reduces computational overhead and inference latency while maintaining high fidelity, providing an efficient, practical solution for trustworthy VLM deployment in real-world applications.

📝 Abstract
Hallucinations of vision-language models (VLMs), which are misalignments between visual content and generated text, undermine the reliability of VLMs. One common approach for detecting them employs the same VLM, or a different one, to assess generated outputs. This process is computationally intensive and increases model latency. In this paper, we explore an efficient on-the-fly method for hallucination detection by training traditional ML models over signals based on the VLM's next-token probabilities (NTPs). NTPs provide a direct quantification of model uncertainty. We hypothesize that high uncertainty (i.e., a low NTP value) is strongly associated with hallucinations. To test this, we introduce a dataset of 1,400 human-annotated statements derived from VLM-generated content, each labeled as hallucinated or not, and use it to test our NTP-based lightweight method. Our results demonstrate that NTP-based features are valuable predictors of hallucinations, enabling fast and simple ML models to achieve performance comparable to that of strong VLMs. Furthermore, augmenting these NTPs with linguistic NTPs, computed by feeding only the generated text back into the VLM, enhances hallucination detection performance. Finally, integrating hallucination prediction scores from VLMs into the NTP-based models led to better performance than using either VLMs or NTPs alone. We hope this study paves the way for simple, lightweight solutions that enhance the reliability of VLMs.
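The abstract's core signal, per-token NTPs summarized into fixed-length features for a fast classifier, can be sketched as follows. This is a minimal illustration: the specific features (mean log-probability, minimum probability, fraction of low-confidence tokens) and the threshold are assumptions, not the paper's exact feature set.

```python
import math

def ntp_features(token_probs, low_thresh=0.1):
    """Summarize a statement's next-token probabilities (NTPs) into
    fixed-length features for a lightweight hallucination detector.
    Feature choices here are illustrative, not the paper's exact design."""
    logs = [math.log(p) for p in token_probs]
    return {
        "mean_logprob": sum(logs) / len(logs),      # overall confidence
        "min_prob": min(token_probs),               # weakest token
        "frac_low": sum(p < low_thresh for p in token_probs)
                    / len(token_probs),             # share of uncertain tokens
    }

# A confidently generated statement vs. an uncertain (hallucination-prone) one:
grounded = ntp_features([0.9, 0.8, 0.95, 0.85])
uncertain = ntp_features([0.3, 0.05, 0.4, 0.08])
```

Under the paper's hypothesis, vectors like `uncertain` (low minimum probability, many low-confidence tokens) would be the ones a classifier learns to flag as hallucinated.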
Problem

Research questions and friction points this paper is trying to address.

Detecting hallucinations in vision-language models efficiently
Reducing computational cost of hallucination detection methods
Whether next-token probabilities can reliably quantify model uncertainty
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses next-token probabilities to quantify model uncertainty
Trains lightweight ML models on NTP features for detection
Augments visual NTPs with linguistic NTPs for better performance
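A minimal sketch of the fusion described above: visual NTPs, linguistic NTPs (obtained by feeding only the generated text back into the VLM), and a VLM-produced hallucination score concatenated into one feature vector for the downstream classifier. The aggregation functions and argument names are assumptions for illustration.

```python
def fused_features(visual_ntps, linguistic_ntps, vlm_score):
    """Fuse three signal families into one feature vector:
    - visual_ntps: per-token probabilities from image-conditioned generation
    - linguistic_ntps: per-token probabilities from text-only re-scoring
    - vlm_score: a VLM's own hallucination likelihood estimate
    Aggregation (min + mean per stream) is an illustrative choice."""
    def agg(probs):
        return [min(probs), sum(probs) / len(probs)]
    return agg(visual_ntps) + agg(linguistic_ntps) + [vlm_score]

# Example: weak visual confidence, stronger text-only fluency, low VLM score.
vec = fused_features([0.9, 0.4], [0.7, 0.8], vlm_score=0.2)
```

A fixed-length vector like `vec` is what a traditional classifier (e.g., XGBoost, per the summary) would consume, which is what keeps inference cheap relative to running a second VLM pass per statement.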
Ofir Azachi
Department of Data and Decision Science, Technion - Israel Institute of Technology
Kfir Eliyahu
Department of Data and Decision Science, Technion - Israel Institute of Technology
Eyal El Ani
Department of Data and Decision Science, Technion - Israel Institute of Technology
Rom Himelstein
Department of Data and Decision Science, Technion - Israel Institute of Technology
Roi Reichart
Professor of Artificial Intelligence, Technion - Israel Institute of Technology
natural language processing, machine learning, artificial intelligence, health AI, AI for Science
Yuval Pinter
Ben-Gurion University of the Negev
Natural Language Processing, Machine Learning, Information Retrieval, Linguistics
Nitay Calderon
Technion - Israel Institute of Technology
Counterfactual Generation, LLM-as-a-Judge, Concept-Based Explainability, Domain Adaptation