🤖 AI Summary
Hallucination, i.e., generated content that is inconsistent with the user input, the preceding context, or factual knowledge, severely undermines the reliability of large language models (LLMs) in real-world applications. To address this, the work systematically analyzes the causes of hallucination and proposes a multidimensional taxonomy tailored to LLMs, covering input consistency, contextual coherence, and factual alignment, together with a unified cross-task evaluation benchmark. Synthesizing insights from over 120 studies, it integrates and comparatively assesses the major mitigation strategies, including confidence calibration, retrieval-augmented generation (RAG), self-verification, contrastive decoding, and knowledge graph alignment, and reveals their empirical limitations in open-domain question answering and long-horizon reasoning. The result is a paradigm of synergistic governance via interpretable intervention and knowledge enhancement, offering both a theoretical framework and practical guidelines for developing trustworthy LLMs.
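Among the mitigation strategies listed, contrastive decoding has a particularly simple core idea: score each candidate token by the gap between a strong "expert" model's log-probability and a weaker "amateur" model's, so that tokens the amateur inflates (often generic or hallucination-prone continuations) are penalized. The sketch below is an illustrative toy, not code from the surveyed paper; the logit values, the vocabulary, and the `alpha` plausibility cutoff are all made-up assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def contrastive_scores(expert_logits, amateur_logits, alpha=0.1):
    """Score tokens by log p_expert - log p_amateur, restricted to a
    plausibility set: tokens whose expert probability is at least
    alpha * (max expert probability). Tokens outside the set are
    pruned with a score of -inf."""
    p_exp = softmax(expert_logits)
    p_ama = softmax(amateur_logits)
    cutoff = alpha * max(p_exp)
    scores = []
    for pe, pa in zip(p_exp, p_ama):
        if pe >= cutoff:
            scores.append(math.log(pe) - math.log(pa))
        else:
            scores.append(float("-inf"))  # implausible under the expert
    return scores

# Toy 4-token vocabulary (hypothetical numbers): the expert prefers
# token 0, while the amateur assigns inflated probability to token 1.
expert = [4.0, 3.5, 1.0, 0.5]
amateur = [2.0, 3.8, 1.0, 0.5]
scores = contrastive_scores(expert, amateur)
best = max(range(len(scores)), key=lambda i: scores[i])
```

In this toy example the contrastive score selects token 0, because token 1's probability is largely explained by the amateur model rather than by expert-specific knowledge; the plausibility cutoff prevents low-probability tail tokens from winning on the ratio alone.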
📝 Abstract
While large language models (LLMs) have demonstrated remarkable capabilities across a range of downstream tasks, a significant concern revolves around their propensity to exhibit hallucinations: LLMs occasionally generate content that diverges from the user input, contradicts previously generated context, or misaligns with established world knowledge. This phenomenon poses a substantial challenge to the reliability of LLMs in real-world scenarios. In this paper, we survey recent efforts on the detection, explanation, and mitigation of hallucination, with an emphasis on the unique challenges posed by LLMs. We present taxonomies of the LLM hallucination phenomena and evaluation benchmarks, analyze existing approaches aiming at mitigating LLM hallucination, and discuss potential directions for future research.