DEPTH: Hallucination-Free Relation Extraction via Dependency-Aware Sentence Simplification and Two-tiered Hierarchical Refinement

📅 2025-08-19
🤖 AI Summary
Large language models (LLMs) suffer from hallucination in relation extraction—particularly under complex syntactic and semantic conditions—leading to spurious relation predictions that corrupt knowledge graphs and undermine downstream task reliability. To address this, we propose a dependency-aware sentence simplification framework coupled with a two-tier hierarchical refinement strategy: (1) local relation modeling via shortest dependency path extraction, and (2) global consistency correction guided by a causal-aware reward model that integrates reinforcement learning with human feedback to mitigate reward hacking. Our approach significantly enhances robustness and interpretability in relation judgment. Evaluated on six benchmark datasets, it reduces average hallucination rate to 7.0% and improves F1 score by 17.2% over state-of-the-art methods. This work establishes a trustworthy, controllable paradigm for LLM-based relation extraction.
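The two-stage flow described above can be sketched structurally. This is only an illustrative outline, not the authors' implementation: `judge_pair` and `revise_globally` are hypothetical stand-ins for the LLM calls the paper describes, and the toy keyword rule below merely marks where a real model would be queried.

```python
from typing import Dict, Tuple

def judge_pair(sdp_context: str, pair: Tuple[str, str]) -> str:
    """Stage 1 (Grounding) stub: predict a relation (or 'NO_RELATION')
    from the simplified shortest-dependency-path context.
    A real system would query an LLM here; this toy uses a keyword rule."""
    return "founded_by" if "founded" in sdp_context else "NO_RELATION"

def revise_globally(sentence: str,
                    local: Dict[Tuple[str, str], str]) -> Dict[Tuple[str, str], str]:
    """Stage 2 (Refinement) stub: revisit all local predictions with the
    full sentence in view, correcting omissions and inconsistencies.
    This toy merely drops pairs whose entities never appear in the sentence."""
    return {p: r for p, r in local.items() if p[0] in sentence and p[1] in sentence}

sentence = "The CEO of Acme, who resigned, founded Beta."
# Hand-written SDP contexts for two candidate entity pairs.
contexts = {("Beta", "CEO"): "CEO founded Beta",
            ("Acme", "CEO"): "CEO of Acme"}
local = {pair: judge_pair(ctx, pair) for pair, ctx in contexts.items()}
final = revise_globally(sentence, local)
print(final)  # local predictions after the global consistency pass
```

The key design point the summary highlights is the separation of concerns: local judgments are made on a noise-reduced context, and only the second pass sees the whole sentence.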

📝 Abstract
Relation extraction enables the construction of structured knowledge for many downstream applications. While large language models (LLMs) have shown great promise in this domain, most existing methods concentrate on relation classification, which predicts the semantic relation type between a related entity pair. However, we observe that LLMs often struggle to reliably determine whether a relation exists, especially in cases involving complex sentence structures or intricate semantics, which leads to spurious predictions. Such hallucinations can introduce noisy edges in knowledge graphs, compromising the integrity of structured knowledge and downstream reliability. To address these challenges, we propose DEPTH, a framework that integrates Dependency-aware sEntence simPlification and Two-tiered Hierarchical refinement into the relation extraction pipeline. Given a sentence and its candidate entity pairs, DEPTH operates in two stages: (1) the Grounding module extracts relations for each pair by leveraging their shortest dependency path, distilling the sentence into a minimal yet coherent relational context that reduces syntactic noise while preserving key semantics; (2) the Refinement module aggregates all local predictions and revises them based on a holistic understanding of the sentence, correcting omissions and inconsistencies. We further introduce a causality-driven reward model that mitigates reward hacking by disentangling spurious correlations, enabling robust fine-tuning via reinforcement learning with human feedback. Experiments on six benchmarks demonstrate that DEPTH reduces the average hallucination rate to 7.0% while achieving a 17.2% improvement in average F1 score over state-of-the-art baselines.
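The shortest-dependency-path simplification used by the Grounding module can be illustrated with a small self-contained sketch. This is not the paper's implementation (a real system would obtain the dependency tree from a parser such as spaCy or Stanza); the head indices below are hand-written for one example sentence, and the SDP is found by BFS over the undirected tree.

```python
from collections import deque

def shortest_dependency_path(heads, src, dst):
    """Return the token indices on the shortest path between src and dst
    in a dependency tree, where heads[i] is the head index of token i
    (-1 for the root). BFS over the undirected tree suffices."""
    adj = {i: set() for i in range(len(heads))}
    for i, h in enumerate(heads):
        if h >= 0:
            adj[i].add(h)
            adj[h].add(i)
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            break
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                queue.append(v)
    path, node = [], dst
    while node is not None:   # walk back from dst to src
        path.append(node)
        node = prev[node]
    return path[::-1]

# Hand-written toy parse of "The CEO of Acme, who resigned, founded Beta."
tokens = ["The", "CEO", "of", "Acme", ",", "who", "resigned", ",", "founded", "Beta"]
heads  = [1,     8,     1,    2,      1,   6,     1,          1,   -1,        8]
path = shortest_dependency_path(heads, 1, 9)  # CEO -> Beta
print([tokens[i] for i in path])  # ['CEO', 'founded', 'Beta']
```

Note how the path skips the relative clause "who resigned": this is the sense in which the SDP distills a minimal relational context between an entity pair.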
Problem

Research questions and friction points this paper is trying to address.

Reducing relation extraction hallucinations in complex sentences
Eliminating spurious predictions from LLMs in knowledge graphs
Addressing syntactic noise and semantic intricacies in RE
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dependency-aware sentence simplification for relational context
Two-tiered hierarchical refinement of local predictions
Causality-driven reward model for robust fine-tuning
Yupei Yang
Shanghai Jiao Tong University
Causality, Reinforcement Learning
Fan Feng
University of California San Diego
Lin Yang
Alibaba Group
Wanxi Deng
Alibaba Group
Lin Qu
Alibaba Group
Biwei Huang
University of California San Diego
Causality, Machine Learning, Computational Science
Shikui Tu
Shanghai Jiao Tong University
Lei Xu
Shanghai Jiao Tong University