Hydra: Structured Cross-Source Enhanced Large Language Model Reasoning

📅 2025-05-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address key challenges in hybrid retrieval-augmented generation (RAG), including difficulty with multi-hop reasoning, weak multi-entity association, low credibility of multi-source evidence, and underutilization of knowledge graphs, this paper proposes Hydra, a training-free framework. Methodologically, Hydra introduces: (1) a tri-factor cross-source verification mechanism comprising source trustworthiness assessment, cross-source corroboration, and entity-path alignment; (2) graph-structured guidance for agent-based hybrid retrieval with early noise pruning; and (3) unified modeling of knowledge-graph topology, heterogeneous semantic alignment, and cross-modal consistency verification. Evaluated across seven benchmarks with GPT-3.5, Hydra consistently achieves state-of-the-art results, outperforming the strong hybrid baseline ToG-2 by an average of 20.3% (up to 30.1%), and it enables Llama-3.1-8B to match GPT-4-Turbo-level reasoning performance.

📝 Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating external knowledge. Current hybrid RAG systems retrieve evidence from both knowledge graphs (KGs) and text documents to support LLM reasoning. However, they face challenges in handling multi-hop reasoning, multi-entity questions, multi-source verification, and effective graph utilization. To address these limitations, we present Hydra, a training-free framework that unifies graph topology, document semantics, and source reliability to support deep, faithful reasoning in LLMs. Hydra handles multi-hop and multi-entity problems through agent-driven exploration that combines structured and unstructured retrieval, increasing both the diversity and precision of evidence. To tackle multi-source verification, Hydra uses tri-factor cross-source verification (source trustworthiness assessment, cross-source corroboration, and entity-path alignment) to balance topic relevance with cross-modal agreement. By leveraging graph structure, Hydra fuses heterogeneous sources, guides efficient exploration, and prunes noise early. Comprehensive experiments on seven benchmark datasets show that Hydra achieves overall state-of-the-art results on all benchmarks with GPT-3.5, outperforming the strong hybrid baseline ToG-2 by an average of 20.3% and by up to 30.1%. Furthermore, Hydra enables smaller models (e.g., Llama-3.1-8B) to achieve reasoning performance comparable to that of GPT-4-Turbo.
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM reasoning with multi-source knowledge retrieval
Addressing multi-hop and multi-entity reasoning challenges
Improving cross-source verification for reliable evidence integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies graph topology, document semantics, and source reliability
Agent-driven exploration combines structured and unstructured retrieval
Tri-factor cross-source verification ensures reliable reasoning
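The tri-factor verification described above can be pictured as a weighted scoring filter over retrieved evidence. The sketch below is an illustrative assumption, not the paper's actual implementation: the weights, score ranges, and field names (`source_trust`, `corroboration`, `path_alignment`) are hypothetical, chosen only to show how the three factors could combine to prune noisy evidence early.

```python
# Hypothetical sketch of tri-factor cross-source verification.
# All weights and field names are illustrative assumptions; the paper
# does not publish this exact formulation.
from dataclasses import dataclass

@dataclass
class Evidence:
    text: str
    source_trust: float    # source trustworthiness assessment, in [0, 1]
    corroboration: float   # cross-source corroboration, in [0, 1]
    path_alignment: float  # entity-path alignment with the KG, in [0, 1]

def tri_factor_score(ev: Evidence,
                     w_trust: float = 0.4,
                     w_corr: float = 0.3,
                     w_path: float = 0.3) -> float:
    """Weighted combination of the three verification factors."""
    return (w_trust * ev.source_trust
            + w_corr * ev.corroboration
            + w_path * ev.path_alignment)

def prune(evidence: list[Evidence], threshold: float = 0.5) -> list[Evidence]:
    """Early noise pruning: keep evidence scoring above the threshold,
    ranked best-first before it reaches the reasoning step."""
    kept = [e for e in evidence if tri_factor_score(e) >= threshold]
    return sorted(kept, key=tri_factor_score, reverse=True)
```

In this toy setup, a low-trust, poorly corroborated snippet is dropped before the LLM ever sees it, which mirrors the paper's stated goal of balancing topic relevance with cross-modal agreement.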
Xingyu Tan
University of New South Wales
Graph Processing · Database · LLMs · Knowledge Graph
Xiaoyang Wang
University of New South Wales
Qing Liu
Data61, CSIRO
Xiwei Xu
Data61, CSIRO
Xin Yuan
Data61, CSIRO
Liming Zhu
Research Director at CSIRO's Data61 & Prof at University of New South Wales
Software Architecture · SE4AI · Responsible AI · AI Safety · Blockchain
Wenjie Zhang
University of New South Wales