🤖 AI Summary
To address the weak causal reasoning, frequent hallucinations, and poor interpretability of large language models (LLMs) in high-stakes, knowledge-intensive domains such as medicine and law, this paper proposes a causal-graph-enhanced retrieval-augmented generation (RAG) framework. Methodologically, it introduces a multi-stage path optimization mechanism that aligns causal graph structure with chain-of-thought (CoT) reasoning, comprising three stages: causal graph filtering, CoT-guided graph retrieval, and multi-hop path refinement. It instantiates this Graph RAG architecture with domain-specific medical knowledge graphs. The approach targets two key limitations of conventional graph-based retrieval: the neglect of causal semantics and the misalignment between reasoning steps and graph traversal depth. Empirical evaluation on medical question answering demonstrates up to a 10% absolute improvement in accuracy, alongside substantial gains in answer interpretability, logical consistency, and cross-model adaptability.
📝 Abstract
In knowledge-intensive tasks, especially in high-stakes domains like medicine and law, it is critical not only to retrieve relevant information but also to provide causal reasoning and explainability. Large language models (LLMs) have achieved remarkable performance in natural language understanding and generation tasks. However, they often suffer from limitations such as difficulty in incorporating new knowledge, generating hallucinations, and explaining their reasoning process. To address these challenges, integrating knowledge graphs with Graph Retrieval-Augmented Generation (Graph RAG) has emerged as an effective solution. Traditional Graph RAG methods often rely on simple graph traversal or semantic similarity, which neither captures causal relationships nor aligns with the model's internal reasoning steps. This paper proposes a novel pipeline that filters large knowledge graphs to emphasize cause-effect edges, aligns the retrieval process with the model's chain-of-thought (CoT), and enhances reasoning through multi-stage path refinement. Experiments on medical question-answering tasks show consistent gains, with up to a 10% absolute improvement across multiple LLMs. This approach demonstrates the value of combining causal reasoning with stepwise retrieval, leading to more interpretable and logically grounded solutions for complex queries.
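The three-stage pipeline the abstract describes (filtering for cause-effect edges, CoT-aligned retrieval, multi-hop path refinement) could be sketched roughly as below. This is a minimal toy illustration, not the paper's implementation: the triple format, the set of "causal" relation types, the substring-based CoT matching, and all function names are assumptions made for the sketch.

```python
# Toy sketch of a causal Graph RAG pipeline in three stages.
# Graph format, relation vocabulary, and matching logic are all
# illustrative assumptions, not the paper's actual method.

# A knowledge graph as (head, relation, tail) triples.
TRIPLES = [
    ("smoking", "causes", "lung_damage"),
    ("lung_damage", "causes", "shortness_of_breath"),
    ("smoking", "associated_with", "coffee_drinking"),
    ("aspirin", "treats", "headache"),
]

# Relation types treated as causal for this sketch.
CAUSAL_RELATIONS = {"causes", "treats"}

def filter_causal_edges(triples):
    """Stage 1: keep only cause-effect edges, dropping
    merely associative ones."""
    return [t for t in triples if t[1] in CAUSAL_RELATIONS]

def cot_guided_retrieval(triples, cot_steps):
    """Stage 2: for each chain-of-thought step, retrieve edges
    whose head entity is mentioned in that step, so retrieval
    depth tracks the number of reasoning steps."""
    retrieved = []
    for step in cot_steps:
        retrieved.extend(t for t in triples if t[0] in step)
    return retrieved

def refine_multi_hop_path(edges, start, max_hops=3):
    """Stage 3: chain the retrieved edges into a single
    multi-hop causal path starting from `start`."""
    path, node = [], start
    for _ in range(max_hops):
        nxt = next((e for e in edges if e[0] == node), None)
        if nxt is None:
            break
        path.append(nxt)
        node = nxt[2]
    return path

# Example: a two-step chain of thought over the toy graph.
cot = ["the patient reports smoking", "lung_damage may follow"]
causal = filter_causal_edges(TRIPLES)
edges = cot_guided_retrieval(causal, cot)
path = refine_multi_hop_path(edges, "smoking")
print(path)
```

Running the example keeps only the two `causes` edges relevant to the reasoning steps and chains them into the path smoking → lung_damage → shortness_of_breath; the associative `coffee_drinking` edge is filtered out at stage 1, which is the point of emphasizing causal semantics over plain semantic similarity.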