DeepRAG: Integrating Hierarchical Reasoning and Process Supervision for Biomedical Multi-Hop QA

2025-05-31
AI Summary
To address the challenges of complex query parsing and low domain-specific accuracy in biomedical multi-hop question answering (MedHopQA), this paper proposes a retrieval-augmented generation framework that synergistically combines hierarchical decomposition and process-level supervision. Methodologically: (1) a hierarchical question decomposer, built upon DeepSeek, explicitly models multi-hop reasoning paths; (2) RAG Gym is integrated to enable joint optimization of retrieval and generation; (3) for the first time, UMLS ontology-driven concept-level reward signals are introduced, enabling fine-grained semantic alignment and process supervision via reinforcement learning. Experiments on MedHopQA demonstrate substantial improvements over DeepSeek-V2 and RAG Gym baselines: +12.7% in Exact Match and +15.3% in concept-level accuracy. These results validate the framework's effectiveness in precise biomedical semantic modeling and interpretable, stepwise reasoning.

πŸ“ Abstract
We propose DeepRAG, a novel framework that integrates DeepSeek's hierarchical question decomposition capabilities with RAG Gym's unified retrieval-augmented generation optimization using process-level supervision. Targeting the challenging MedHopQA biomedical question answering task, DeepRAG systematically decomposes complex queries into precise sub-queries and employs concept-level reward signals informed by the UMLS ontology to enhance biomedical accuracy. Preliminary evaluations on the MedHopQA dataset indicate that DeepRAG significantly outperforms baseline models, including standalone DeepSeek and RAG Gym, achieving notable improvements in both Exact Match and concept-level accuracy.
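The abstract reports results in Exact Match without spelling out how answers are compared. As a rough illustration only, the sketch below computes Exact Match with the SQuAD-style normalization commonly used in QA evaluation (lowercasing, removing punctuation and English articles). That normalization and the function names are assumptions here, not details confirmed by the paper.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def exact_match_score(predictions: list[str], references: list[list[str]]) -> float:
    """Fraction of predictions that exactly match one of their references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return hits / len(predictions)
```

Because the comparison is all-or-nothing, a prediction that names the right entity with extra words scores zero, which is one reason a concept-level measure is a useful complement.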
Problem

Research questions and friction points this paper is trying to address.

Enhances biomedical multi-hop QA accuracy
Decomposes complex queries into precise sub-queries
Improves retrieval-augmented generation with process supervision
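To make the decomposition idea concrete, here is a minimal sketch of a multi-hop loop in which each sub-query can reuse the answers to earlier ones. Everything in it is hypothetical: `answer_multi_hop`, the `{0}` placeholder convention, and the toy decomposer and fact table stand in for the LLM decomposer and the retrieval-plus-generation step, whose actual interfaces the summary does not describe.

```python
from typing import Callable


def answer_multi_hop(
    question: str,
    decompose: Callable[[str], list[str]],
    answer_hop: Callable[[str], str],
) -> tuple[str, list[tuple[str, str]]]:
    """Answer sub-queries in order, feeding earlier answers into later ones.

    A sub-query may contain placeholders such as "{0}", which are filled with
    the answer to the sub-query at that index.
    """
    answers: list[str] = []
    trace: list[tuple[str, str]] = []
    for template in decompose(question):
        sub_query = template.format(*answers)
        answer = answer_hop(sub_query)
        answers.append(answer)
        trace.append((sub_query, answer))
    if not answers:
        raise ValueError("decomposer returned no sub-queries")
    return answers[-1], trace


# Toy stand-ins (hypothetical): a real system would call an LLM decomposer
# and a retriever plus generator instead of these fixed tables.
def toy_decompose(question: str) -> list[str]:
    return [
        "Which gene is associated with disease X?",
        "Which chromosome is {0} located on?",
    ]


TOY_FACTS = {
    "Which gene is associated with disease X?": "gene Y",
    "Which chromosome is gene Y located on?": "chromosome Z",
}


def toy_answer_hop(sub_query: str) -> str:
    return TOY_FACTS.get(sub_query, "unknown")


final_answer, trace = answer_multi_hop(
    "Which chromosome carries the gene associated with disease X?",
    toy_decompose,
    toy_answer_hop,
)
print(final_answer)
for sub_query, answer in trace:
    print(f"{sub_query} -> {answer}")
```

Keeping the per-hop trace is what makes process-level supervision possible: each intermediate sub-query and answer can be scored on its own rather than only the final output.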
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical question decomposition for precise sub-queries
Process supervision with UMLS ontology rewards
Unified RAG optimization via RAG Gym
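One plausible reading of a concept-level reward is an overlap score between the biomedical concepts mentioned in a generated reasoning step and those in a reference. The sketch below assumes that reading and uses set-based F1. The lexicon, the `CONCEPT_00x` identifiers (placeholders, not real UMLS CUIs), and the substring matcher are all illustrative assumptions; the paper's actual reward definition and concept extractor are not given in this summary.

```python
def extract_concepts(text: str, lexicon: dict[str, str]) -> set[str]:
    """Map surface terms found in `text` to concept IDs by substring lookup.

    Lexicon keys are expected to be lowercase.
    """
    lowered = text.lower()
    return {concept_id for term, concept_id in lexicon.items() if term in lowered}


def concept_reward(predicted: str, reference: str, lexicon: dict[str, str]) -> float:
    """F1 overlap between the concept sets of predicted and reference text."""
    pred = extract_concepts(predicted, lexicon)
    gold = extract_concepts(reference, lexicon)
    overlap = len(pred & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def process_rewards(
    steps: list[str], references: list[str], lexicon: dict[str, str]
) -> list[float]:
    """One reward per reasoning step, rather than one for the final answer."""
    return [concept_reward(s, r, lexicon) for s, r in zip(steps, references)]


# Toy lexicon (hypothetical): synonyms share one placeholder concept ID,
# standing in for an ontology lookup.
TOY_LEXICON = {
    "heart attack": "CONCEPT_001",
    "myocardial infarction": "CONCEPT_001",
    "aspirin": "CONCEPT_002",
    "ibuprofen": "CONCEPT_003",
}
```

Scoring at the concept level means "heart attack" and "myocardial infarction" count as the same thing, which a string-based metric such as Exact Match would miss.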
Yuelyu Ji
University of Pittsburgh
Natural language processing, Health information detection, Large language model evaluation
Hang Zhang
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Shiven Verma
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Hui Ji
Department of Information Science, University of Pittsburgh, Pittsburgh, PA, USA
Chun Li
MD Anderson Cancer Center
Diagnostic imaging, Drug delivery, Nanotechnology
Yushui Han
Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
Yanshan Wang
Department of Health Information Management and ISP, University of Pittsburgh, Pittsburgh, PA, USA