Expertise-aware Multi-LLM Recruitment and Collaboration for Medical Decision-Making

📅 2025-08-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Single large language models (LLMs) integrate clinical information poorly in medical decision-making (MDM) because their parametric knowledge is fixed and their training corpora are static. To address this, the authors propose EMRC, a multi-LLM collaborative framework. The method dynamically recruits the best-suited LLM agents for each query using a fine-grained expertise table, organized by medical department and question difficulty, and makes decisions more robust through three mechanisms: self-assessed confidence generation, multi-source confidence fusion, and adversarial validation. Experiments on three public MDM benchmarks show improvements over both single-model baselines and existing multi-model approaches. Notably, the framework achieves 74.45% accuracy on MMLU-Pro-Health, outperforming GPT-4-0613 by 2.69 percentage points. The core contributions are (1) a structured, domain-aware LLM expertise table that enables adaptive agent selection; (2) a confidence-aware, adversarially validated consensus mechanism; and (3) empirical gains over single- and multi-LLM baselines on three public MDM benchmarks.

📝 Abstract
Medical Decision-Making (MDM) is a complex process that requires substantial domain-specific expertise to synthesize heterogeneous and complicated clinical information. While recent advances in Large Language Models (LLMs) show promise in supporting MDM, single-LLM approaches are limited by their parametric knowledge constraints and static training corpora, and so fail to integrate clinical information robustly. To address this challenge, we propose the Expertise-aware Multi-LLM Recruitment and Collaboration (EMRC) framework to enhance the accuracy and reliability of MDM systems. It operates in two stages: (i) expertise-aware agent recruitment and (ii) confidence- and adversarial-driven multi-agent collaboration. In the first stage, we use a publicly available corpus to construct an LLM expertise table that captures the expertise-specific strengths of multiple LLMs across medical department categories and query difficulty levels. At inference time, this table enables dynamic selection of the optimal LLMs to act as medical expert agents for each medical query. In the second stage, the selected agents generate responses with self-assessed confidence scores, which are then integrated through confidence fusion and adversarial validation to improve diagnostic reliability. We evaluate EMRC on three public MDM datasets, where it outperforms state-of-the-art single- and multi-LLM methods. For instance, on the MMLU-Pro-Health dataset, EMRC achieves 74.45% accuracy, a 2.69 percentage-point improvement over the best-performing closed-source model, GPT-4-0613. This result demonstrates the effectiveness of the expertise-aware agent recruitment strategy and of agent complementarity in leveraging each LLM's specialized capabilities.
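
The first stage described in the abstract (an expertise table indexed by department and difficulty, then per-query agent selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the function names `build_expertise_table` and `recruit`, the use of plain accuracy as the expertise score, and the top-k selection rule are all assumptions made for illustration.

```python
from collections import defaultdict


def build_expertise_table(records):
    """Build {(department, difficulty): {model: accuracy}}.

    `records` is a hypothetical iterable of
    (model, department, difficulty, was_correct) tuples, e.g. obtained by
    running each candidate LLM on a labeled public corpus.
    """
    # counts[(dept, diff)][model] == [num_correct, num_total]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for model, dept, diff, was_correct in records:
        cell = counts[(dept, diff)][model]
        cell[0] += int(was_correct)
        cell[1] += 1
    return {
        key: {model: c / t for model, (c, t) in models.items()}
        for key, models in counts.items()
    }


def recruit(table, department, difficulty, k=2):
    """Return up to k models with the highest accuracy for this cell.

    Assumption: ties are broken alphabetically by model name; an unseen
    (department, difficulty) pair yields an empty list.
    """
    scores = table.get((department, difficulty), {})
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [model for model, _ in ranked[:k]]


# Toy data with placeholder model names (not from the paper).
records = [
    ("model_a", "cardiology", "hard", True),
    ("model_a", "cardiology", "hard", False),
    ("model_b", "cardiology", "hard", True),
    ("model_c", "cardiology", "hard", False),
]
table = build_expertise_table(records)
agents = recruit(table, "cardiology", "hard", k=2)
```

In this toy run, `agents` contains the two models with the highest per-cell accuracy. How the paper assigns a query to a department and difficulty level is not stated in the abstract, so the sketch takes both as given inputs.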
Problem

Research questions and friction points this paper is trying to address.

Addressing limitations of single-LLM approaches in medical decision-making
Integrating heterogeneous clinical information through multi-agent collaboration
Enhancing diagnostic accuracy and reliability in healthcare systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

Expertise-aware recruitment of multiple LLMs
Confidence-driven fusion for diagnostic reliability
Adversarial validation to improve decision accuracy
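
One simple way to picture confidence-driven fusion is a confidence-weighted vote over the agents' answers. The sketch below is an assumption for illustration only: the summary does not give the paper's fusion formula, and the function name `fuse_by_confidence`, the clamping of scores to [0, 1], and the tie-breaking rule are hypothetical. The adversarial validation step is not modeled here.

```python
from collections import defaultdict


def fuse_by_confidence(responses):
    """Pick the answer with the largest summed self-assessed confidence.

    `responses` is a list of (answer, confidence) pairs, one per agent.
    Returns (winning_answer, share), where share is the winner's fraction
    of the total confidence mass.
    """
    if not responses:
        raise ValueError("need at least one agent response")
    totals = defaultdict(float)
    for answer, confidence in responses:
        # Clamp to [0, 1] so a malformed score cannot dominate the vote.
        totals[answer] += min(max(confidence, 0.0), 1.0)
    mass = sum(totals.values())
    # Iterating over sorted keys makes ties resolve to the smallest label.
    winner = max(sorted(totals), key=lambda a: totals[a])
    share = totals[winner] / mass if mass > 0 else 0.0
    return winner, share


# Three hypothetical agents answering a multiple-choice question.
winner, share = fuse_by_confidence([("B", 0.9), ("A", 0.6), ("A", 0.5)])
```

Here two moderately confident agents agreeing on "A" outweigh one highly confident agent choosing "B". A low `share` could serve as a signal that the answer deserves further scrutiny, though whether the paper triggers adversarial validation this way is not stated in the summary.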