Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

📅 2026-06-07
🤖 AI Summary
Conventional large medical language models typically support only single-turn question answering and lack capabilities for continuous patient care. To address this, the authors propose a clinical-grade agent system designed for longitudinal medical care. The system integrates three components: Baichuan-Harness, a unified runtime that supports multi-agent collaboration under action constraints; a core reasoning model; and a clinical tool layer. The reasoning model is trained with a reinforcement learning framework for continuous care that combines span-level reward modeling (SPAR++), reasoning-path compression, and curriculum learning. The clinical tool layer provides long-context patient memory management, evidence-based retrieval, and multimodal medical perception (X-rays, dermatological images, and document OCR). The authors report leading results across static knowledge, dynamic interviewing, clinical memory, evidence retrieval, and multimodal understanding, with the hallucination rate reduced to 3.3%.
📝 Abstract
Baichuan-M4 is Baichuan Intelligence's clinical-grade medical large model, designed for continuous care rather than single-turn medical question answering. It is built as a coordinated medical agent system around three pillars: Baichuan-Harness, a unified runtime that keeps reinforcement-learning training and real-world deployment consistent while enforcing action constraints, tool use, long-term patient memory, and multi-agent coordination; a core reasoning model trained with a continuous-care reinforcement-learning framework that integrates span-level reward modeling (SPAR++), reasoning-path compression, curriculum learning, and stabilized policy optimization; and a clinical tool layer for patient-memory management, authoritative evidence-based retrieval, and multimodal medical perception across documents, X-rays, and dermatology. On a cross-dimensional medical evaluation suite, Baichuan-M4 attains leading results in static medical knowledge and safety, dynamic OSCE-style consultation, long-context clinical memory, evidence-based retrieval, medical document OCR, and multimodal image understanding, while lowering the hallucination rate to 3.3%.
Problem

Research questions and friction points this paper is trying to address.

continuous care
medical large language model
clinical-grade AI
multimodal medical understanding
long-term patient memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

continuous care
reinforcement learning
multi-agent coordination
multimodal medical perception
clinical-grade LLM
Aiyuan Yang
Baichuan AI and THUBPM Group, Tsinghua University
Chengfeng Dou
Baichuan AI and THUBPM Group, Tsinghua University
Da Pan
Baichuan AI and THUBPM Group, Tsinghua University
Dian Wang
Stanford University
Robot Learning, Robotics, Machine Learning, Geometric Deep Learning, Reinforcement Learning
Fan Yang
Tsinghua University
Mathematics, Probability, Statistics
Fei Deng
Research Scientist, Google
Diffusion Models, RLHF, Reinforcement Learning, Generative Models, Object-Centric Learning
Fei Li
Baichuan AI and THUBPM Group, Tsinghua University
Guangwei Ai
Baichuan AI and THUBPM Group, Tsinghua University
Hui Liu
Baichuan AI and THUBPM Group, Tsinghua University
Hongda Zhang
Baichuan AI and THUBPM Group, Tsinghua University
Jinyang Tai
Baichuan AI and THUBPM Group, Tsinghua University
Kai Lu
Baichuan AI and THUBPM Group, Tsinghua University
Lijun Liu
Baichuan AI and THUBPM Group, Tsinghua University
Linwei Chen
Baichuan AI and THUBPM Group, Tsinghua University
Linyu Li
Peking University
knowledge graph, ai4science
Meiqing Guo
Baichuan AI and THUBPM Group, Tsinghua University
Peidong Guo
Baichuan AI and THUBPM Group, Tsinghua University
Qiang Ju
Baichuan AI and THUBPM Group, Tsinghua University
Rihui Xin
Baichuan AI and THUBPM Group, Tsinghua University
Shuai Wang
Baichuan AI and THUBPM Group, Tsinghua University
XinKai Ma
Baichuan AI and THUBPM Group, Tsinghua University
Xudong Chen
Baichuan AI and THUBPM Group, Tsinghua University
Yichuan Mo
Ph.D. Candidate, Peking University
Trustworthy AI, Trustworthy LLM, Trustworthy Diffusion Model
Canbin Piao
Baichuan AI and THUBPM Group, Tsinghua University
Leyi Pan
Tsinghua University
LLM, AI Safety, Watermark, Post Training