🤖 AI Summary
Biomedical large language models (LLMs) must align several capabilities at once (domain knowledge, reasoning, instruction following, and system integration), yet these capabilities often interfere with one another, and the resulting models lack sufficient clinical safety. To address this, the paper proposes BalancedBio. Its contributions are threefold: (1) the first formal Biomedical Multi-Capability Convergence Theorem; (2) a medical-knowledge-grounded synthetic data generation method that integrates clinical workflow constraints and ontology-based validation; and (3) capability-aware group relative policy optimization, which combines hybrid rule-based and model-based reward shaping with Pareto-optimality analysis to maintain gradient orthogonality and clinical safety. Evaluated on a 0.5B-parameter model, BalancedBio achieves state-of-the-art performance in its parameter class: 80.95% domain knowledge accuracy, a 61.94% reasoning score, 67.95% instruction-following fidelity, and 86.7% capability integration. In real-world deployment, it reduces operational costs by 78%, improves diagnostic accuracy by 23%, and attains 89% clinician acceptance.
📝 Abstract
BalancedBio is a theoretically grounded framework for parameter-efficient biomedical reasoning that addresses multi-capability integration in domain-specific AI alignment. It establishes the Biomedical Multi-Capability Convergence Theorem, which proves that orthogonal gradient spaces are essential for preventing capability interference and thus for safe deployment. Key innovations include: (1) Medical Knowledge Grounded Synthetic Generation (MKGSG), which extends Source2Synth with clinical workflow constraints and medical ontology validation to ensure factual accuracy and safety; and (2) Capability Aware Group Relative Policy Optimization, which derives an optimal hybrid reward weighting that maintains orthogonality during reinforcement learning (RL), using a reward model that combines rule-based and model-based scores adapted to biomedical tasks. Mathematical analysis proves Pareto-optimal convergence, preserving performance across capabilities. BalancedBio achieves state-of-the-art results in its parameter class: domain expertise (80.95% on BIOMED-MMLU, +15.32% over baseline), reasoning (61.94%, +7.75%), instruction following (67.95%, +6.44%), and integration (86.7%, +18.5%). Theoretical safety guarantees include bounds on capability preservation and clinical accuracy. Real-world deployment yields a 78% cost reduction, a 23% improvement in diagnostic accuracy, and 89% clinician acceptance. This work provides a principled methodology for biomedical AI alignment, enabling efficient reasoning with essential safety and reliability. The 0.5B model version is to be released.
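The abstract does not give the reward formula, so the following is only a minimal sketch of how a hybrid rule-based and model-based reward could feed group-relative advantages in a GRPO-style update. The function names, the linear blend, and the weight `alpha` are assumptions made for illustration; they are not taken from the paper, whose derived optimal weighting is not stated here.

```python
from statistics import mean, pstdev


def hybrid_reward(rule_score, model_score, alpha=0.5):
    """Blend a rule-based score with a model-based score.

    `alpha` is a hypothetical mixing weight (an assumption, not the
    paper's derived optimal weighting).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * rule_score + (1.0 - alpha) * model_score


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of responses to the same prompt.

    Each advantage is the reward minus the group mean, divided by the
    group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative scores for four sampled responses to one prompt:
# (rule_score, model_score), both assumed to lie in [0, 1].
scores = [(1.0, 0.8), (0.0, 0.6), (1.0, 0.4), (0.0, 0.2)]
rewards = [hybrid_reward(r, m, alpha=0.5) for r, m in scores]
advantages = group_relative_advantages(rewards)
```

Because the advantages are centered within each group, responses are ranked only against other samples for the same prompt, which is why this style of update needs no separate value network.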