Do Clinical Models Change Treatment Decisions?

📅 2026-05-27

🤖 AI Summary
This work addresses a limitation in how clinical large language models are evaluated: conventional medical question-answering benchmarks do not test whether a model changes its treatment decision when the patient context changes. To close this gap, the authors propose ClinPivot, an auditable benchmark that builds pivoted patient contexts from biomedical relations and checks whether a model's treatment choice shifts when a new clinical constraint changes the action space. Frontier models and task-adapted Qwen variants often fail to change decisions correctly, and model rankings shift across evaluation regimes, so strong medical QA accuracy does not reliably predict decision-making performance. On the training side, decision-structured supervision improves both pivot-sensitive decision-making and medical QA under matched knowledge budgets, while lightweight replay reduces the loss of general assistant ability.
📝 Abstract
Clinical foundation models are evaluated with factual or exam-style medical QA, but treatment decisions must change when patient context changes. We introduce ClinPivot, an auditable treatment-decision benchmark built from biomedical relations and pivoted patient contexts. ClinPivot asks whether models change treatment choices when new clinical constraints shift the action space. We find that strong medical QA performance does not reliably predict decision-making performance: frontier models and task-adapted Qwen variants often fail to change decisions correctly, and model rankings shift across evaluation regimes. Decision-structured supervision improves pivot-sensitive decision-making and medical QA under matched knowledge budgets, while lightweight replay reduces losses in general assistant ability.
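The pivot idea in the abstract can be made concrete with a small scoring sketch. This is not ClinPivot's actual data format or metric: the `PivotPair` schema, the metric names, and the placeholder treatments below are all assumptions made for illustration. It only shows why accuracy on base contexts can look perfect while a model never changes its decision after a pivot.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PivotPair:
    """One base/pivoted context pair (hypothetical schema, not ClinPivot's)."""
    base_context: str
    pivoted_context: str
    options: tuple[str, ...]
    base_answer: str
    pivoted_answer: str


# A "model" here is any callable mapping (context, options) to one option.
Chooser = Callable[[str, Sequence[str]], str]


def score_pivot_pairs(pairs: Sequence[PivotPair], choose: Chooser) -> dict[str, float]:
    """Score a chooser on base contexts, pivoted contexts, and both together."""
    if not pairs:
        raise ValueError("need at least one pivot pair")
    base_hits = pivot_hits = pair_hits = 0
    for pair in pairs:
        base_ok = choose(pair.base_context, pair.options) == pair.base_answer
        pivot_ok = choose(pair.pivoted_context, pair.options) == pair.pivoted_answer
        base_hits += base_ok
        pivot_hits += pivot_ok
        # A pair counts only if the decision is right before AND after the pivot.
        pair_hits += base_ok and pivot_ok
    n = len(pairs)
    return {
        "base_accuracy": base_hits / n,
        "pivot_accuracy": pivot_hits / n,
        "pair_accuracy": pair_hits / n,
    }


# Toy data with placeholder treatments; no real clinical content is implied.
PAIRS = [
    PivotPair(
        base_context="Patient with condition X.",
        pivoted_context="Patient with condition X and a contraindication to treatment_A.",
        options=("treatment_A", "treatment_B"),
        base_answer="treatment_A",
        pivoted_answer="treatment_B",
    )
]


def static_chooser(context: str, options: Sequence[str]) -> str:
    """Ignores the context and always returns the first option."""
    return options[0]


def context_aware_chooser(context: str, options: Sequence[str]) -> str:
    """Switches treatment when the contraindication appears in the context."""
    if "contraindication to treatment_A" in context:
        return "treatment_B"
    return "treatment_A"


static_scores = score_pivot_pairs(PAIRS, static_chooser)
aware_scores = score_pivot_pairs(PAIRS, context_aware_chooser)
```

In this toy setup the static chooser is correct on the base context but never changes its answer, so its pair-level score is zero, while the context-aware chooser is correct on both sides of the pivot. Requiring both sides of a pair to be correct is one possible design choice; the paper may define its metrics differently.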
Problem

Research questions and friction points this paper is trying to address.

clinical decision-making
treatment decisions
foundation models
patient context
medical QA
Innovation

Methods, ideas, or system contributions that make the work stand out.

ClinPivot
treatment decision
clinical foundation models
decision-structured supervision
pivot-sensitive evaluation