Do Clinical Models Change Treatment Decisions?

📅 2026-05-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work targets a gap in how clinical large language models are evaluated: conventional medical question-answering benchmarks do not test whether a model revises its treatment decision when the patient context changes. The authors introduce ClinPivot, an auditable treatment-decision benchmark built from biomedical relations and pivoted patient contexts, which checks whether a model changes its treatment choice when a new clinical constraint shifts the available actions. Separately from the benchmark, they study a training recipe with two parts: decision-structured supervision, which improves pivot-sensitive decision-making and medical QA under matched knowledge budgets, and lightweight replay, which reduces the loss of general assistant ability. In their experiments, frontier models and task-adapted Qwen variants often fail to change decisions correctly, and model rankings shift across evaluation regimes, so strong medical QA accuracy does not reliably predict treatment decision-making performance.

📝 Abstract

Clinical foundation models are evaluated with factual or exam-style medical QA, but treatment decisions must change when patient context changes. We introduce ClinPivot, an auditable treatment-decision benchmark built from biomedical relations and pivoted patient contexts. ClinPivot asks whether models change treatment choices when new clinical constraints shift the action space. We find that strong medical QA performance does not reliably predict decision-making performance: frontier models and task-adapted Qwen variants often fail to change decisions correctly, and model rankings shift across evaluation regimes. Decision-structured supervision improves pivot-sensitive decision-making and medical QA under matched knowledge budgets, while lightweight replay reduces losses in general assistant ability.
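
To make the idea of a pivoted patient context concrete, the following is a minimal sketch of how pivot sensitivity could be scored. It is not ClinPivot's actual data format or metric: the `PivotPair` fields, the `score_pivot_pairs` function, the metric names, and the placeholder options `drug_A` and `drug_B` are all hypothetical and chosen only for illustration. The sketch assumes each item pairs a base context with a pivoted context over the same option set, and that a model is any callable mapping a context and options to one chosen option.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class PivotPair:
    """Hypothetical item: one patient context before and after a pivot."""
    base_context: str
    pivoted_context: str
    options: tuple[str, ...]
    base_answer: str      # reference choice for the base context
    pivoted_answer: str   # reference choice once the new constraint applies


# A "model" here is any callable that picks one option for a context.
Chooser = Callable[[str, Sequence[str]], str]


def score_pivot_pairs(pairs: Sequence[PivotPair], choose: Chooser) -> dict[str, float]:
    """Score base accuracy, pivot accuracy, joint accuracy, and stuck rate."""
    if not pairs:
        raise ValueError("need at least one pivot pair")
    base_ok = pivot_ok = both_ok = stuck = 0
    for pair in pairs:
        base_choice = choose(pair.base_context, pair.options)
        pivot_choice = choose(pair.pivoted_context, pair.options)
        base_hit = base_choice == pair.base_answer
        pivot_hit = pivot_choice == pair.pivoted_answer
        base_ok += base_hit
        pivot_ok += pivot_hit
        both_ok += base_hit and pivot_hit
        # "Stuck": the reference answer changed but the model's choice did not.
        stuck += pivot_choice == base_choice and pair.pivoted_answer != pair.base_answer
    n = len(pairs)
    return {
        "base_accuracy": base_ok / n,
        "pivot_accuracy": pivot_ok / n,
        "pair_accuracy": both_ok / n,
        "stuck_rate": stuck / n,
    }


if __name__ == "__main__":
    # Placeholder contexts and options; no real clinical guidance is implied.
    pairs = [
        PivotPair(
            base_context="Patient with condition X, no contraindications.",
            pivoted_context="Patient with condition X, contraindication to drug_A.",
            options=("drug_A", "drug_B"),
            base_answer="drug_A",
            pivoted_answer="drug_B",
        )
    ]

    def always_first(context: str, options: Sequence[str]) -> str:
        return options[0]

    print(score_pivot_pairs(pairs, always_first))
```

A chooser that ignores the context, such as `always_first` above, can answer the base case correctly and still fail every pivot. That is the kind of gap between QA-style accuracy and decision-making that the abstract describes, and reporting joint accuracy alongside a stuck rate is one way to surface it.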

Problem

Research questions and friction points this paper is trying to address.

clinical decision-making

treatment decisions

foundation models

patient context

medical QA

Innovation

Methods, ideas, or system contributions that make the work stand out.

ClinPivot

treatment decision

clinical foundation models