The Framework That Survives Bad Models: Human-AI Collaboration For Clinical Trials

📅 2025-10-07
🤖 AI Summary
This study addresses the risk that a degraded AI model, for example one that has collapsed to random guessing or naive prediction, can compromise the reliability of treatment effect estimation in clinical trials. To mitigate this risk, the authors propose using AI as a supporting reader (AI-SR), a human-AI collaboration framework that embeds AI as an assistive component in radiographic assessment rather than as a replacement for human readers, so that human oversight remains in place even when the AI fails badly. Evaluated on two randomized controlled trials with endpoints derived from spinal X-ray images, AI-SR provides reliable disease estimation and retains that reliability when applied to different populations. The comparison against human-only assessment and the other AI framework evaluated covers cost, accuracy, robustness, and generalization ability, and AI-SR is the approach that meets all of these criteria. Even when the injected model performs at near-random levels, AI-SR preserves the trials' treatment effect estimates and conclusions.

📝 Abstract
Artificial intelligence (AI) holds great promise for supporting clinical trials, from patient recruitment and endpoint assessment to treatment response prediction. However, deploying AI without safeguards poses significant risks, particularly when evaluating patient endpoints that directly impact trial conclusions. We compared two AI frameworks against human-only assessment for medical image-based disease evaluation, measuring cost, accuracy, robustness, and generalization ability. To stress-test these frameworks, we injected bad models, ranging from random guesses to naive predictions, and checked whether observed treatment effects remain valid even under severe model degradation. We evaluated the frameworks using two randomized controlled trials with endpoints derived from spinal X-ray images. Our findings indicate that using AI as a supporting reader (AI-SR) is the most suitable approach for clinical trials, as it meets all criteria across various model types, even with bad models. This method consistently provides reliable disease estimation, preserves clinical trial treatment effect estimates and conclusions, and retains these advantages when applied to different populations.
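The stress test described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' method: the abstract does not specify how the supporting reader is combined with human reads, so the rule used here (accept the averaged score when the human and the AI agree within a tolerance, otherwise fall back to a second human read) is a hypothetical stand-in, and every number (effect size, noise levels, score range, tolerance) is an assumption chosen for illustration only.

```python
import random
import statistics

# Assumed true difference in mean change score (treatment minus control).
TRUE_EFFECT = -0.5


def simulate_truth(rng, n_per_arm):
    """Return (arm, true_change) pairs for a simulated two-arm trial."""
    control = [("control", rng.gauss(1.0, 0.5)) for _ in range(n_per_arm)]
    treated = [("treatment", rng.gauss(1.0 + TRUE_EFFECT, 0.5))
               for _ in range(n_per_arm)]
    return control + treated


def human_read(rng, truth):
    """A human reader: the true value plus modest reading noise (assumed)."""
    return truth + rng.gauss(0.0, 0.3)


def random_guess_model(rng, truth):
    """Bad model 1: ignores the image and guesses uniformly at random."""
    return rng.uniform(0.0, 2.0)


def naive_model(rng, truth):
    """Bad model 2: always predicts 'no change'."""
    return 0.0


def ai_only(rng, truth, model):
    """Framework A: the AI score is used directly as the endpoint."""
    return model(rng, truth)


def ai_supporting_reader(rng, truth, model, tolerance=0.2):
    """Hypothetical supporting-reader rule (an assumption, not the paper's).

    The AI read is averaged with the human read only when the two agree
    within `tolerance`; otherwise a second human read decides.
    """
    primary = human_read(rng, truth)
    support = model(rng, truth)
    if abs(primary - support) <= tolerance:
        return (primary + support) / 2.0
    return human_read(rng, truth)


def treatment_effect(endpoints):
    """Difference in mean endpoint, treatment minus control."""
    treated = [y for arm, y in endpoints if arm == "treatment"]
    control = [y for arm, y in endpoints if arm == "control"]
    return statistics.mean(treated) - statistics.mean(control)


def run_stress_test(seed=0, n_per_arm=2000):
    """Estimate the treatment effect for each bad model under each framework."""
    rng = random.Random(seed)
    patients = simulate_truth(rng, n_per_arm)
    models = {"random_guess": random_guess_model, "naive": naive_model}
    frameworks = {"ai_only": ai_only, "ai_sr": ai_supporting_reader}
    results = {}
    for model_name, model in models.items():
        for fw_name, framework in frameworks.items():
            endpoints = [(arm, framework(rng, truth, model))
                         for arm, truth in patients]
            results[(model_name, fw_name)] = treatment_effect(endpoints)
    return results


if __name__ == "__main__":
    for key, effect in run_stress_test().items():
        print(key, round(effect, 3))
```

In this toy setting the AI-only estimates collapse toward zero because a bad model carries no information about the arm, while the supporting-reader estimates stay close to `TRUE_EFFECT` because a disagreeing AI read is discarded. Whether the real frameworks behave this way depends on details the abstract does not give.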
Problem

Research questions and friction points this paper is trying to address.

Developing AI frameworks resilient to model degradation
Ensuring reliable clinical trial conclusions with bad AI models
Validating treatment effects under severe model performance drops
Innovation

Methods, ideas, or system contributions that make the work stand out.

Human-AI collaboration framework for clinical trials
AI as supporting reader with bad model tolerance
Preserves treatment effect estimates under model degradation
Yao Chen
Novartis Pharmaceuticals Corporation, NJ, USA
David Ohlssen
Novartis Pharmaceuticals Corporation, NJ, USA
Aimee Readie
Novartis Pharmaceuticals Corporation, NJ, USA
Gregory Ligozio
Novartis Pharmaceuticals Corporation, NJ, USA
Ruvie Martin
Novartis Pharmaceuticals Corporation, NJ, USA
Thibaud Coroller
Researcher @ Novartis
Data science, Machine learning, Deep survival, Quantitative Imaging, Radiomics