Exploring Patient Data Requirements in Training Effective AI Models for MRI-Based Breast Cancer Classification

📅 2025-02-22
🏛️ Deep-Breath@MICCAI
🤖 AI Summary
This study asks how small a clinically feasible dataset can be while still training an effective MRI-based AI model for breast cancer classification. Rather than evaluating by image count, as is conventional, it proposes a patient-level framework for analyzing data requirements and introduces a new metric, the “effective patient count.” Using a multicenter MRI dataset, the methodology combines few-shot learning, cross-site robustness evaluation, and uncertainty-driven data importance ranking to quantify how dataset size, lesion diversity, and annotation quality affect model generalizability. The reported results indicate that 80–120 high-quality, expert-annotated patients suffice for models to exceed 92% AUC on external multi-institutional validation, which substantially lowers the data acquisition barrier for clinical deployment. The core contribution is a reproducible, patient-centric paradigm for evaluating data efficiency, offering empirically grounded guidance for data curation and resource planning in medical imaging AI.
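
To make the patient-level idea concrete, here is a minimal sketch of subsampling a training set by patient rather than by image, so that all images from a chosen patient stay together. It is an illustration only, not the authors' pipeline: the function name `sample_patient_subset`, its parameters, and the flat list of per-image patient IDs are all assumptions made for this example.

```python
import random
from collections import defaultdict


def sample_patient_subset(image_patient_ids, n_patients, seed=0):
    """Pick n_patients at random and return the indices of all their images.

    image_patient_ids: one patient ID per image (hypothetical input layout).
    Returns (sorted image indices, set of chosen patient IDs).
    """
    # Group image indices by the patient they belong to.
    by_patient = defaultdict(list)
    for idx, pid in enumerate(image_patient_ids):
        by_patient[pid].append(idx)

    if n_patients > len(by_patient):
        raise ValueError("requested more patients than are available")

    # Sample whole patients, never individual images, so no patient is
    # split across the selected and unselected pools.
    rng = random.Random(seed)
    chosen = set(rng.sample(sorted(by_patient), n_patients))
    indices = sorted(i for pid in chosen for i in by_patient[pid])
    return indices, chosen
```

Calling this repeatedly with increasing `n_patients` (and several seeds) would give the training subsets for a performance-versus-patient-count curve, under the assumption that each image carries a reliable patient ID.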

Problem

Research questions and friction points this paper is trying to address.

Determine data quantity for AI training
Optimize MRI-based breast cancer detection
Assess impact of patient count on model performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes foundation models effectively
Trains AI with minimal MRI data
Enhances performance with simple ensembles

Authors

Solha Kang
Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Republic of Korea
Wesley De Neve
Center for Biosystems and Biotech Data Science, Ghent University Global Campus, Republic of Korea; Department of Electronics and Information Systems, Ghent University, Belgium
François Rameau
Assistant Professor of Computer Science, The State University of New York - SUNY Korea
Computer Vision
Utku Ozbulak
Research Professor at Ghent University
Trustworthy AI, Medical imaging, Biomedical imaging, Self-supervised learning