Good Enough: Is it Worth Improving your Label Quality?

📅 2025-05-27
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
High-quality annotations in medical image segmentation are costly to acquire, yet their practical benefits remain unclear. Method: We systematically evaluate how label quality affects model performance and pretraining efficacy by generating multi-tier pseudo-label CT datasets using nnU-Net, TotalSegmentator, and MedSAM; we conduct controlled experiments, cross-quality attribution analysis, and pretraining ablation studies. Contribution/Results: We identify, for the first time, a distinct performance threshold: in-domain segmentation accuracy improves significantly only when label quality exceeds a minimal threshold; conversely, pretraining efficacy remains largely invariant to label quality, indicating models rely more on general anatomical priors than fine-grained annotations. Consequently, low-quality pseudo-labels suffice for effective pretraining; manual refinement yields measurable gains only when downstream tasks demand high boundary precision *and* label quality crosses this critical threshold—providing empirical guidance for optimal annotation resource allocation.
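As an illustration of the pseudo-labeling step described above, the sketch below runs one off-the-shelf tool over a folder of CT volumes to produce a single pseudo-label tier. This is a minimal sketch rather than the authors' pipeline: it assumes the TotalSegmentator command-line tool is installed and accepts the documented `-i`/`-o` input and output arguments, and the `ct_volumes/` and `pseudo_labels/` paths are placeholders.

```python
# Minimal sketch: build one pseudo-label tier for a folder of CT scans by
# calling the TotalSegmentator CLI. Paths are placeholders; the paper also
# uses nnU-Net and MedSAM to produce additional tiers.
import subprocess
from pathlib import Path

CT_DIR = Path("ct_volumes")                        # placeholder: input CT scans (*.nii.gz)
OUT_DIR = Path("pseudo_labels/totalsegmentator")   # placeholder: one quality tier

for ct_path in sorted(CT_DIR.glob("*.nii.gz")):
    case_out = OUT_DIR / ct_path.name.replace(".nii.gz", "")
    case_out.mkdir(parents=True, exist_ok=True)
    # Assumes the TotalSegmentator CLI is installed; -i / -o are its documented
    # input and output arguments. Each call writes per-structure masks.
    subprocess.run(
        ["TotalSegmentator", "-i", str(ct_path), "-o", str(case_out)],
        check=True,
    )
```

Repeating the same loop with a different model yields the other pseudo-label tiers whose quality the study compares.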

📝 Abstract
Improving label quality in medical image segmentation is costly, but its benefits remain unclear. We systematically evaluate its impact using multiple pseudo-labeled versions of CT datasets, generated by models such as nnU-Net, TotalSegmentator, and MedSAM. Our results show that higher-quality labels improve in-domain performance, but the gains remain unclear when label quality stays below a small threshold. For pre-training, label quality has minimal impact, suggesting that models transfer general concepts rather than detailed annotations. These findings provide guidance on when improving label quality is worth the effort.
Problem

Research questions and friction points this paper is trying to address.

Evaluating the impact of label quality on medical image segmentation
Assessing the benefits of high-quality labels for in-domain performance
Exploring the influence of label quality on model pre-training outcomes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematically evaluate the impact of label quality (a Dice-based quality-scoring sketch follows this list)
Use pseudo-labeled CT datasets of varying quality
Show that label quality has minimal impact on pre-training
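To make "label quality" measurable, a common way to place a pseudo-labeled case into a quality tier is to score its mask against a reference annotation with the Dice coefficient. The following is a minimal sketch, not the paper's evaluation code; the file paths, the organ label id, and the 0.8 tier cut-off are hypothetical values chosen for illustration.

```python
# Minimal sketch: quantify pseudo-label quality as Dice overlap with a
# reference mask. Paths, the organ label id, and the cut-off are hypothetical.
import nibabel as nib
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice coefficient between two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 1.0 if denom == 0 else float(2.0 * np.logical_and(pred, ref).sum() / denom)

# Load a pseudo-label and the corresponding reference segmentation.
pseudo = nib.load("pseudo_labels/case_001.nii.gz").get_fdata()
reference = nib.load("reference_labels/case_001.nii.gz").get_fdata()

ORGAN_ID = 1  # hypothetical integer label for the structure of interest
score = dice(pseudo == ORGAN_ID, reference == ORGAN_ID)

# Hypothetical tiering rule: 0.8 Dice as the boundary between tiers.
tier = "higher-quality" if score >= 0.8 else "lower-quality"
print(f"Dice = {score:.3f} -> {tier} tier")
```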
🔎 Similar Papers
No similar papers found.