Quickly Tuning Foundation Models for Image Segmentation

📅 2025-08-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited zero-shot segmentation performance of foundation models such as SAM on domain-specific images and the high cost of manual fine-tuning, this paper proposes QTT-SEG, the first framework to integrate meta-learning into automated SAM adaptation. QTT-SEG jointly models fine-tuning performance and computational cost, combining meta-learning and AutoML-style hyperparameter search to identify a high-performing fine-tuning configuration within three minutes from a search space of over 200 million candidates. Fully automated and requiring no domain-expert intervention, it supports both binary and multi-class segmentation tasks. On eight binary and five multi-class benchmarks, QTT-SEG significantly outperforms SAM's zero-shot baseline and surpasses AutoGluon Multimodal on most tasks, demonstrating its efficiency, generality, and degree of automation.

📝 Abstract
Foundation models like SAM (Segment Anything Model) exhibit strong zero-shot image segmentation performance, but often fall short on domain-specific tasks. Fine-tuning these models typically requires significant manual effort and domain expertise. In this work, we introduce QTT-SEG, a meta-learning-driven approach for automating and accelerating the fine-tuning of SAM for image segmentation. Built on the Quick-Tune hyperparameter optimization framework, QTT-SEG predicts high-performing configurations using meta-learned cost and performance models, efficiently navigating a search space of over 200 million possibilities. We evaluate QTT-SEG on eight binary and five multiclass segmentation datasets under tight time constraints. Our results show that QTT-SEG consistently improves upon SAM's zero-shot performance and surpasses AutoGluon Multimodal, a strong AutoML baseline, on most binary tasks within three minutes. On multiclass datasets, QTT-SEG delivers consistent gains as well. These findings highlight the promise of meta-learning in automating model adaptation for specialized segmentation tasks. Code available at: https://github.com/ds-brx/QTT-SEG/
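The abstract describes the core loop: meta-learned cost and performance models score candidate fine-tuning configurations, and the framework picks promising ones under a tight time budget rather than evaluating all 200+ million possibilities. A minimal sketch of that cost-aware selection idea is below. All names (`predicted_score`, `predicted_cost`, `sample_config`, the hyperparameters in the toy search space) are illustrative assumptions, not the actual QTT-SEG implementation; the real system trains its predictors on prior fine-tuning runs via the Quick-Tune framework.

```python
import random

# Hypothetical stand-ins for QTT-SEG's meta-learned predictors. In the real
# framework these are models trained on outcomes of prior fine-tuning runs.
def predicted_score(cfg):
    # Toy surrogate: favors a moderate learning rate and more epochs.
    return 1.0 - abs(cfg["lr"] - 1e-4) * 1000 + 0.01 * cfg["epochs"]

def predicted_cost(cfg):
    # Toy surrogate: predicted fine-tuning cost in seconds grows with epochs.
    return 5.0 + 2.0 * cfg["epochs"]

def sample_config(rng):
    # Illustrative search space; the paper's space has >200M configurations.
    return {
        "lr": rng.choice([1e-5, 5e-5, 1e-4, 5e-4]),
        "epochs": rng.randint(1, 10),
        "freeze_encoder": rng.choice([True, False]),
    }

def select_configs(budget_s, n_candidates=1000, seed=0):
    """Greedily pick configs by predicted score until the cost budget is spent."""
    rng = random.Random(seed)
    candidates = [sample_config(rng) for _ in range(n_candidates)]
    candidates.sort(key=predicted_score, reverse=True)
    chosen, spent = [], 0.0
    for cfg in candidates:
        cost = predicted_cost(cfg)
        if spent + cost <= budget_s:
            chosen.append(cfg)
            spent += cost
    return chosen, spent

# A 3-minute budget, mirroring the paper's time constraint.
configs, spent = select_configs(budget_s=180)
print(f"selected {len(configs)} configs, predicted cost {spent:.1f}s")
```

The actual system would then run the selected fine-tuning configurations and return the best resulting SAM checkpoint; the sketch only shows how joint score/cost prediction turns an intractable search into a cheap ranking problem.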
Problem

Research questions and friction points this paper is trying to address.

Automating fine-tuning of foundation models for domain-specific image segmentation
Reducing manual effort and expertise needed for model adaptation
Accelerating hyperparameter optimization in large search spaces
Innovation

Methods, ideas, or system contributions that make the work stand out.

QTT-SEG: the first framework to apply meta-learning to automated SAM fine-tuning
Meta-learned cost and performance models predict high-performing fine-tuning configurations
Efficient navigation of a search space of over 200 million configurations within a three-minute budget