Quickly Tuning Foundation Models for Image Segmentation

📅 2025-08-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited zero-shot segmentation performance of foundation models such as SAM on domain-specific images and the high cost of manual fine-tuning, this paper proposes QTT-SEG, the first framework to integrate meta-learning into automated SAM adaptation. QTT-SEG jointly models fine-tuning performance and computational cost, combining meta-learning and AutoML-style hyperparameter search to identify a high-performing fine-tuning configuration within three minutes from a search space of over 200 million candidates. Fully automated and requiring no domain-expert intervention, it supports both binary and multi-class segmentation tasks. On eight binary and five multi-class benchmarks, QTT-SEG significantly outperforms SAM's zero-shot baseline and surpasses AutoGluon Multimodal on most tasks, demonstrating its efficiency, generality, and degree of automation.

📝 Abstract
Foundation models like SAM (Segment Anything Model) exhibit strong zero-shot image segmentation performance, but often fall short on domain-specific tasks. Fine-tuning these models typically requires significant manual effort and domain expertise. In this work, we introduce QTT-SEG, a meta-learning-driven approach for automating and accelerating the fine-tuning of SAM for image segmentation. Built on the Quick-Tune hyperparameter optimization framework, QTT-SEG predicts high-performing configurations using meta-learned cost and performance models, efficiently navigating a search space of over 200 million possibilities. We evaluate QTT-SEG on eight binary and five multiclass segmentation datasets under tight time constraints. Our results show that QTT-SEG consistently improves upon SAM's zero-shot performance and surpasses AutoGluon Multimodal, a strong AutoML baseline, on most binary tasks within three minutes. On multiclass datasets, QTT-SEG delivers consistent gains as well. These findings highlight the promise of meta-learning in automating model adaptation for specialized segmentation tasks. Code available at: https://github.com/ds-brx/QTT-SEG/
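The abstract describes the core loop: meta-learned cost and performance models score candidate fine-tuning configurations, and the framework picks promising ones under a tight time budget rather than evaluating all 200+ million possibilities. A minimal sketch of that cost-aware selection idea is below. All names (`predicted_score`, `predicted_cost`, `sample_config`, the hyperparameters in the toy search space) are illustrative assumptions, not the actual QTT-SEG implementation; the real system trains its predictors on prior fine-tuning runs via the Quick-Tune framework.

```python
import random

# Hypothetical stand-ins for QTT-SEG's meta-learned predictors. In the real
# framework these are models trained on outcomes of prior fine-tuning runs.
def predicted_score(cfg):
    # Toy surrogate: favors a moderate learning rate and more epochs.
    return 1.0 - abs(cfg["lr"] - 1e-4) * 1000 + 0.01 * cfg["epochs"]

def predicted_cost(cfg):
    # Toy surrogate: predicted fine-tuning cost in seconds grows with epochs.
    return 5.0 + 2.0 * cfg["epochs"]

def sample_config(rng):
    # Illustrative search space; the paper's space has >200M configurations.
    return {
        "lr": rng.choice([1e-5, 5e-5, 1e-4, 5e-4]),
        "epochs": rng.randint(1, 10),
        "freeze_encoder": rng.choice([True, False]),
    }

def select_configs(budget_s, n_candidates=1000, seed=0):
    """Greedily pick configs by predicted score until the cost budget is spent."""
    rng = random.Random(seed)
    candidates = [sample_config(rng) for _ in range(n_candidates)]
    candidates.sort(key=predicted_score, reverse=True)
    chosen, spent = [], 0.0
    for cfg in candidates:
        cost = predicted_cost(cfg)
        if spent + cost <= budget_s:
            chosen.append(cfg)
            spent += cost
    return chosen, spent

# A 3-minute budget, mirroring the paper's time constraint.
configs, spent = select_configs(budget_s=180)
print(f"selected {len(configs)} configs, predicted cost {spent:.1f}s")
```

The actual system would then run the selected fine-tuning configurations and return the best resulting SAM checkpoint; the sketch only shows how joint score/cost prediction turns an intractable search into a cheap ranking problem.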
Problem

Research questions and friction points this paper is trying to address.

Automating fine-tuning of foundation models for domain-specific image segmentation
Reducing manual effort and expertise needed for model adaptation
Accelerating hyperparameter optimization in large search spaces
Innovation

Methods, ideas, or system contributions that make the work stand out.

QTT-SEG: the first framework to apply meta-learning to automated SAM fine-tuning
Meta-learned cost and performance models predict high-performing fine-tuning configurations
Efficient navigation of a search space of over 200 million configurations within a three-minute budget