🤖 AI Summary
Existing superpixel methods often sacrifice region regularity when leveraging deep learning for improved segmentation accuracy, thereby compromising interpretability and downstream applicability. This paper proposes SPAM (SuperPixel Anything Model), a framework that achieves high-accuracy, semantics-aware superpixel segmentation while preserving geometric regularity. Its key contributions are: (1) the first integration of a large-scale pre-trained model for semantic-agnostic segmentation into superpixel generation, decoupling semantic representation from structural modeling; (2) a joint semantic-structural optimization strategy that combines deep feature extraction with classical regularization to enhance boundary consistency; and (3) flexible support for arbitrary prior inputs, interactive object focusing, and adaptive refinement of uncertain regions. SPAM outperforms state-of-the-art methods on segmentation benchmarks, with both quantitative metrics and qualitative visualizations supporting its effectiveness. The code and pre-trained models are publicly available.
📝 Abstract
Superpixels are widely used in computer vision to simplify image representation and reduce computational complexity. While traditional methods rely on low-level features, deep learning-based approaches leverage high-level features but tend to sacrifice the regularity of superpixels to capture complex objects, leading to accurate but less interpretable segmentations. In this work, we introduce SPAM (SuperPixel Anything Model), a versatile framework for segmenting images into accurate yet regular superpixels. We train a model to extract image features for superpixel generation, and at inference, we leverage a large-scale pretrained model for semantic-agnostic segmentation to ensure that superpixels align with object masks. SPAM can handle any prior high-level segmentation, resolve uncertainty regions, and interactively focus on specific objects. Comprehensive experiments demonstrate that SPAM qualitatively and quantitatively outperforms state-of-the-art methods on segmentation tasks, making it a valuable and robust tool for various applications. Code and pre-trained models are available here: https://github.com/waldo-j/spam.
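To make the idea of superpixels that "align with object masks" concrete, the following is a minimal sketch and not SPAM's actual algorithm. It assumes a regular square grid as a stand-in for the learned superpixel model and an integer label map as a stand-in for the masks produced by a pretrained segmentation model; the function names `grid_superpixels` and `align_to_masks` are hypothetical and do not come from the SPAM repository.

```python
import numpy as np


def grid_superpixels(height, width, cell):
    """Regular square-grid label map (assumed stand-in for a learned model)."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = -(-width // cell)  # ceiling division: grid cells per row
    return rows[:, None] * n_cols + cols[None, :]


def align_to_masks(superpixels, object_masks):
    """Split each superpixel along object-mask boundaries.

    Two pixels share an output label only if they share both a
    superpixel id and a mask id, so no output region crosses a mask edge.
    """
    n_masks = int(object_masks.max()) + 1
    # Encode each (superpixel id, mask id) pair as a single integer.
    combined = superpixels.ravel() * n_masks + object_masks.ravel()
    # Relabel the pairs that actually occur with consecutive integers.
    _, labels = np.unique(combined, return_inverse=True)
    return labels.reshape(superpixels.shape)


if __name__ == "__main__":
    sp = grid_superpixels(8, 8, cell=4)       # 4 regular superpixels
    masks = np.zeros((8, 8), dtype=int)
    masks[:, 6:] = 1                          # toy "object" in the right columns
    aligned = align_to_masks(sp, masks)
    print(len(np.unique(sp)), "superpixels before,",
          len(np.unique(aligned)), "after alignment")
```

Splitting by intersection is the simplest way to guarantee mask alignment, but it can create small, irregular fragments near object boundaries; balancing that alignment against regularity is the trade-off the abstract describes.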