Training-Free Out-Of-Distribution Segmentation With Foundation Models

📅 2025-10-03
🤖 AI Summary
This work addresses out-of-distribution (OoD) region detection in semantic segmentation without additional training, fine-tuning, or OoD annotations. The method takes deep features from an InternImage-L backbone that has already been fine-tuned on a segmentation dataset, groups them with unsupervised K-Means clustering, and identifies OoD clusters by thresholding confidence computed from the raw decoder logits. The results suggest that general-purpose vision foundation models already encode information that separates in-distribution from OoD regions. On the RoadAnomaly and ADE-OoD benchmarks, the approach reaches 50.02 and 48.77 Average Precision, respectively, surpassing several supervised and unsupervised baselines. Because it needs no outlier supervision or additional data, the approach points toward generic OoD segmentation methods built on minimal assumptions.

📝 Abstract
Detecting unknown objects in semantic segmentation is crucial for safety-critical applications such as autonomous driving. Large vision foundation models, including DINOv2, InternImage, and CLIP, have advanced visual representation learning by providing rich features that generalize well across diverse tasks. While their strength in closed-set semantic tasks is established, their capability to detect out-of-distribution (OoD) regions in semantic segmentation remains underexplored. In this work, we investigate whether foundation models fine-tuned on segmentation datasets can inherently distinguish in-distribution (ID) from OoD regions without any outlier supervision. We propose a simple, training-free approach that utilizes features from the InternImage backbone and applies K-Means clustering alongside confidence thresholding on raw decoder logits to identify OoD clusters. Our method achieves 50.02 Average Precision on the RoadAnomaly benchmark and 48.77 on the benchmark of ADE-OoD with InternImage-L, surpassing several supervised and unsupervised baselines. These results suggest a promising direction for generic OoD segmentation methods that require minimal assumptions or additional data.
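The abstract reports results as Average Precision (AP). As a rough illustration of how a pixel-level AP can be computed from per-pixel OoD scores and binary ground-truth labels, here is a minimal sketch. It uses the common step-wise definition AP = Σ (Rₙ − Rₙ₋₁)·Pₙ and assumes all scores are distinct (ties are not grouped). The function name and the toy inputs are hypothetical, and the benchmarks' official evaluation code may differ in details.

```python
def average_precision(scores, labels):
    """Step-wise AP over pixels ranked by descending OoD score.

    scores: per-pixel OoD scores (higher means more likely OoD).
    labels: per-pixel ground truth, 1 for OoD and 0 for in-distribution.
    Assumption: scores are distinct, so ties need no special handling.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("AP is undefined without any positive (OoD) pixels")
    # Rank pixel indices from most to least OoD-like.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    precision_sum = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i]:
            true_pos += 1
            # Each newly recalled positive contributes precision at its rank.
            precision_sum += true_pos / rank
    return precision_sum / total_pos


# Toy example: four pixels, two of them truly OoD.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
toy_ap = average_precision(toy_scores, toy_labels)
```

In this toy case the two OoD pixels are ranked first and third, so the AP is the mean of the precisions 1/1 and 2/3.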
Problem

Research questions and friction points this paper is trying to address.

Detecting unknown objects in semantic segmentation for autonomous driving
Exploring foundation models' capability to identify out-of-distribution regions
Developing training-free OoD segmentation without outlier supervision
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses InternImage backbone features for OoD detection
Applies K-Means clustering to identify outlier regions
Employs confidence thresholding on decoder logits
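The three ingredients above can be sketched as a small, dependency-free toy pipeline. This is an illustration under stated assumptions, not the authors' implementation: the summary does not specify how clusters and confidences are combined, so the sketch assumes that a cluster is flagged as OoD when the mean of its pixels' maximum raw logit falls below a threshold. The function names, the farthest-point initialization (chosen for determinism), the value of `k`, and the threshold are all hypothetical. In practice the features would come from the InternImage backbone and the clustering from an optimized library rather than pure Python.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_labels(points, k, iters=10):
    """Lloyd's K-Means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def ood_mask(features, logits, k=2, threshold=4.0):
    """Assumed rule: flag clusters whose mean max-logit is below threshold."""
    labels = kmeans_labels(features, k)
    confidence = [max(row) for row in logits]  # raw max logit per pixel
    flagged = set()
    for c in range(k):
        scores = [s for s, l in zip(confidence, labels) if l == c]
        if scores and sum(scores) / len(scores) < threshold:
            flagged.add(c)
    return [l in flagged for l in labels]


# Toy data: six confident "ID" pixels and three low-confidence "OoD" pixels.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
logits = [[8.0, 1.0]] * 6 + [[1.0, 0.5]] * 3
mask = ood_mask(features, logits, k=2, threshold=4.0)
```

Thresholding at the cluster level rather than per pixel means that a few confident pixels inside an otherwise uncertain region do not punch holes in the OoD mask, which is one plausible reason to pair clustering with logit confidence.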
Laith Nayal
Laboratory of Multimodal Research In Industry, AI Institute, Innopolis University
Hadi Salloum
Phystech School of Applied Mathematics and Computer Science, Moscow Institute of Physics and Technology, Institutsky lane 9, Dolgoprudny, Moscow region, 141700
Ahmad Taha
Lecturer (Assistant Professor), University of Glasgow
Cyber-Physical Energy Systems, Internet of Things, Healthcare Technologies
Yaroslav Kholodov
Full professor of Innopolis University
Data analysis, Intelligent transportation systems, Numerical methods, Applied mathematics
Alexander Gasnikov
Innopolis University
Convex optimization, AI