RT-SDGOD: Real-Time Single-Domain Generalized Object Detection

📅 2026-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the degradation of real-time object detectors under varying weather and imaging conditions. To tackle this issue without adding inference overhead, the authors propose RT-SDGDet, a single-domain generalization framework designed for real-time detection. The method constructs object-specific query groups via one-to-many supervision, then improves the sufficiency and stability of instance-level discriminative evidence through Discriminative Evidence Diversity Learning (DEDL) and Dual-view Evidence Consistency Learning (DvECL). The work formulates single-domain generalization for real-time detectors as a task in its own right (RT-SDGOD). In the reported experiments, RT-SDGDet outperforms existing approaches across multiple unseen target domains, reducing missed detections while adding no inference latency.
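To make the idea of an object-specific query group concrete, here is a minimal sketch of one-to-many assignment. It is an illustration under stated assumptions, not the authors' procedure: the summary does not say how queries are matched to objects, so the top-k-by-IoU rule, the function names `iou` and `build_query_groups`, and the parameter `k` are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def build_query_groups(query_boxes, gt_boxes, k=2):
    """Map each ground-truth index to the k query indices that overlap it most.

    Assumption: top-k by IoU stands in for whatever matching rule the
    paper actually uses. A query may land in more than one group here.
    """
    groups = {}
    for g, gt in enumerate(gt_boxes):
        ranked = sorted(
            range(len(query_boxes)),
            key=lambda q: iou(query_boxes[q], gt),
            reverse=True,
        )
        groups[g] = ranked[:k]
    return groups


# Toy example: five predicted query boxes, two ground-truth objects.
queries = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60),
           (52, 52, 62, 62), (100, 100, 110, 110)]
objects = [(0, 0, 10, 10), (50, 50, 60, 60)]
groups = build_query_groups(queries, objects, k=2)
```

With one-to-one matching each object would be supervised through a single query; the group gives several queries per object, and these are the queries that the diversity and consistency objectives then act on.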

📝 Abstract

In real-world deployment under strict real-time constraints, weather and imaging variations induce significant distribution shifts that severely degrade detector performance. Single-Domain Generalized Object Detection aims to mitigate this issue, yet existing methods rarely investigate, at the level of problem formulation, the generalization capability of real-time detectors under such constrained inference budgets. To this end, we introduce Real-Time Single-Domain Generalized Object Detection (RT-SDGOD), which focuses on how real-time detectors can achieve cross-domain generalization with zero extra inference overhead by relying solely on training-time representation learning. We observe that, under domain shift, DETR-based real-time detectors degrade mainly through increased missed detections, which are rooted in limited and unstable object-level discriminative evidence. Based on this observation, we propose RT-SDGDet, a multi-evidence collaborative modeling framework for RT-SDGOD. The core idea is to enable multiple queries of the same object to collaboratively cover sufficient discriminative evidence while keeping the modeling of that evidence stable across views. Specifically, we use one-to-many (O2M) supervision to construct stable object-specific query groups, and further design Discriminative Evidence Diversity Learning (DEDL) to expand object-level evidence coverage and Dual-view Evidence Consistency Learning (DvECL) to improve evidence stability under appearance perturbations. Since all components are introduced only during training, our method incurs no extra inference overhead. Extensive experiments show that the proposed method achieves better generalization performance than existing approaches across multiple unseen target domains.
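The two training-time objectives can be illustrated with a small sketch. The abstract does not give the loss formulas, so everything here is an assumption chosen for illustration: diversity is measured as mean pairwise cosine similarity within a query group (lower means the queries attend to more varied evidence), and consistency as mean cosine distance between the same queries under two views of an image. The function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def diversity_loss(group):
    """Mean pairwise cosine similarity inside one query group.

    Assumed stand-in for DEDL: minimizing it pushes the queries of
    the same object toward different evidence.
    """
    n = len(group)
    if n < 2:
        return 0.0
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(group[i], group[j]) for i, j in pairs) / len(pairs)


def consistency_loss(view_a, view_b):
    """Mean (1 - cosine) between matching queries of two views.

    Assumed stand-in for DvECL: minimizing it keeps each query's
    evidence stable under appearance perturbations.
    """
    return sum(1.0 - cosine(a, b) for a, b in zip(view_a, view_b)) / len(view_a)


# Toy embeddings for one object's query group.
diverse_group = [[1.0, 0.0], [0.0, 1.0]]      # orthogonal evidence
redundant_group = [[1.0, 0.0], [1.0, 0.0]]    # identical evidence
loss_diverse = diversity_loss(diverse_group)
loss_redundant = diversity_loss(redundant_group)
loss_same_view = consistency_loss(diverse_group, diverse_group)
```

Both terms would be added to the detection loss during training only, which is consistent with the abstract's claim of no extra cost at inference: the deployed detector is unchanged.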

Problem

Research questions and friction points this paper is trying to address.

Real-Time Object Detection

Domain Generalization

Distribution Shift

Cross-Domain Generalization

Inference Budget

Innovation

Methods, ideas, or system contributions that make the work stand out.

Real-Time Object Detection

Domain Generalization

DETR