CoopDiff: A Diffusion-Guided Approach for Cooperation under Corruptions

📅 2026-03-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the loss of robustness and generalization that cooperative perception systems suffer when real-world sensing data is affected by diverse, unpredictable corruptions. To this end, we propose CoopDiff, the first framework to integrate diffusion models into cooperative perception, which uses a denoising mechanism to improve performance across corruption types. The approach follows a teacher-student paradigm: the teacher fuses voxel-level features weighted by Quality of Interest to generate clean supervision signals, while the student uses a dual-branch diffusion architecture with an Ego-Guided Cross-Attention mechanism to reconstruct features adaptively under degraded conditions. Evaluated on the OPV2Vn and DAIR-V2Xn benchmarks, CoopDiff outperforms existing methods, lowers the relative corruption error, and offers a flexible trade-off between accuracy and inference efficiency.
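The quality-weighted fusion idea mentioned in the summary can be illustrated with a small sketch. This is not the paper's implementation: the function name `fuse_voxel_features`, the per-agent scalar scores, and the plain normalized weighted average are all assumptions chosen to show how a quality score could down-weight a corrupted agent's contribution to a shared voxel.

```python
def fuse_voxel_features(features, quality_scores):
    """Weighted average of per-agent feature vectors for one voxel.

    features: list of equal-length feature vectors, one per agent.
    quality_scores: non-negative score per agent (a hypothetical stand-in
    for the Quality of Interest weighting named in the summary).
    """
    if len(features) != len(quality_scores):
        raise ValueError("one quality score per agent is required")
    total = sum(quality_scores)
    if total <= 0:
        raise ValueError("at least one quality score must be positive")
    # Normalize the scores so the weights sum to 1.
    weights = [q / total for q in quality_scores]
    dim = len(features[0])
    return [
        sum(w * f[d] for w, f in zip(weights, features))
        for d in range(dim)
    ]


# Two agents observe the same voxel; the second is assumed to be
# corrupted and therefore receives a lower quality score.
fused = fuse_voxel_features([[1.0, 2.0], [5.0, 6.0]], [0.75, 0.25])
```

With these illustrative scores the fused feature stays closer to the first agent's observation, which is the behavior a quality-aware weighting is meant to produce.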

📝 Abstract

Cooperative perception lets agents share information to expand coverage and improve scene understanding. However, in real-world scenarios, diverse and unpredictable corruptions undermine its robustness and generalization. To address these challenges, we introduce CoopDiff, a diffusion-based cooperative perception framework that mitigates corruptions via a denoising mechanism. CoopDiff adopts a teacher-student paradigm: the Quality-Aware Teacher performs voxel-level early fusion with Quality of Interest weighting and semantic guidance, then produces clean supervision features via a diffusion denoiser. The Dual-Branch Diffusion Student first encodes the ego and cooperative streams separately to reconstruct the teacher's clean targets. An Ego-Guided Cross-Attention mechanism then adaptively integrates ego and cooperative features, enabling balanced decoding under degradation. We evaluate CoopDiff on two constructed multi-degradation benchmarks, OPV2Vn and DAIR-V2Xn, each incorporating six corruption types that span environmental and sensor-level distortions. Benefiting from the inherent denoising properties of diffusion, CoopDiff consistently outperforms prior methods across all degradation types and lowers the relative corruption error. Furthermore, it offers a tunable balance between precision and inference efficiency.
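To make the Ego-Guided Cross-Attention step more concrete, here is a minimal sketch of generic scaled dot-product cross-attention in which queries come from the ego stream and keys and values come from the cooperative stream. The abstract does not specify the mechanism's internals, so the function name `ego_guided_cross_attention`, the omission of learned projections, and the residual connection are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def ego_guided_cross_attention(ego_tokens, coop_tokens):
    """Each ego token attends over all cooperative tokens.

    Queries come from the ego stream; keys and values come from the
    cooperative stream. Learned projections are omitted for brevity, and
    the residual connection is an assumption, not taken from the paper.
    """
    dim = len(ego_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    output = []
    for query in ego_tokens:
        # Scaled dot-product similarity between this ego token and each
        # cooperative token.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in coop_tokens
        ]
        weights = softmax(scores)
        attended = [
            sum(w * value[d] for w, value in zip(weights, coop_tokens))
            for d in range(dim)
        ]
        # Residual: keep the ego feature and add the cooperative context.
        output.append([q + a for q, a in zip(query, attended)])
    return output


ego = [[1.0, 0.0], [0.0, 1.0]]
coop = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = ego_guided_cross_attention(ego, coop)
```

Because the ego features form the queries, each ego token decides how much of each cooperative token to absorb, which is one plausible reading of "adaptively integrating ego and cooperative features".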

Problem

Research questions and friction points this paper is trying to address.

cooperative perception

corruptions

robustness

generalization

degradation

Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion-based denoising

cooperative perception

teacher-student paradigm