🤖 AI Summary
Multi-agent collaborative perception faces two challenges: limited bandwidth and the lack of a theoretical foundation. This paper addresses both with a rate-distortion optimization framework grounded in information theory, giving a principled, task-driven characterization of the communication-performance trade-off. It proposes a task-entropy-based discrete encoding scheme and a mutual-information-driven message selection mechanism, which together compress and transmit visual features in a semantic-aware way with minimal redundancy. Combining discrete feature representations with neural mutual information estimation yields an end-to-end, low-overhead collaborative perception system. On the DAIR-V2X and OPV2V benchmarks, the method achieves state-of-the-art performance in both 3D object detection and BEV segmentation while reducing communication overhead by up to 108×.
📝 Abstract
Collaborative perception aims to enhance environmental understanding by enabling multiple agents to share visual information under limited bandwidth. While prior work has explored the empirical trade-off between task performance and communication volume, a significant gap remains in the theoretical foundation. To fill this gap, we draw on information theory and introduce a pragmatic rate-distortion theory for multi-agent collaboration, formulated specifically to analyze the performance-communication trade-off in goal-oriented multi-agent systems. This theory concretizes two key conditions for designing optimal communication strategies: supplying pragmatically relevant information and transmitting redundancy-less messages. Guided by these two conditions, we propose RDcomm, a communication-efficient collaborative perception framework with two key innovations: i) task entropy discrete coding, which assigns features codeword lengths according to their task relevance to maximize efficiency in supplying pragmatic information; ii) mutual-information-driven message selection, which uses mutual information neural estimation to approach the optimal redundancy-less condition. Experiments on 3D object detection and BEV segmentation demonstrate that RDcomm achieves state-of-the-art accuracy on DAIR-V2X and OPV2V while reducing communication volume by up to 108 times. The code will be released.
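To make the idea of variable codeword lengths concrete, here is a minimal Python sketch of the general entropy-coding principle that discrete coding schemes build on. It is not RDcomm's task entropy coding, whose details the abstract does not give. It assumes features have already been quantized to codebook indices and assigns each index the ideal Shannon length of -log2(p) bits from its empirical frequency. RDcomm is described as using task relevance rather than raw frequency. The function names and the sample indices are hypothetical.

```python
import math
from collections import Counter


def shannon_code_lengths(symbols):
    """Map each symbol to its ideal code length in bits, -log2(p).

    p is the empirical frequency of the symbol in `symbols`.
    Assumption: plain frequency stands in for whatever task-driven
    distribution a real system would use.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}


def average_bits(symbols):
    """Average bits per symbol under the ideal lengths (the empirical entropy)."""
    lengths = shannon_code_lengths(symbols)
    return sum(lengths[s] for s in symbols) / len(symbols)


# Hypothetical codebook indices for eight quantized feature cells.
indices = [0, 0, 0, 0, 1, 1, 2, 3]
lengths = shannon_code_lengths(indices)
avg = average_bits(indices)

# A fixed-length code over four distinct symbols needs 2 bits per symbol.
fixed_bits = math.ceil(math.log2(len(set(indices))))
```

In this toy case the frequent index gets a 1-bit codeword and the rare ones get 3 bits, so the average cost is 1.75 bits per cell against 2 bits for a fixed-length code. Savings of this kind come from matching codeword length to how often, or in RDcomm's case how usefully, a symbol occurs.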