🤖 AI Summary
Problem: BEV perception faces reliability bottlenecks in safety-critical scenarios, such as occlusion, adverse weather, and dynamic traffic, which hinder its deployment in autonomous driving. Method: This paper presents the first systematic survey of BEV perception from a functional-safety perspective, tracing its evolution through three phases: single-modality vehicle-side, multimodal vehicle-side, and multi-agent collaborative perception. It proposes a unified taxonomy of open-world challenges, identifying core issues such as sensor degradation, unknown-class recognition, and low-latency coordination. The survey further covers multi-sensor fusion, open-set recognition, label-free learning, degradation-resilient modeling, and low-latency vehicle–road–cloud communication, alongside a review of mainstream architectures and benchmark datasets. Contribution/Results: It clarifies critical technical bottlenecks and offers theoretical foundations and practical pathways for advancing BEV perception toward end-to-end autonomous driving, embodied intelligence, and large-model-augmented systems.
📝 Abstract
Bird's-Eye-View (BEV) perception has become a foundational paradigm in autonomous driving, enabling unified spatial representations that support robust multi-sensor fusion and multi-agent collaboration. As autonomous vehicles transition from controlled environments to real-world deployment, ensuring the safety and reliability of BEV perception in complex scenarios, such as occlusions, adverse weather, and dynamic traffic, remains a critical challenge. This survey provides the first comprehensive review of BEV perception from a safety-critical perspective, systematically analyzing state-of-the-art frameworks and implementation strategies across three progressive stages: single-modality vehicle-side, multimodal vehicle-side, and multi-agent collaborative perception. Furthermore, we examine public datasets encompassing vehicle-side, roadside, and collaborative settings, evaluating their relevance to safety and robustness. We also identify key open-world challenges (open-set recognition, large-scale unlabeled data, sensor degradation, and inter-agent communication latency) and outline future research directions, such as integration with end-to-end autonomous driving systems, embodied intelligence, and large language models.