🤖 AI Summary
To address the challenge of balancing low latency and high accuracy in edge-deployed 3D object detection for autonomous driving, this paper proposes a scene-aware Mixture-of-Experts (MoE) collaborative detection framework tailored for edge computing. Methodologically, it introduces an adaptive multimodal bridging mechanism to fuse sparse LiDAR point clouds with dense camera image features; incorporates a dynamic expert routing strategy conditioned on object visibility and distance; and jointly optimizes computational graph simplification and hardware resource utilization for software–hardware co-design. To our knowledge, this is the first work to adapt the MoE paradigm to real-time edge-based 3D detection. Evaluated on KITTI, the framework achieves a 3.58% mAP improvement and a 159.06% inference speedup on Jetson platforms. On nuScenes, it demonstrates strong generalization, significantly enhancing both real-time performance and detection accuracy on resource-constrained edge devices.
📝 Abstract
This paper presents Edge-based Mixture of Experts (MoE) Collaborative Computing (EMC2), an optimal computing system designed for autonomous vehicles (AVs) that simultaneously achieves low-latency and high-accuracy 3D object detection. Unlike conventional approaches, EMC2 incorporates a scenario-aware MoE architecture specifically optimized for edge platforms. By effectively fusing LiDAR and camera data, the system leverages the complementary strengths of sparse 3D point clouds and dense 2D images to generate robust multimodal representations. To enable this, EMC2 employs an adaptive multimodal data bridge that performs multi-scale preprocessing on sensor inputs, followed by a scenario-aware routing mechanism that dynamically dispatches features to dedicated expert models based on object visibility and distance. In addition, EMC2 integrates joint hardware-software optimizations, including hardware resource utilization optimization and computational graph simplification, to ensure efficient and real-time inference on resource-constrained edge devices. Experiments on open-source benchmarks clearly show the EMC2 advancements as a end-to-end system. On the KITTI dataset, it achieves an average accuracy improvement of 3.58% and a 159.06% inference speedup compared to 15 baseline methods on Jetson platforms, with similar performance gains on the nuScenes dataset, highlighting its capability to advance reliable, real-time 3D object detection tasks for AVs.