🤖 AI Summary
Sparse and irregular point clouds from automotive 3D radar, a consequence of radar's long wavelengths, hinder conventional grid- and sequence-based 3D object detection methods. To address this, we propose the Graph Query Network (GQN), which models each radar-sensed object as an adaptive graph and dynamically attends to salient regions of the bird's-eye-view (BEV) space via learnable graph queries. Our key contributions are: (1) the EdgeFocus module for efficient edge-relation reasoning; and (2) the DeepContext Pooling module for multi-scale contextual aggregation. On the nuScenes benchmark, GQN improves relative mean Average Precision (mAP) by up to 53%, including an 8.2% gain over the strongest prior radar-based detector. It also reduces peak memory consumption for graph construction by 80% while keeping inference cost computationally feasible.
📝 Abstract
Object detection with 3D radar is essential for 360-degree automotive perception, but radar's long wavelengths produce sparse and irregular reflections that challenge traditional grid- and sequence-based convolutional and transformer detectors. This paper introduces Graph Query Networks (GQN), an attention-based framework that models objects sensed by radar as graphs to extract individualized relational and contextual features. GQN employs a novel concept of graph queries to dynamically attend over the bird's-eye-view (BEV) space, constructing object-specific graphs processed by two novel modules: EdgeFocus for relational reasoning and DeepContext Pooling for contextual aggregation. On the nuScenes dataset, GQN improves relative mAP by up to +53%, including a +8.2% gain over the strongest prior radar method, while reducing peak graph-construction overhead by 80% at a moderate FLOPs cost.
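The abstract does not spell out how a graph query builds an object-specific graph, but the core idea (a learnable query scores BEV cells and only the most salient ones become graph nodes, so the graph is never built over the full grid) can be illustrated with a minimal toy sketch. All names, shapes, and the top-k selection rule below are our own illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def graph_query(bev_feats, query, k=8):
    """Toy sketch (hypothetical): one learnable query attends over flattened
    BEV features and keeps the top-k most salient cells as graph nodes.

    bev_feats: (H*W, C) array of BEV cell features
    query:     (C,) learnable query vector
    """
    scores = bev_feats @ query                # saliency logit per BEV cell
    top = np.argsort(scores)[-k:]             # indices of the k most salient cells
    nodes = bev_feats[top]                    # (k, C) object-specific graph nodes
    # Fully connect only the selected nodes; a small k keeps edge count at
    # k*(k-1) instead of growing with the full BEV grid, which is the kind of
    # saving that lowers graph-construction overhead.
    edges = [(i, j) for i in range(k) for j in range(k) if i != j]
    return nodes, edges

rng = np.random.default_rng(0)
bev = rng.standard_normal((64 * 64, 32))      # toy 64x64 BEV grid, 32 channels
q = rng.standard_normal(32)                   # one toy graph query
nodes, edges = graph_query(bev, q, k=8)
```

In the paper, relational reasoning over the resulting edges would be handled by EdgeFocus and contextual aggregation by DeepContext Pooling; this sketch only shows the query-driven node selection step.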