TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection

📅 2025-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Under adverse weather and low-light conditions, camera and LiDAR perception degrades significantly, while radar, despite its robustness, suffers from low resolution, high noise, and poor structural detail, limiting detection robustness. To address this, the paper proposes TransRAD, a 3D object detection model built on the Retentive Vision Transformer (RMT) that operates directly on radar Range-Azimuth-Doppler (RAD) data. Its key contributions are: (1) Retentive Manhattan Self-Attention (MaSA), which injects explicit spatial priors aligned with the spatial saliency of radar targets in RAD data; (2) a RAD-domain-specific feature encoder tailored to radar's sparsity and anisotropic geometry; and (3) Location-Aware Non-Maximum Suppression (NMS) to mitigate dense, duplicate detections. Evaluated on both 2D and 3D radar detection benchmarks, TransRAD achieves state-of-the-art performance, delivering higher accuracy, faster inference, and lower computational complexity than existing methods.
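The spatial prior behind MaSA comes from the RMT architecture, where the softmax attention map is modulated elementwise by a decay matrix based on the Manhattan distance between token positions on the 2D grid, so attention weakens with spatial distance. The sketch below illustrates that mechanism in NumPy; it is a simplified single-head illustration of the general RMT idea, not TransRAD's actual implementation, and the grid size and `gamma` value are arbitrary choices for the example.

```python
import numpy as np

def manhattan_decay_mask(h, w, gamma=0.9):
    """Spatial decay matrix D for Manhattan Self-Attention:
    D[n, m] = gamma ** (|x_n - x_m| + |y_n - y_m|) for tokens on an h x w grid."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)               # (h*w, 2)
    dist = np.abs(coords[:, None, :] - coords[None, :, :]).sum(-1)    # Manhattan distances
    return gamma ** dist

def masa_attention(q, k, v, gamma=0.9, grid=(4, 4)):
    """Single-head softmax attention modulated by the Manhattan decay mask
    (illustrative sketch of the RMT-style mechanism)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = np.exp(scores - scores.max(-1, keepdims=True))
    attn /= attn.sum(-1, keepdims=True)                               # softmax
    attn = attn * manhattan_decay_mask(*grid, gamma=gamma)            # spatial prior
    return attn @ v
```

Because the decay is an explicit function of grid position, nearby radar cells (which tend to belong to the same target in RAD maps) contribute more to each token's output than distant ones.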

📝 Abstract
Despite significant advancements in environment perception capabilities for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar serves as a reliable and low-cost sensor that can effectively compensate for these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this paper, we present TransRAD, a novel 3D radar object detection model designed to address these challenges by leveraging the Retentive Vision Transformer (RMT) to more effectively learn features from information-dense radar Range-Azimuth-Doppler (RAD) data. Our approach leverages the Retentive Manhattan Self-Attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, thereby enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3D radar detection across Range-Azimuth-Doppler dimensions. Furthermore, we propose Location-Aware NMS to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. The experimental results demonstrate that TransRAD outperforms state-of-the-art methods in both 2D and 3D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at https://github.com/radar-lab/TransRAD.
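The abstract does not give the exact formulation of Location-Aware NMS, but the stated goal is to suppress the dense duplicate boxes that deep radar detectors produce around a single target. A plausible sketch is greedy NMS extended with a center-distance criterion: a candidate is suppressed if it either overlaps a kept box above an IoU threshold or its center lies within a distance threshold of a kept box's center. The function names and both thresholds below are hypothetical illustration choices, not the paper's parameters.

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def location_aware_nms(boxes, scores, iou_thr=0.5, dist_thr=2.0):
    """Greedy NMS that also suppresses detections whose centers fall
    within dist_thr of an already-kept box (hypothetical variant)."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = scores.argsort()[::-1]        # process highest-scoring boxes first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        ci = (boxes[i, :2] + boxes[i, 2:]) / 2
        centers = (boxes[rest, :2] + boxes[rest, 2:]) / 2
        dists = np.linalg.norm(centers - ci, axis=1)
        ious = np.array([iou(boxes[i], boxes[j]) for j in rest])
        order = rest[(ious < iou_thr) & (dists > dist_thr)]
    return keep
```

The distance term matters for radar because duplicate detections often cluster tightly in range-azimuth around one target even when their boxes barely overlap, so an IoU test alone can miss them.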
Problem

Research questions and friction points this paper is trying to address.

Autonomous Systems
Environmental Perception
Radar Resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Radar Modality Transformation
Target Detection Accuracy
Efficient Resource Consumption
Lei Cheng
Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ 85721 USA
Siyang Cao
The University of Arizona
Waveform Design · MIMO Radar · Sensor Fusion · Machine Learning on Sensors · Signal Processing