TransRAD: Retentive Vision Transformer for Enhanced Radar Object Detection

📅 2025-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Under adverse weather and low-light conditions, camera and LiDAR perception degrades significantly, while radar, despite its robustness, suffers from low resolution, high noise, and poor structural detail, limiting detection robustness. To address this, the paper proposes TransRAD, a 3D object detection model built on the Retentive Vision Transformer (RMT) that operates directly on radar Range-Azimuth-Doppler (RAD) data. Its key contributions are: (1) Retentive Manhattan Self-Attention (MaSA), which injects explicit spatial priors aligned with the spatial saliency of radar targets in RAD data; (2) a RAD-domain-specific feature encoder tailored to radar's sparsity and anisotropic geometry; and (3) Location-Aware Non-Maximum Suppression (NMS) to mitigate dense, duplicate detections. Evaluated on both 2D and 3D radar detection benchmarks, TransRAD achieves state-of-the-art performance, delivering higher accuracy, faster inference, and lower computational complexity than existing methods.
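The spatial prior behind MaSA comes from the RMT architecture, where the softmax attention map is modulated elementwise by a decay matrix based on the Manhattan distance between token positions on the 2D grid, so attention weakens with spatial distance. The sketch below illustrates that mechanism in NumPy; it is a simplified single-head illustration of the general RMT idea, not TransRAD's actual implementation, and the grid size and `gamma` value are arbitrary choices for the example.

```python
import numpy as np

def manhattan_decay_mask(h, w, gamma=0.9):
    """Spatial decay matrix D for Manhattan Self-Attention:
    D[n, m] = gamma ** (|x_n - x_m| + |y_n - y_m|) for tokens on an h x w grid."""
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1)               # (h*w, 2)
    dist = np.abs(coords[:, None, :] - coords[None, :, :]).sum(-1)    # Manhattan distances
    return gamma ** dist

def masa_attention(q, k, v, gamma=0.9, grid=(4, 4)):
    """Single-head softmax attention modulated by the Manhattan decay mask
    (illustrative sketch of the RMT-style mechanism)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = np.exp(scores - scores.max(-1, keepdims=True))
    attn /= attn.sum(-1, keepdims=True)                               # softmax
    attn = attn * manhattan_decay_mask(*grid, gamma=gamma)            # spatial prior
    return attn @ v
```

Because the decay is an explicit function of grid position, nearby radar cells (which tend to belong to the same target in RAD maps) contribute more to each token's output than distant ones.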

📝 Abstract
Despite significant advancements in environment perception capabilities for autonomous driving and intelligent robotics, cameras and LiDARs remain notoriously unreliable in low-light conditions and adverse weather, which limits their effectiveness. Radar serves as a reliable and low-cost sensor that can effectively compensate for these limitations. However, radar-based object detection has been underexplored due to the inherent weaknesses of radar data, such as low resolution, high noise, and lack of visual information. In this paper, we present TransRAD, a novel 3D radar object detection model designed to address these challenges by leveraging the Retentive Vision Transformer (RMT) to more effectively learn features from information-dense radar Range-Azimuth-Doppler (RAD) data. Our approach leverages the Retentive Manhattan Self-Attention (MaSA) mechanism provided by RMT to incorporate explicit spatial priors, thereby enabling more accurate alignment with the spatial saliency characteristics of radar targets in RAD data and achieving precise 3D radar detection across Range-Azimuth-Doppler dimensions. Furthermore, we propose Location-Aware NMS to effectively mitigate the common issue of duplicate bounding boxes in deep radar object detection. The experimental results demonstrate that TransRAD outperforms state-of-the-art methods in both 2D and 3D radar detection tasks, achieving higher accuracy, faster inference speed, and reduced computational complexity. Code is available at https://github.com/radar-lab/TransRAD.
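The abstract does not give the exact formulation of Location-Aware NMS, but the stated goal is to suppress the dense duplicate boxes that deep radar detectors produce around a single target. A plausible sketch is greedy NMS extended with a center-distance criterion: a candidate is suppressed if it either overlaps a kept box above an IoU threshold or its center lies within a distance threshold of a kept box's center. The function names and both thresholds below are hypothetical illustration choices, not the paper's parameters.

```python
import numpy as np

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def location_aware_nms(boxes, scores, iou_thr=0.5, dist_thr=2.0):
    """Greedy NMS that also suppresses detections whose centers fall
    within dist_thr of an already-kept box (hypothetical variant)."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = scores.argsort()[::-1]        # process highest-scoring boxes first
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        ci = (boxes[i, :2] + boxes[i, 2:]) / 2
        centers = (boxes[rest, :2] + boxes[rest, 2:]) / 2
        dists = np.linalg.norm(centers - ci, axis=1)
        ious = np.array([iou(boxes[i], boxes[j]) for j in rest])
        order = rest[(ious < iou_thr) & (dists > dist_thr)]
    return keep
```

The distance term matters for radar because duplicate detections often cluster tightly in range-azimuth around one target even when their boxes barely overlap, so an IoU test alone can miss them.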
Problem

Research questions and friction points this paper is trying to address.

Autonomous Systems
Environmental Perception
Radar Resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Radar Modality Transformation
Target Detection Accuracy
Efficient Resource Consumption
Lei Cheng
Department of Electrical and Computer Engineering, The University of Arizona, Tucson, AZ 85721 USA
Siyang Cao
The University of Arizona
Waveform Design · MIMO Radar · Sensor Fusion · Machine Learning on Sensors · Signal Processing