RoadMamba: A Dual Branch Visual State Space Model for Road Surface Classification

📅 2025-08-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing Mamba-based vision models extract the local texture of the road surface poorly, which keeps them below state-of-the-art (SOTA) methods on road surface classification. Method: The paper proposes RoadMamba, which uses a Dual State Space Model (DualSSM) with one branch for global semantics and one for fine-grained local texture. A Dual Attention Fusion (DAF) module decodes and fuses the features from the two branches, and a dual auxiliary loss explicitly constrains both branches so that the network does not rely only on global semantic information and ignore local texture. Contribution/Results: The authors describe this as the first exploration of visual Mamba architectures for road surface classification. RoadMamba achieves SOTA performance on a large-scale road surface classification dataset containing 1 million samples.

📝 Abstract

Acquiring road surface conditions in advance with vision-based techniques gives the planning and control system of an autonomous vehicle useful information, improving both safety and driving comfort. Recently, the Mamba architecture, which is based on state-space models, has shown remarkable performance in visual processing tasks thanks to its efficient global receptive field. However, existing Mamba architectures struggle to reach state-of-the-art results in visual road surface classification because they do not effectively extract the local texture of the road surface. In this paper, we explore for the first time the potential of visual Mamba architectures for the road surface classification task and propose RoadMamba, a method that effectively combines local and global perception. Specifically, we use the Dual State Space Model (DualSSM) to extract the global semantics and local texture of the road surface, and we decode and fuse the two sets of features through the Dual Attention Fusion (DAF). In addition, we propose a dual auxiliary loss that explicitly constrains both branches, preventing the network from relying only on global semantic information from the deep, large receptive field while ignoring local texture. The proposed RoadMamba achieves state-of-the-art performance in experiments on a large-scale road surface classification dataset containing 1 million samples.
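The abstract does not give the form of the dual auxiliary loss, so the following is only a minimal sketch of one common way to supervise two branches alongside a fused prediction. It assumes each branch has its own classification head and that the three cross-entropy terms are combined with fixed weights. The function names and the weights `w_global` and `w_local` are hypothetical and do not come from the paper.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample, computed from raw logits.

    Uses the log-sum-exp trick so large logits do not overflow.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def dual_auxiliary_loss(fused_logits, global_logits, local_logits, target,
                        w_global=0.5, w_local=0.5):
    """Main loss on the fused prediction plus one auxiliary loss per branch.

    Assumption: the weighting scheme is illustrative, not the paper's.
    Supervising the local branch directly penalizes a network that
    classifies well from global semantics alone.
    """
    main = cross_entropy(fused_logits, target)
    aux_global = cross_entropy(global_logits, target)
    aux_local = cross_entropy(local_logits, target)
    return main + w_global * aux_global + w_local * aux_local
```

With this formulation, a sample whose fused and global heads are correct but whose local head is wrong still incurs a sizeable loss, which is the kind of explicit constraint on both branches that the abstract describes.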

Problem

Research questions and friction points this paper is trying to address.

Classify road surfaces using visual Mamba architectures

Combine local and global perception for better accuracy

Improve autonomous vehicle safety with road condition data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual State Space Model for global and local features

Dual Attention Fusion for feature decoding

Dual auxiliary loss to constrain branches
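The page does not describe how Dual Attention Fusion works internally. As a rough illustration of the general idea of attention-weighted fusion of two branches, the sketch below scores each branch feature against a query vector, normalizes the two scores with a softmax, and returns the weighted sum. This is a generic technique and not the paper's DAF module. `fuse_branches` and `query` are hypothetical names, and in a real model the query would be a learned parameter.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_branches(global_feat, local_feat, query):
    """Fuse two branch features with softmax attention weights.

    Assumption: each branch is scored by its dot product with `query`,
    which stands in for a learned parameter.
    """
    scores = [dot(global_feat, query), dot(local_feat, query)]
    w_global, w_local = softmax(scores)
    fused = [w_global * g + w_local * l
             for g, l in zip(global_feat, local_feat)]
    return fused, (w_global, w_local)
```

With a zero query the two branches receive equal weight and the result is their element-wise average. A query aligned with the local feature shifts weight toward the texture branch.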
