BIMII-Net: Brain-Inspired Multi-Iterative Interactive Network for RGB-T Road Scene Semantic Segmentation

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address coarse-grained multimodal feature fusion and insufficient hierarchical disparity modeling in RGB-T road scene semantic segmentation under low illumination and occlusion, this paper proposes a brain-inspired multi-iterative interaction network. Methodologically, it introduces: (1) a Deep Continuous Coupling Neural Network (DCCNN) that enables layer-wise dynamic alignment between RGB and thermal features; (2) a Cross-modal Explicit Attention-enhanced Fusion (CEAEF) module that explicitly models inter-modal complementarity and hierarchical discrepancies; and (3) a complementary multi-level iterative decoding architecture supporting cyclic enhancement of shallow and deep features, jointly supervised across modules. Evaluated on multiple RGB-T benchmark datasets, the method achieves state-of-the-art performance, significantly improving texture detail recovery and global structural consistency while demonstrating strong generalization capability.

📝 Abstract
RGB-T road scene semantic segmentation enhances visual scene understanding in complex environments with inadequate illumination or occlusion by fusing information from RGB and thermal images. Nevertheless, existing RGB-T semantic segmentation models typically rely on simple addition or concatenation strategies, or overlook the differences between information at different levels. To address these issues, we propose a novel RGB-T road scene semantic segmentation network, the Brain-Inspired Multi-Iteration Interaction Network (BIMII-Net). First, to meet the requirements for accurate texture and local-information extraction in road scenarios such as autonomous driving, we propose a deep continuous-coupled neural network (DCCNN) architecture based on a brain-inspired model. Second, to strengthen the interaction and expressive capability among multi-modal features, we design a cross explicit attention-enhanced fusion module (CEAEF-Module) in the feature-fusion stage of BIMII-Net to effectively integrate features at different levels. Finally, we construct a complementary interactive multi-layer decoder comprising a shallow-level feature iteration module (SFI-Module), a deep-level feature iteration module (DFI-Module), and a multi-feature enhancement module (MFE-Module), which collaboratively extract texture details and global skeleton information; multi-module joint supervision further refines the segmentation results. Experimental results demonstrate that BIMII-Net achieves state-of-the-art (SOTA) performance in the brain-inspired computing domain and outperforms most existing RGB-T semantic segmentation methods. It also exhibits strong generalization on multiple RGB-T datasets, proving the effectiveness of brain-inspired computing models for multi-modal image segmentation tasks.
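As a rough illustration of the kind of attention-enhanced cross-modal fusion the abstract describes, the sketch below gates each modality's feature channels by a descriptor derived from the other modality, instead of naive addition or concatenation. This is an assumption-laden stand-in, not the paper's actual CEAEF-Module: the pooling-plus-softmax gating scheme and all function names here are illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention_fusion(rgb, thermal):
    """Fuse RGB and thermal feature maps of shape (C, H, W).

    Each modality's channels are re-weighted by an attention vector
    computed from the *other* modality, so complementary channels are
    emphasized before summation. Illustrative only — not CEAEF.
    """
    c, _, _ = rgb.shape
    # Global average pooling yields a per-channel descriptor per modality.
    rgb_desc = rgb.mean(axis=(1, 2))       # (C,)
    th_desc = thermal.mean(axis=(1, 2))    # (C,)
    # Cross-wise gates: RGB channels gated by the thermal descriptor and
    # vice versa; scaled by C so the gates average to ~1.
    rgb_gate = softmax(th_desc)[:, None, None] * c
    th_gate = softmax(rgb_desc)[:, None, None] * c
    # Weighted sum replaces plain addition/concatenation.
    return rgb * rgb_gate + thermal * th_gate

rgb = np.random.rand(8, 16, 16)
thermal = np.random.rand(8, 16, 16)
fused = cross_modal_attention_fusion(rgb, thermal)
print(fused.shape)  # (8, 16, 16)
```

In a real network the gates would be learned (e.g. via small MLPs on the pooled descriptors) and applied per fusion level, matching the paper's emphasis on level-specific integration.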
Problem

Research questions and friction points this paper is trying to address.

Improves RGB-T road scene segmentation in complex environments
Addresses inadequate fusion of multi-level RGB-T information
Enhances interaction among multi-modal features for better accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep continuous-coupled neural network for RGB-T segmentation
Cross explicit attention-enhanced fusion module for multi-modal features
Complementary interactive multi-layer decoder with joint supervision
Hanshuo Qiu
School of Information Science and Engineering, Lanzhou University, No.222, TianShui Road(south), Lanzhou, 730000, Gansu, China
Jie Jiang
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Ruoli Yang
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Lixin Zhan
PhD Student at National University of Defense Technology
Point Cloud · 3D Computer Vision · Medical Image Segmentation
Jizhao Liu
Associate Professor, Lanzhou University
Chaos · Nonlinear Dynamics · Brain-inspired Computing · Visual Cognition · Computational Neuroscience