BIMII-Net: Brain-Inspired Multi-Iterative Interactive Network for RGB-T Road Scene Semantic Segmentation

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address coarse-grained multimodal feature fusion and insufficient hierarchical disparity modeling in RGB-T road scene semantic segmentation under low illumination and occlusion, this paper proposes a brain-inspired multi-iterative interaction network. Methodologically, it introduces: (1) a Deep Continuous Coupling Neural Network (DCCNN) that enables layer-wise dynamic alignment between RGB and thermal features; (2) a Cross-modal Explicit Attention-enhanced Fusion (CEAEF) module that explicitly models inter-modal complementarity and hierarchical discrepancies; and (3) a complementary multi-level iterative decoding architecture supporting cyclic enhancement of shallow and deep features, jointly supervised across modules. Evaluated on multiple RGB-T benchmark datasets, the method achieves state-of-the-art performance, significantly improving texture detail recovery and global structural consistency while demonstrating strong generalization capability.

📝 Abstract
RGB-T road scene semantic segmentation enhances visual scene understanding in complex environments with inadequate illumination or occlusion by fusing information from RGB and thermal images. Nevertheless, existing RGB-T semantic segmentation models typically rely on simple addition or concatenation strategies, or overlook the differences between information at different levels. To address these issues, we propose a novel RGB-T road scene semantic segmentation network, the Brain-Inspired Multi-Iteration Interaction Network (BIMII-Net). First, to meet the requirements for accurate texture and local-information extraction in road scenarios such as autonomous driving, we propose a deep continuous-coupled neural network (DCCNN) architecture based on a brain-inspired model. Second, to strengthen the interaction and expressive capability among multi-modal features, we design a cross explicit attention-enhanced fusion module (CEAEF-Module) in the feature-fusion stage of BIMII-Net to effectively integrate features at different levels. Finally, we construct a complementary interactive multi-layer decoder comprising a shallow-level feature iteration module (SFI-Module), a deep-level feature iteration module (DFI-Module), and a multi-feature enhancement module (MFE-Module), which collaboratively extract texture details and global skeleton information; multi-module joint supervision further refines the segmentation results. Experimental results demonstrate that BIMII-Net achieves state-of-the-art (SOTA) performance in the brain-inspired computing domain and outperforms most existing RGB-T semantic segmentation methods. It also exhibits strong generalization on multiple RGB-T datasets, proving the effectiveness of brain-inspired computing models for multi-modal image segmentation tasks.
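As a rough illustration of the kind of attention-enhanced cross-modal fusion the abstract describes, the sketch below gates each modality's feature channels by a descriptor derived from the other modality, instead of naive addition or concatenation. This is an assumption-laden stand-in, not the paper's actual CEAEF-Module: the pooling-plus-softmax gating scheme and all function names here are illustrative choices.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention_fusion(rgb, thermal):
    """Fuse RGB and thermal feature maps of shape (C, H, W).

    Each modality's channels are re-weighted by an attention vector
    computed from the *other* modality, so complementary channels are
    emphasized before summation. Illustrative only — not CEAEF.
    """
    c, _, _ = rgb.shape
    # Global average pooling yields a per-channel descriptor per modality.
    rgb_desc = rgb.mean(axis=(1, 2))       # (C,)
    th_desc = thermal.mean(axis=(1, 2))    # (C,)
    # Cross-wise gates: RGB channels gated by the thermal descriptor and
    # vice versa; scaled by C so the gates average to ~1.
    rgb_gate = softmax(th_desc)[:, None, None] * c
    th_gate = softmax(rgb_desc)[:, None, None] * c
    # Weighted sum replaces plain addition/concatenation.
    return rgb * rgb_gate + thermal * th_gate

rgb = np.random.rand(8, 16, 16)
thermal = np.random.rand(8, 16, 16)
fused = cross_modal_attention_fusion(rgb, thermal)
print(fused.shape)  # (8, 16, 16)
```

In a real network the gates would be learned (e.g. via small MLPs on the pooled descriptors) and applied per fusion level, matching the paper's emphasis on level-specific integration.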
Problem

Research questions and friction points this paper is trying to address.

Improves RGB-T road scene segmentation in complex environments
Addresses inadequate fusion of multi-level RGB-T information
Enhances interaction among multi-modal features for better accuracy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep continuous-coupled neural network for RGB-T segmentation
Cross explicit attention-enhanced fusion module for multi-modal features
Complementary interactive multi-layer decoder with joint supervision
Hanshuo Qiu
School of Information Science and Engineering, Lanzhou University, No.222, TianShui Road(south), Lanzhou, 730000, Gansu, China
Jie Jiang
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Ruoli Yang
College of Systems Engineering, National University of Defense Technology, Changsha 410073, China
Lixin Zhan
PhD Student at National University of Defense Technology
Point Cloud · 3D Computer Vision · Medical Image Segmentation
Jizhao Liu
Associate Professor, Lanzhou University
Chaos · Nonlinear Dynamics · Brain-inspired Computing · Visual Cognition · Computational Neuroscience