Change State Space Models for Remote Sensing Change Detection

📅 2025-04-15

📈 Citations: 0

✨ Influential: 0

career value

208K/year

🤖 AI Summary

Convolutional neural networks (ConvNets) struggle to model long-range dependencies in bitemporal remote sensing change detection, while vision transformers (ViTs) incur prohibitive computational overhead; existing Mamba-based approaches serve only as generic backbones without task-specific design for change modeling. Method: We propose the Change State Space Model (CSSM), the first state space model deeply customized for change detection. CSSM employs a dual-branch differential encoding architecture that jointly performs temporal feature alignment and change gating, explicitly suppressing irrelevant information and focusing on change regions via a difference-aware structure—eliminating reliance on CNN or ViT backbones. Contribution/Results: On three standard benchmarks, CSSM achieves superior accuracy and robustness to input degradation, while reducing computational complexity by an order of magnitude compared to CNN, ViT, and Mamba baselines—demonstrating both parameter efficiency and high inference speed.

Technology Category

Application Category

📝 Abstract

Despite their frequent use for change detection, both ConvNets and Vision transformers (ViT) exhibit well-known limitations, namely the former struggle to model long-range dependencies while the latter are computationally inefficient, rendering them challenging to train on large-scale datasets. Vision Mamba, an architecture based on State Space Models has emerged as an alternative addressing the aforementioned deficiencies and has been already applied to remote sensing change detection, though mostly as a feature extracting backbone. In this article the Change State Space Model is introduced, that has been specifically designed for change detection by focusing on the relevant changes between bi-temporal images, effectively filtering out irrelevant information. By concentrating solely on the changed features, the number of network parameters is reduced, enhancing significantly computational efficiency while maintaining high detection performance and robustness against input degradation. The proposed model has been evaluated via three benchmark datasets, where it outperformed ConvNets, ViTs, and Mamba-based counterparts at a fraction of their computational complexity. The implementation will be made available at https://github.com/Elman295/CSSM upon acceptance.

Problem

Research questions and friction points this paper is trying to address.

Addresses limitations of ConvNets and ViTs in remote sensing change detection

Proposes Change State Space Model for efficient bi-temporal image analysis

Reduces parameters while maintaining high detection performance and robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

State Space Model for change detection

Focuses on relevant bi-temporal changes

Reduces parameters, enhances efficiency

🔎 Similar Papers

Hierarchical Attention Diffusion Networks with Object Priors for Video Change Detection