MPCM-Net: Multi-scale network integrates partial attention convolution with Mamba for ground-based cloud image segmentation

📅 2025-11-12
🤖 AI Summary
To address insufficient multi-scale feature extraction, inefficient attention mechanisms, and inadequate global dependency modeling in ground-based cloud image segmentation, this paper proposes a lightweight and efficient network that integrates partial attention convolution (ParCM/ParAM) with the Mamba state space model. It designs a Partial Selection Module (ParSM) to enhance channel-wise interaction and cross-scale spatial response, constructs a linear-complexity M2B decoder architecture, and introduces an SSH dual-path aggregation module to strengthen hierarchical feature correlation. The paper also releases CSRC, a high-quality, fine-grained cloud image segmentation dataset. Experiments on CSRC are reported to show that the method significantly outperforms state-of-the-art approaches, with a 3.2% accuracy gain and 2.1× faster inference. According to the summary, this is the first work to achieve high accuracy and low latency simultaneously in cloud image segmentation, providing reliable support for real-time applications such as photovoltaic power forecasting.

📝 Abstract
Ground-based cloud image segmentation is a critical research domain for photovoltaic power forecasting. Current deep learning approaches primarily focus on encoder-decoder architectural refinements. However, existing methodologies exhibit several limitations: (1) they rely on dilated convolutions for multi-scale context extraction, lacking effective use of partial features and inter-channel interoperability; (2) attention-based feature enhancement implementations neglect the balance between accuracy and throughput; and (3) decoder modifications fail to establish global interdependencies among hierarchical local features, limiting inference efficiency. To address these challenges, we propose MPCM-Net, a Multi-scale network that integrates Partial attention Convolutions with Mamba architectures to enhance segmentation accuracy and computational efficiency. Specifically, the encoder incorporates MPAC, which comprises: (1) an MPC block with ParCM and ParSM that enables global spatial interaction across multi-scale cloud formations, and (2) an MPA block combining ParAM and ParSM to extract discriminative features with reduced computational complexity. On the decoder side, an M2B is employed to mitigate contextual loss through an SSHD that maintains linear complexity while enabling deep feature aggregation across spatial and scale dimensions. As a key contribution to the community, we also introduce and release the CSRC dataset, a clear-label, fine-grained segmentation benchmark designed to overcome the critical limitations of existing public datasets. Extensive experiments on CSRC demonstrate the superior performance of MPCM-Net over state-of-the-art methods, achieving an optimal balance between segmentation accuracy and inference speed. The dataset and source code will be available at https://github.com/she1110/CSRC.
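The abstract does not spell out the internals of ParCM or ParAM, so the following is only a rough sketch of the general partial-channel idea that the word "partial" suggests: apply a spatial filter to a fraction of the channels and pass the rest through unchanged, which reduces compute. The names `box_filter_3x3`, `partial_conv`, and `ratio` are hypothetical, and the fixed 3x3 mean filter merely stands in for a learned kernel; none of this is taken from the paper's code.

```python
from typing import List

Channel = List[List[float]]    # one channel as an H x W grid
FeatureMap = List[Channel]     # C channels


def box_filter_3x3(channel: Channel) -> Channel:
    """3x3 mean filter with zero padding (placeholder for a learned kernel)."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        total += channel[ii][jj]
            out[i][j] = total / 9.0
    return out


def partial_conv(features: FeatureMap, ratio: float = 0.25) -> FeatureMap:
    """Filter only the first `ratio` fraction of channels (assumed design);
    the remaining channels are copied through untouched."""
    n_conv = max(1, int(len(features) * ratio))
    processed = [box_filter_3x3(ch) for ch in features[:n_conv]]
    untouched = [[row[:] for row in ch] for ch in features[n_conv:]]
    return processed + untouched


# Usage: 4 channels of a 2x2 map; only channel 0 is filtered.
features = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(4)]
result = partial_conv(features, ratio=0.25)
```

With `ratio=0.25`, the spatial filtering cost is roughly a quarter of what filtering every channel would cost, which is the kind of accuracy-throughput trade-off the abstract refers to.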
Problem

Research questions and friction points this paper is trying to address.

Improves multi-scale context extraction in cloud image segmentation
Balances accuracy and computational efficiency in feature enhancement
Establishes global interdependencies among hierarchical local features
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates partial attention convolution with Mamba architecture
Uses MPC and MPA blocks for multi-scale feature extraction
Employs M2B decoder with SSHD for linear complexity aggregation
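To illustrate why a state space decoder can aggregate features with linear complexity, here is a minimal sketch of a scalar linear state space recurrence. It is not the paper's M2B or SSHD: the function name `ssm_scan` and the fixed scalar parameters `a`, `b`, `c` are assumptions made for illustration, whereas Mamba makes its parameters depend on the input.

```python
from typing import List


def ssm_scan(x: List[float], a: float, b: float, c: float) -> List[float]:
    """Scalar linear state space recurrence (illustrative only):
        h_t = a * h_{t-1} + b * x_t
        y_t = c * h_t
    A single pass over the sequence, so the cost grows linearly with len(x).
    """
    h = 0.0
    ys: List[float] = []
    for x_t in x:
        h = a * h + b * x_t   # the state carries context from all earlier steps
        ys.append(c * h)
    return ys


# Usage: an impulse at the first position decays through the state.
outputs = ssm_scan([1.0, 0.0, 0.0], a=0.5, b=1.0, c=2.0)
```

Each output depends on every earlier input through the state `h`, yet the loop visits each element once. Self-attention over the same sequence would instead compare every pair of positions, which scales quadratically.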
Penghui Niu
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
Jiashuai She
School of Electrical Engineering, Hebei University of Technology, Tianjin 300401, China
Taotao Cai
University of Southern Queensland
Yajuan Zhang
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
Ping Zhang
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China, and also with the Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin 300401, China
Junhua Gu
School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China, and also with the Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin 300401, China
Jianxin Li
Discipline of Business Systems and Operations, School of Business and Law, Edith Cowan University, Joondalup, WA 6027, Australia