Staircase Cascaded Fusion of Lightweight Local Pattern Recognition and Long-Range Dependencies for Structural Crack Segmentation

📅 2024-08-23
🏛️ arXiv.org
🤖 AI Summary
Existing crack detection methods struggle to simultaneously achieve pixel-level accuracy, effective local texture modeling, and long-range pixel dependency capture, while suffering from excessive model parameters and high computational overhead that hinder edge deployment. To address these challenges, this paper proposes a lightweight pixel-wise segmentation framework. The authors design a staircase cascaded fusion module that jointly optimizes efficient local pattern recognition and long-range dependency modeling; introduce a network-wide lightweight convolutional block that significantly reduces computational cost; and integrate multi-scale contextual modeling with noise-robust enhancement mechanisms. Evaluated on the newly established TUT benchmark and five public datasets, the method achieves state-of-the-art performance: F1 = 0.8382 and mIoU = 0.8473 on TUT, with the lowest parameter count and FLOPs among all compared approaches.

📝 Abstract
Detecting cracks with pixel-level precision for key structures is a significant challenge, as existing methods struggle to effectively integrate local textures and pixel dependencies of cracks. Furthermore, these methods often possess numerous parameters and substantial computational requirements, complicating deployment on edge control devices. In this paper, we propose a staircase cascaded fusion crack segmentation network (CrackSCF) that generates high-quality crack segmentation maps using minimal computational resources. We constructed a staircase cascaded fusion module that effectively captures local patterns of cracks and long-range dependencies of pixels while suppressing background noise. To reduce the computational resources required by the model, we introduced a lightweight convolution block that replaces all convolution operations in the network, significantly reducing the required computation and parameters without affecting the network's performance. To evaluate our method, we created a challenging benchmark dataset called TUT and conducted experiments on this dataset and five other public datasets. The experimental results indicate that our method offers significant advantages over existing methods, especially in handling background noise interference and detailed crack segmentation. The F1 and mIoU scores on the TUT dataset are 0.8382 and 0.8473, respectively, achieving state-of-the-art (SOTA) performance while requiring the least computational resources. The code and dataset are available at https://github.com/Karl1109/CrackSCF.
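The abstract's claim that a lightweight convolution block cuts parameters without changing the network's interface can be illustrated with a common lightweight design, the depthwise-separable convolution. This is only a sketch of the general technique; the paper's actual block design is not specified here and may differ. A quick parameter count shows where the saving comes from:

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k kernel per (input channel, output channel) pair
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    # Depthwise stage: one k x k kernel per input channel (spatial filtering),
    # pointwise stage: a 1 x 1 convolution that mixes channels
    return k * k * c_in + c_in * c_out

# Example layer: 64 -> 64 channels with a 3x3 kernel (illustrative values)
std = conv_params(64, 64, 3)                    # 36864 parameters
light = depthwise_separable_params(64, 64, 3)   # 4672 parameters
print(std, light, round(std / light, 1))        # roughly an 8x reduction
```

The saving grows with channel count, which is why replacing every convolution network-wide, as the paper reports doing, compounds into a much smaller model overall.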
Problem

Research questions and friction points this paper is trying to address.

Pixel-level crack detection for key structures
Integration of local textures and long-range dependencies
Reducing computational overhead for edge devices
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight convolutional block reduces computation
Lightweight long-range dependency extractor enhances features
Staircase cascaded fusion integrates local and global details
Authors
Hui Liu, Chen Jia, Fan Shi, Xu Cheng, Mianzhao Wang, Shengyong Chen
All authors: Engineering Research Center of Learning-Based Intelligent System (Ministry of Education), Key Laboratory of Computer Vision and System (Ministry of Education), and School of Computer Science and Engineering, Tianjin University of Technology, Tianjin, 300384, China