🤖 AI Summary
This study addresses the challenges of drone-based bridge inspection, where crack features are often subtle, imaging conditions suboptimal, class distributions highly imbalanced, and computational resources limited. To tackle these issues, the authors propose a lightweight convolutional neural network framework that integrates a compact backbone, the CBAM attention mechanism, directed data augmentation driven by inspection-scene priors, and Focal Loss for learning under class imbalance. Grad-CAM visualization is used to check that the model attends to genuine cracks. Evaluated on the SDNET2018 dataset, the proposed model requires only 11.21 million parameters and 1.82G FLOPs and reaches an inference speed of 825 FPS, while improving F1-score and recall over the baseline by 2.51% and 3.95%, respectively. According to the Grad-CAM results, the attention module also concentrates the model's focus along the cracks themselves.
📝 Abstract
With the widespread application of Unmanned Aerial Vehicles (UAVs) in bridge structural health monitoring, deep learning-based automatic crack detection has become a major research focus. However, practical UAV inspections still face four key challenges: weak crack features, degraded imaging conditions, severe class imbalance, and limited computational resources. To address these issues, this paper proposes a unified lightweight convolutional neural network framework composed of four complementary components: a lightweight backbone network, a Convolutional Block Attention Module (CBAM) for channel and spatial feature enhancement, a directed robust augmentation strategy based on inspection-scene priors, and Focal Loss for hard-sample learning under class imbalance. Experiments on the SDNET2018 bridge deck dataset show that the proposed method achieves an inference speed of 825 FPS with only 11.21M parameters and 1.82G FLOPs. Compared with the baseline model, the complete framework improves the F1-score by 2.51% and recall by 3.95%. In addition, Grad-CAM visualizations indicate that the attention module shifts the model's focus from scattered regions to the crack trajectories themselves. Overall, the framework balances accuracy, speed, and robustness, providing a practical solution for ground-station-assisted real-time deployment in UAV bridge inspections. The source code is available at: https://github.com/skylynf/AttXNet
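To illustrate how Focal Loss emphasizes hard samples under class imbalance, the following is a minimal sketch of the standard binary formulation, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). It is not taken from the authors' repository: the function name `binary_focal_loss` is hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are assumed values commonly used with focal loss, not settings reported in the abstract.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for a single binary prediction (illustrative sketch).

    p     -- predicted probability of the positive (crack) class
    y     -- ground-truth label: 1 for crack, 0 for no crack
    gamma -- focusing parameter (assumed default, not from the paper)
    alpha -- weight of the positive class (assumed default, not from the paper)
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)              # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p               # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = binary_focal_loss(0.9, 1)  # crack predicted confidently
    hard = binary_focal_loss(0.1, 1)  # crack almost missed
    print(f"easy sample loss: {easy:.6f}")
    print(f"hard sample loss: {hard:.6f}")
```

With `gamma=0` the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy; larger values of `gamma` shift more of the total loss toward misclassified samples, which is the behavior the abstract relies on for the rare crack class.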