🤖 AI Summary
Problem: Publicly available datasets with pixel-level annotations for reinforced concrete defect detection are scarce, which contributes to high false-negative rates and poor model generalization. Method: We construct and open-source a large-scale RGB concrete defect segmentation dataset (14,805 images) designed for autonomous inspection robots. Using this dataset, we benchmark three segmentation architectures (YOLOv8L-seg, DeepLabV3, and U-Net), correct annotation inconsistencies statistically, and analyze the models' error modes with an error identification tool. Results: Annotation inconsistencies have a negligible effect on model performance, whereas adding more data improves it. False negatives are the primary failure mode, and data scarcity rather than labeling error is identified as a significant contributor to them. YOLOv8L-seg performs best, reaching a validation mIoU of up to 0.59. The results highlight the importance of openly available data for deep learning-based inspection, and the work advocates a more open-source approach within the construction community.
📝 Abstract
This paper provides a dataset of 14,805 RGB images with segmentation labels for autonomous robotic inspection of reinforced concrete defects. Baselines are established for the YOLOv8L-seg, DeepLabV3, and U-Net segmentation models. Labelling inconsistencies are addressed statistically, and their influence on model performance is analyzed. An error identification tool is employed to examine the error modes of the models. YOLOv8L-seg performs best, achieving a validation mIoU of up to 0.59. Label inconsistencies were found to have a negligible effect on model performance, while including more data improved performance. False negatives were identified as the primary failure mode. The results highlight the importance of data availability for the performance of deep learning-based models, and the lack of publicly available data is identified as a significant contributor to false negatives. To address this, the paper advocates a more open-source approach within the construction community.
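The abstract reports a validation mIoU and names false negatives as the primary failure mode. As a rough illustration of what these two quantities measure, the sketch below computes a mean intersection-over-union and a pixel-level false-negative rate on a toy label map. It is not the paper's evaluation code: the function names, the convention that class 0 is background, and the choice to average only over classes present in the prediction or the ground truth are all assumptions made for this example.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in pred or target.

    pred, target: equal-length flat sequences of integer class ids,
    one per pixel. Averaging convention is an assumption of this sketch.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        tp = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        fp = sum(1 for p, t in zip(pred, target) if p == c and t != c)
        fn = sum(1 for p, t in zip(pred, target) if p != c and t == c)
        union = tp + fp + fn
        if union == 0:
            continue  # class absent from both maps, so it is skipped
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def false_negative_rate(pred, target, background=0):
    """Fraction of defect pixels (target != background) predicted as background."""
    defect = [(p, t) for p, t in zip(pred, target) if t != background]
    if not defect:
        return 0.0
    missed = sum(1 for p, _ in defect if p == background)
    return missed / len(defect)


# Toy example: 0 = background, 1 and 2 = two hypothetical defect classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 0, 1, 0, 2, 0]

print(mean_iou(pred, target, num_classes=3))
print(false_negative_rate(pred, target))
```

In this toy case half of the defect pixels are predicted as background, which lowers the IoU of every class, including background. This is the kind of failure the abstract attributes to limited training data.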