π€ AI Summary
Underwater images suffer from blurriness, low contrast, and color distortion, degrading object detection accuracy; moreover, the scarcity of paired training data hinders end-to-end optimization. Method: We propose a physics-constrained, self-supervised, multi-task lightweight framework that models underwater imaging as a differentiable decomposition into clean image, background light, and transmission mapβenabling joint optimization of enhancement and detection without paired data. Feature sharing and a dynamic gating mechanism foster task synergy while avoiding error accumulation inherent in two-stage pipelines. Results: Our method achieves a 5.2% mAP improvement over prior art, boosts PSNR and SSIM by over 12%, and runs at 32 FPS on multiple underwater benchmarks, significantly outperforming state-of-the-art approaches.
π Abstract
Underwater optical images inevitably suffer from various degradation factors such as blurring, low contrast, and color distortion, which hinder the accuracy of object detection tasks. Due to the lack of paired underwater/clean images, most research methods adopt a strategy of first enhancing and then detecting, resulting in a lack of feature communication between the two learning tasks. On the other hand, due to the contradiction between the diverse degradation factors of underwater images and the limited number of samples, existing underwater enhancement methods are difficult to effectively enhance degraded images of unknown water bodies, thereby limiting the improvement of object detection accuracy. Therefore, most underwater target detection results are still displayed on degraded images, making it difficult to visually judge the correctness of the detection results. To address the above issues, this paper proposes a multi-task learning method that simultaneously enhances underwater images and improves detection accuracy. Compared with single-task learning, the integrated model allows for the dynamic adjustment of information communication and sharing between different tasks. Due to the fact that real underwater images can only provide annotated object labels, this paper introduces physical constraints to ensure that object detection tasks do not interfere with image enhancement tasks. Therefore, this article introduces a physical module to decompose underwater images into clean images, background light, and transmission images and uses a physical model to calculate underwater images for self-supervision. Numerical experiments demonstrate that the proposed model achieves satisfactory results in visual performance, object detection accuracy, and detection efficiency compared to state-of-the-art comparative methods.