🤖 AI Summary
This study addresses the challenge of high-precision sound speed profile (SSP) reconstruction in the absence of in-situ underwater SSP measurements. We propose MDF-RAGAN, a deep learning model that fuses heterogeneous remote-sensing data—including sea surface temperature—to reconstruct SSPs end-to-end without reliance on shipborne sonar observations. The model incorporates a residual attention mechanism to capture subtle sound speed perturbations and employs cross-modal attention to enable adaptive integration of multi-source information. Evaluated on a public benchmark dataset, MDF-RAGAN achieves a root-mean-square error (RMSE) of 0.3 m/s—nearly doubling the accuracy of conventional CNN-based and spatial interpolation methods, and reducing error by 65.8% relative to the mean SSP baseline. The approach significantly enhances global spatial modeling capability and generalization across diverse oceanic regions.
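The summary reports accuracy as root-mean-square error (RMSE) over the reconstructed profile. As a point of reference, a minimal sketch of how profile RMSE is computed (the depth grid and values below are hypothetical, not from the paper's dataset):

```python
import numpy as np

def rmse(pred: np.ndarray, true: np.ndarray) -> float:
    """Root-mean-square error between a reconstructed and a reference SSP (m/s)."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

# Hypothetical 5-point depth profile, sound speed in m/s.
true_ssp = np.array([1520.0, 1515.0, 1500.0, 1490.0, 1485.0])
pred_ssp = true_ssp + np.array([0.2, -0.3, 0.1, -0.2, 0.3])  # toy reconstruction error
print(round(rmse(pred_ssp, true_ssp), 3))  # → 0.232, i.e. well under the 0.3 m/s threshold
```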
📝 Abstract
Sound speed profiles (SSPs) are essential underwater parameters that affect the propagation of underwater signals and have a critical impact on the energy efficiency of underwater acoustic communication and the accuracy of underwater acoustic positioning. Traditionally, SSPs can be obtained by matched field processing (MFP), compressive sensing (CS), and deep learning (DL) methods. However, existing methods mainly rely on in-situ underwater sonar observation data, which imposes strict requirements on the deployment of sonar observation systems. To achieve high-precision estimation of the sound velocity distribution in a given sea area without in-situ underwater measurements, we propose a multi-modal data-fusion generative adversarial network with residual attention blocks (MDF-RAGAN) for SSP construction. To improve the model's ability to capture global spatial feature correlations, we embed attention mechanisms and use residual modules to deeply capture the small disturbances in the deep-ocean sound velocity distribution caused by changes in sea surface temperature (SST). Experimental results on a real open dataset show that the proposed model outperforms other state-of-the-art methods, achieving an error of less than 0.3 m/s. Specifically, MDF-RAGAN not only outperforms the convolutional neural network (CNN) and spatial interpolation (SITP) baselines by nearly a factor of two, but also achieves about a 65.8% root-mean-square error (RMSE) reduction compared to the mean profile, reflecting how multi-source fusion and cross-modal attention enhance overall profile matching.
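The abstract describes cross-modal attention fusing heterogeneous remote-sensing inputs, but does not spell out the attention layout. A minimal NumPy sketch of generic scaled dot-product cross-attention, where features from one modality (queries) attend over another modality's features (keys/values); the token counts, feature dimension, and modality names are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats: np.ndarray, context_feats: np.ndarray) -> np.ndarray:
    """Fuse two modalities: query tokens (Nq, d) attend over context tokens (Nk, d).

    Returns one context-weighted feature vector per query token, shape (Nq, d).
    """
    d = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d)  # (Nq, Nk) similarity
    weights = softmax(scores, axis=-1)                   # rows sum to 1
    return weights @ context_feats

# Hypothetical shapes: 8 profile-side tokens attend over 4 SST-field tokens.
rng = np.random.default_rng(0)
ssp_tokens = rng.normal(size=(8, 16))
sst_tokens = rng.normal(size=(4, 16))
fused = cross_modal_attention(ssp_tokens, sst_tokens)
print(fused.shape)  # → (8, 16)
```

In a full model these fused features would typically be combined with the query branch through a residual connection, matching the residual attention blocks the abstract mentions.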