🤖 AI Summary
3D synthetic aperture sonar (SAS) reconstruction requires jointly recovering the spatial distribution of scatterers and their direction-dependent scattering responses. Conventional time-domain backprojection does not model directional scattering and can suffer from aliasing and occlusion artifacts, while existing neural volumetric approaches treat each voxel as an isotropic density and so fail to capture anisotropic scattering. This paper proposes SH-SAS, an implicit neural representation that expresses the complex scattering field as spherical harmonic (SH) coefficients modeled as continuous functions of position: the zeroth-order coefficient encodes scattering density, and higher-order coefficients encode the angular response. A lightweight MLP, fed by a multi-resolution hash encoding, is trained end-to-end directly from 1D time-of-flight signals, without beamformed images for supervision. On both synthetic and real-world SAS data, SH-SAS outperforms previous methods in 3D reconstruction quality and geometric metrics.
📝 Abstract
Synthetic aperture sonar (SAS) reconstruction requires recovering both the spatial distribution of acoustic scatterers and their direction-dependent response. Time-domain backprojection is the most common 3D SAS reconstruction algorithm, but it does not model directionality and can suffer from sampling limitations, aliasing, and occlusion. Prior neural volumetric methods applied to synthetic aperture sonar treat each voxel as an isotropic scattering density, not modeling anisotropic returns. We introduce SH-SAS, an implicit neural representation that expresses the complex acoustic scattering field as a set of spherical harmonic (SH) coefficients. A multi-resolution hash encoder feeds a lightweight MLP that outputs complex SH coefficients up to a specified degree L. The zeroth-order coefficient acts as an isotropic scattering field, which also serves as the density term, while higher orders compactly capture directional scattering with minimal parameter overhead. Because the model predicts the complex amplitude for any transmit-receive baseline, training is performed directly from 1-D time-of-flight signals without the need to beamform intermediate images for supervision. Across synthetic and real SAS (both in-air and underwater) benchmarks, results show that SH-SAS performs better in terms of 3D reconstruction quality and geometric metrics than previous methods.
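To make the role of the SH coefficients concrete, the following is a minimal sketch of how a set of complex coefficients at one spatial point could be turned into a direction-dependent scattering amplitude. It is an illustration under stated assumptions, not the paper's implementation: it assumes a real-valued SH basis with complex weights, truncates at degree L = 1 (four coefficients, since a degree-L expansion has (L+1)^2 terms), and the function names `real_sh_basis_deg1` and `scattering_amplitude` are hypothetical. In SH-SAS the coefficients would come from the hash-encoded MLP; here they are fixed by hand.

```python
import numpy as np


def real_sh_basis_deg1(direction):
    """Real spherical harmonics up to degree 1 for a direction (x, y, z).

    Returns [Y_0^0, Y_1^-1, Y_1^0, Y_1^1]. The direction is normalised first.
    """
    d = np.asarray(direction, dtype=float)
    x, y, z = d / np.linalg.norm(d)
    c0 = 0.5 * np.sqrt(1.0 / np.pi)        # constant (isotropic) term
    c1 = np.sqrt(3.0 / (4.0 * np.pi))      # shared degree-1 normalisation
    return np.array([c0, c1 * y, c1 * z, c1 * x])


def scattering_amplitude(coeffs, direction):
    """Complex amplitude sum_k coeffs[k] * Y_k(direction) at one point.

    `coeffs` holds the complex SH coefficients for that point
    (assumed ordering matches real_sh_basis_deg1).
    """
    basis = real_sh_basis_deg1(direction)
    return np.dot(np.asarray(coeffs, dtype=complex), basis)


# Hand-picked coefficients: an isotropic part plus a lobe along the z axis.
coeffs = np.array([1.0 + 0.5j, 0.0, 0.2 - 0.1j, 0.0])

front = scattering_amplitude(coeffs, [0.0, 0.0, 1.0])   # looking along +z
back = scattering_amplitude(coeffs, [0.0, 0.0, -1.0])   # looking along -z
side = scattering_amplitude(coeffs, [1.0, 0.0, 0.0])    # looking along +x

print(abs(front), abs(back), abs(side))
```

With these coefficients the side-on view sees only the zeroth-order (isotropic) term, while the +z and -z views differ because the degree-1 term changes sign. That is the kind of anisotropy a per-voxel isotropic density cannot represent.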