🤖 AI Summary
To address the challenge of jointly estimating endmembers and abundances in hyperspectral unmixing, this paper proposes SWAN, a three-stage self-supervised wavelet neural network. Methodologically, it integrates biorthogonal wavelet transforms into an autoencoder architecture to obtain sparse, distributed, multi-scale wavelet-domain representations of the spectral bands. By modeling latent symmetries and enforcing physical constraints in a forward stage, namely abundance sum-to-one and non-negativity, the model achieves end-to-end joint estimation of endmembers and abundances without ground-truth labels. The architecture combines Sigmoid activation, Dropout regularization, and kernel regularizers that bound the magnitudes of the estimated endmember coefficients, and is optimized with Adam. Experiments on synthetic and real-world benchmark datasets show that the proposed method outperforms state-of-the-art deep unmixing approaches in accuracy, model compactness, and robustness to noise.
📝 Abstract
In this article, we present SWAN: a three-stage, self-supervised wavelet neural network for joint estimation of endmembers and abundances from hyperspectral imagery. The contiguous and overlapping hyperspectral band images are first expanded into a biorthogonal wavelet basis space, which provides sparse, distributed, and multi-scale representations. The idea is to exploit latent symmetries in the resulting invariant and covariant features using a self-supervised learning paradigm. The first stage, SWANencoder, maps the input wavelet coefficients to a compact lower-dimensional latent space. The second stage, SWANdecoder, uses the derived latent representation to reconstruct the input wavelet coefficients. The third stage, SWANforward, learns the underlying physics of the hyperspectral image. A combined three-stage loss function is formulated in the image acquisition domain, which eliminates the need for ground truth and enables self-supervised training. Adam is employed to optimize the proposed loss function, while a Sigmoid activation with a dropout rate of 0.3 is incorporated to avoid overfitting. Kernel regularizers bound the magnitudes and preserve spatial variations in the estimated endmember coefficients. At inference, the output of SWANencoder represents the estimated abundance maps, while the weights of SWANdecoder are retrieved to extract the endmembers. Experiments are conducted on two benchmark synthetic data sets with different signal-to-noise ratios as well as on three real benchmark hyperspectral data sets, comparing the results with several state-of-the-art neural network-based unmixing methods. The qualitative, quantitative, and ablation results show performance gains from learning a resilient unmixing function, along with self-supervision and compact network parameters suitable for practical applications.
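To make the physical constraints concrete, the sketch below illustrates in NumPy how an unmixing pipeline of this kind can enforce the abundance sum-to-one and non-negativity constraints and compute a self-supervised reconstruction loss under the linear mixing model. This is a minimal illustration, not the paper's actual wavelet-domain architecture: all dimensions, variable names, and the softmax constraint mechanism are assumptions for demonstration only.

```python
import numpy as np

# Hypothetical dimensions: B spectral bands, P endmembers, N pixels.
B, P, N = 50, 4, 100
rng = np.random.default_rng(0)

def softmax(z, axis=0):
    """Numerically stable softmax; enforces non-negativity and sum-to-one
    along the endmember axis (one common way to impose ASC + ANC)."""
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Stand-in for an encoder output: unconstrained latent scores per pixel.
latent_scores = rng.normal(size=(P, N))

# Abundance constraints applied via softmax: columns are valid abundances.
abundances = softmax(latent_scores, axis=0)           # shape (P, N)

# Stand-in for endmember spectra (e.g., recovered from decoder weights).
endmembers = np.abs(rng.normal(size=(B, P)))          # shape (B, P)

# Forward (physics) stage: linear mixing model, X ≈ E @ A.
reconstruction = endmembers @ abundances              # shape (B, N)

# Self-supervised loss in the acquisition domain: compare against the
# observed pixels themselves, so no ground-truth labels are required.
observed = endmembers @ abundances + 0.01 * rng.normal(size=(B, N))
loss = np.mean((observed - reconstruction) ** 2)
```

In a trained network the latent scores and endmember matrix would be learned jointly by backpropagating this reconstruction loss; the softmax here simply shows one standard way the sum-to-one and non-negativity constraints can be built into the abundance estimates by construction.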