🤖 AI Summary
To address weak representation learning for hyperspectral imagery (HSI) and poor soil organic carbon (SOC) estimation under limited labeled samples, this paper proposes SpecBPP—a self-supervised pretraining framework based on spectral band permutation prediction. Leveraging the intrinsic spectral continuity of HSI, SpecBPP formulates spectral channel order recovery as a pretext task and incorporates a curriculum learning strategy that progressively increases permutation complexity, effectively mitigating the factorial-scale search space. After pretraining on EnMAP data and fine-tuning for SOC estimation, SpecBPP achieves an R² of 0.9456, RMSE of 1.1053%, and RPD of 4.19—substantially outperforming MAE, JEPA, and conventional methods. The core innovation is that spectral sequence ordering is incorporated into self-supervised learning as a structural prior for the first time, enabling efficient and discriminative representation learning via curriculum-driven permutation prediction.
📝 Abstract
Self-supervised learning has revolutionized representation learning in vision and language, but remains underexplored for hyperspectral imagery (HSI), where the sequential structure of spectral bands offers unique opportunities. In this work, we propose Spectral Band Permutation Prediction (SpecBPP), a novel self-supervised learning framework that leverages the inherent spectral continuity in HSI. Instead of reconstructing masked bands, SpecBPP challenges a model to recover the correct order of shuffled spectral segments, encouraging global spectral understanding. We implement a curriculum-based training strategy that progressively increases permutation difficulty to manage the factorial complexity of the permutation space. Applied to Soil Organic Carbon (SOC) estimation using EnMAP satellite data, our method achieves state-of-the-art results, outperforming both masked autoencoder (MAE) and joint-embedding predictive architecture (JEPA) baselines. Fine-tuned on limited labeled samples, our model yields an $R^2$ of 0.9456, RMSE of 1.1053%, and RPD of 4.19, significantly surpassing traditional and self-supervised benchmarks. Our results demonstrate that spectral order prediction is a powerful pretext task for hyperspectral understanding, opening new avenues for scientific representation learning in remote sensing and beyond.
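To make the pretext task concrete, here is a minimal sketch of how shuffled-segment samples and a segment-count curriculum could be generated. All names, the stage boundaries, the 224-band spectrum, and the capped permutation pool are illustrative assumptions, not the authors' implementation:

```python
import itertools
import random
import numpy as np

def make_permutation_pool(n_segments, max_perms=24):
    # Enumerate a capped subset of permutations so the classification
    # label space stays tractable despite factorial growth (assumption:
    # the pool size is a hyperparameter, here capped at 24).
    perms = list(itertools.permutations(range(n_segments)))
    random.Random(0).shuffle(perms)
    return perms[:max_perms]

def make_pretext_sample(spectrum, n_segments, perm_pool, rng):
    # Split a 1-D spectrum into contiguous segments, shuffle them by a
    # permutation drawn from the pool, and return (input, class label).
    segments = np.array_split(spectrum, n_segments)
    label = rng.randrange(len(perm_pool))
    shuffled = np.concatenate([segments[i] for i in perm_pool[label]])
    return shuffled, label

def curriculum_schedule(epoch, stages=(2, 3, 4, 6)):
    # Increase the number of shuffled segments as training progresses
    # (hypothetical boundary: advance one stage every 10 epochs).
    return stages[min(epoch // 10, len(stages) - 1)]

# Example: one pretext sample at the curriculum stage for epoch 25.
rng = random.Random(42)
spectrum = np.linspace(0.0, 1.0, 224)  # e.g. an EnMAP-like 224-band spectrum
n_seg = curriculum_schedule(25)        # epoch 25 -> 4 segments
pool = make_permutation_pool(n_seg)
x, y = make_pretext_sample(spectrum, n_seg, pool, rng)
```

A model trained on such samples predicts the label `y` (the permutation index) from the shuffled spectrum `x`, which forces it to capture global spectral ordering rather than local reconstruction detail.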