AI Summary
Remote sensing semantic segmentation suffers from annotation scarcity and poor generalization across sensor modalities, illumination conditions, and geographic domains. To address these challenges, we propose a domain generalization framework that integrates geospatial foundation models with masked autoencoders (MAEs), introducing a soft-alignment pseudo-labeling mechanism for generative pretraining from source to target domains. We theoretically analyze MAEs' role in learning domain-invariant feature representations. Our method operates without any target-domain annotations and significantly improves cross-domain segmentation accuracy and robustness on hyperspectral and multispectral data, achieving strong, sensor-agnostic generalization across multiple remote sensing benchmarks. By unifying scalable self-supervision with interpretable pseudo-label alignment, our approach establishes a new paradigm for low-resource remote sensing interpretation.
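The MAE pretraining mentioned above reconstructs images from heavily masked inputs, which encourages features that survive occlusion and domain shift. A minimal sketch of the random-masking step is shown below; this is a generic illustration of MAE-style masking, not the paper's implementation, and all names (`random_mask_patches`, `mask_ratio`) are hypothetical:

```python
import numpy as np

def random_mask_patches(patches, mask_ratio=0.75, rng=None):
    """Generic MAE-style random masking (illustrative, not the paper's code).

    patches: (N, D) array of flattened image patches.
    Returns the visible patches plus the kept and masked indices.
    """
    rng = np.random.default_rng(rng)
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))          # e.g. keep 25% of patches
    perm = rng.permutation(n)                   # random shuffle of patch indices
    keep_idx = np.sort(perm[:n_keep])           # patches the encoder sees
    mask_idx = np.sort(perm[n_keep:])           # patches the decoder must reconstruct
    return patches[keep_idx], keep_idx, mask_idx

# Toy input: 16 patches, each an 8-dimensional vector
patches = np.arange(16 * 8, dtype=float).reshape(16, 8)
visible, keep_idx, mask_idx = random_mask_patches(patches, mask_ratio=0.75, rng=0)
```

An encoder would process only `visible`, and a lightweight decoder would be trained to reconstruct the patches at `mask_idx`, which is what makes the pretext task cheap yet informative.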
Abstract
Remote sensing enables a wide range of critical applications such as land cover and land use mapping, crop yield prediction, and environmental monitoring. Advances in satellite technology have expanded remote sensing datasets, yet high-performance segmentation models remain dependent on extensive labeled data and are challenged by annotation scarcity and variability across sensors, illumination, and geography. Domain adaptation offers a promising route to improving model generalization. This paper introduces a domain generalization approach that leverages emerging geospatial foundation models, combining soft-alignment pseudo-labeling with source-to-target generative pre-training. We further provide new mathematical insights into MAE-based generative learning for domain-invariant feature learning. Experiments on hyperspectral and multispectral remote sensing datasets confirm our method's effectiveness in enhancing cross-domain adaptability and segmentation accuracy.
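Soft-alignment pseudo-labeling, as named in the abstract, typically means keeping the full class distribution predicted for unlabeled target pixels rather than hard argmax labels. The sketch below shows one common generic form (temperature-scaled softmax plus a confidence mask); the function name, temperature, and threshold are assumptions for illustration, not the paper's actual mechanism:

```python
import numpy as np

def soft_pseudo_labels(logits, temperature=2.0, threshold=0.5):
    """Generic soft pseudo-labeling sketch (illustrative only).

    logits: (N, C) unnormalized class scores for unlabeled target pixels.
    Returns soft labels (N, C) and a boolean confidence mask (N,).
    """
    # Temperature-scaled softmax: higher T gives softer distributions
    z = logits / temperature
    z -= z.max(axis=1, keepdims=True)           # subtract row max for numerical stability
    probs = np.exp(z)
    probs /= probs.sum(axis=1, keepdims=True)
    # Retain only pixels whose top-class probability clears the threshold
    mask = probs.max(axis=1) >= threshold
    return probs, mask

# Two toy pixels: one confident, one ambiguous
logits = np.array([[4.0, 1.0, 0.0],
                   [0.2, 0.1, 0.0]])
labels, mask = soft_pseudo_labels(logits)
```

During adaptation, only the masked-in soft labels would contribute to the target-domain loss, which limits the damage from noisy predictions early in training.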