🤖 AI Summary
This study addresses the challenges of automatic sea ice classification in Arctic SAR imagery, particularly the difficulty of distinguishing morphologically similar ice types under severe class imbalance. Building on the AI4Arctic/ASIP Sea Ice Dataset (v2), the authors establish a SAR-only Vision Transformer (ViT) baseline that combines full-resolution Sentinel-1 Extra Wide inputs, leakage-aware stratified patch splitting, training-set normalization, and SIGRID-3 stage-of-development labels. To improve the precision-recall trade-off for rare ice classes, they train with focal loss and compare it against cross-entropy and weighted cross-entropy. Among the tested configurations, ViT-Large with focal loss achieves 69.6% held-out accuracy and 68.8% weighted F1, with 83.9% precision on the minority Multi-Year Ice class, providing a SAR-only baseline for future multimodal fusion approaches.
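As a rough illustration of the two data-handling ideas named above, the sketch below keeps all patches from one scene in the same split and fits normalization statistics on the training split only. It is a minimal example under stated assumptions, not the authors' pipeline: the `scene_id` and `pixels` fields, the function names, and the 20% test fraction are all hypothetical, and the class stratification the paper describes is omitted for brevity.

```python
import random
import statistics


def split_by_scene(patches, test_fraction=0.2, seed=0):
    """Assign whole scenes to train or test so no scene contributes to both.

    Assumption: each patch is a dict with a hypothetical "scene_id" key.
    """
    scene_ids = sorted({p["scene_id"] for p in patches})
    rng = random.Random(seed)
    rng.shuffle(scene_ids)
    n_test = max(1, round(len(scene_ids) * test_fraction))
    test_scenes = set(scene_ids[:n_test])
    train = [p for p in patches if p["scene_id"] not in test_scenes]
    test = [p for p in patches if p["scene_id"] in test_scenes]
    return train, test


def fit_normalizer(train_patches):
    """Compute mean and standard deviation from training patches only."""
    values = [v for p in train_patches for v in p["pixels"]]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against constant input
    return mean, std


def normalize(patch, mean, std):
    """Apply training-set statistics to any patch (train or test)."""
    return [(v - mean) / std for v in patch["pixels"]]
```

Splitting at the scene level matters because neighbouring patches from one SAR scene are strongly correlated. A purely random patch split would let near-duplicates of test patches appear in training and inflate held-out scores.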
📝 Abstract
Accurate and automated sea ice classification is important for climate monitoring and maritime safety in the Arctic. While Synthetic Aperture Radar (SAR) is the operational standard because of its all-weather capability, it remains challenging to distinguish morphologically similar ice classes under severe class imbalance. Rather than claiming a fully validated multimodal system, this paper establishes a trustworthy SAR-only baseline that future fusion work can build upon. Using the AI4Arctic/ASIP Sea Ice Dataset (v2), which contains 461 Sentinel-1 scenes matched with expert ice charts, we combine full-resolution Sentinel-1 Extra Wide inputs, leakage-aware stratified patch splitting, SIGRID-3 stage-of-development labels, and training-set normalization to evaluate Vision Transformer baselines. We compare ViT-Base models trained with cross-entropy and weighted cross-entropy against a ViT-Large model trained with focal loss. Among the tested configurations, ViT-Large with focal loss achieves 69.6% held-out accuracy, 68.8% weighted F1, and 83.9% precision on the minority Multi-Year Ice class. These results show that focal-loss training offers a more useful precision-recall trade-off than weighted cross-entropy for rare ice classes and establishes a cleaner baseline for future multimodal fusion with optical, thermal, or meteorological data.
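For readers unfamiliar with focal loss, the sketch below shows the standard per-sample form, FL(p_t) = -(1 - p_t)^gamma * log(p_t), next to plain cross-entropy. It is a minimal, dependency-free illustration rather than the paper's training code: the value gamma = 2.0 is a common default assumed here, not a setting reported in the abstract, and the optional per-class alpha weighting is left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one sample: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an assumed default; gamma=0 reduces to cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)
```

The factor (1 - p_t)^gamma shrinks the loss of samples the model already classifies confidently, so gradient updates are dominated by hard examples, which under class imbalance often come from rare classes. Weighted cross-entropy instead rescales every sample of a class by a fixed weight, regardless of how easy that sample is.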