🤖 AI Summary
This study addresses two coupled challenges in joint species distribution modeling: environmental drivers and species distributions vary over space and time, and species co-occurrence exhibits nonlinear community structure with a long-tailed imbalance caused by rare species. The authors propose STELLAR, a joint modeling framework built on a shared latent space that combines a graph-temporal encoder, context-anchored latent alignment, and an imbalance-aware decoupled decoding module. The framework handles spatiotemporal dynamics, community structure, and long-tailed learning within a single architecture rather than in isolation. It uses supervised contrastive learning to structure the latent space and an asymmetric loss to focus learning on hard, rare-species samples. Experiments on the large-scale eBird dataset show that the method significantly outperforms state-of-the-art baselines, particularly in predicting rare species, and reveals interpretable species interactions.
📝 Abstract
Joint Species Distribution Modeling (JSDM) is a key enabler for biodiversity monitoring and conservation planning. However, accurate JSDM faces two coupled challenges: environmental drivers and species distributions are inherently spatio-temporal, while species co-occurrence patterns exhibit complex non-linear community structure and severe long-tail imbalance driven by rare species. Existing approaches often address these factors in isolation, learning from static covariates or neglecting the historical trajectories of dynamic community structure. To overcome these limitations, we propose STELLAR (Spatio-Temporal Environmental Learning with Latent Alignment and Refinement), a novel framework that learns a shared latent space where dynamic habitat context and community structure are optimized jointly. Our approach integrates three complementary components: (1) a Graph-Temporal Encoder that employs graph attention and recurrent units to aggregate spatial neighborhood effects and capture the co-evolving historical dynamics of environmental context and community structure; (2) a Context-Anchored Latent Alignment mechanism that structures the latent space using a label-activated mixture prior and supervised contrastive learning, actively clustering species based on shared environmental preferences; and (3) an Imbalance-Aware Decoupled Decoding module that utilizes Asymmetric Loss to focus learning on hard, rare species samples, preventing mode collapse in the long tail. Experiments on the large-scale eBird dataset, curated with domain experts, demonstrate that our framework significantly outperforms state-of-the-art baselines, particularly in predicting rare species and revealing interpretable species interactions.
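To make the imbalance-aware decoding idea concrete, the following is a minimal sketch of an asymmetric loss for multi-label presence/absence prediction at a single site. It is not the authors' implementation: the function name `asymmetric_loss`, the parameter names, and the default values (`gamma_pos`, `gamma_neg`, `margin`) are illustrative assumptions. The sketch down-weights easy negatives (absent species predicted with low probability) so that positives, which are scarce for rare species, contribute more to the total loss.

```python
import math

def asymmetric_loss(logits, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Asymmetric loss for one site (illustrative sketch, assumed defaults).

    logits  -- per-species raw scores
    targets -- per-species labels, 1 = present, 0 = absent
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))  # predicted presence probability
        if y == 1:
            # Positive term: focal-style weight with a small exponent,
            # so positives keep most of their gradient.
            total -= (1.0 - p) ** gamma_pos * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by a margin, so
            # very easy negatives contribute exactly zero, then apply a
            # larger focusing exponent to the rest.
            p_shift = max(p - margin, 0.0)
            total -= p_shift ** gamma_neg * math.log(max(1.0 - p_shift, eps))
    return total

# Hypothetical site with three species: one present, two absent.
site_loss = asymmetric_loss([-2.0, -10.0, 2.0], [1, 0, 0])
```

With these assumed defaults, a missed positive (logit -2.0, label 1) is penalized more heavily than an equally confident false alarm (logit 2.0, label 0), which is the asymmetry that favors rare species in a long-tailed label distribution.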