AI Summary
This work addresses a key limitation in conventional spatiotemporal graph neural networks (GNNs), which rely on predefined kernel parameters that constrain their expressive power, while existing adaptive approaches often neglect geometric structure and degrade under data sparsity. Theoretically, the paper demonstrates that misspecification of kernel parameters introduces irreducible approximation errors. To overcome this, the authors propose AdaKernel, a method that embeds learnable kernel parameters into GNNs while preserving the underlying physical interaction structure, thereby integrating geometric priors with adaptive flexibility. Through a structure-preserving adaptive mechanism, AdaKernel enhances spatial dependency modeling and consistently improves performance across diverse GNN architectures on Kriging, imputation, and forecasting tasks, outperforming model-agnostic adaptive baselines.
Abstract
Modeling spatial dependencies is central to spatiotemporal data analysis using Graph Neural Networks (GNNs). Traditional methods rely on distance-based kernels with predefined parameters, which restricts model capacity. Although generic adaptive mechanisms (e.g., Graph Attention Networks) offer flexibility, they often fail to capture the underlying geometric structure, performing worse than distance-based models in data-sparse scenarios. Addressing this, we revisit the kernel parameterization problem and theoretically prove that misspecified kernel parameters introduce unavoidable approximation errors in GNNs. To overcome this, we propose AdaKernel, a simple yet effective approach that learns adaptive kernel parameters within the neural network. Unlike methods that learn graph structures from scratch, AdaKernel adopts a structure-preserving strategy that optimizes the scale of physical interactions rather than discarding them. Extensive experiments on Kriging, Imputation, and Forecasting demonstrate that AdaKernel consistently improves various GNN architectures and outperforms model-agnostic adaptive baselines, validating that accurately learned kernel parameters are superior to both fixed priors and fully latent graph structures.
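The contrast between a fixed and a learned kernel parameter can be illustrated with a toy example. The sketch below is not the AdaKernel implementation: it assumes a Gaussian kernel of the form exp(-d^2 / sigma^2), a single scalar bandwidth `sigma`, a kernel-weighted mean in place of a GNN layer, and a finite-difference gradient in place of backpropagation. All function names and the synthetic data are hypothetical. The pairwise distances stay fixed and only the kernel scale is optimized, which mirrors the structure-preserving idea in spirit.

```python
import math

def kernel_weights(dists, sigma):
    """Row-normalized Gaussian kernel weights, w_j proportional to exp(-d_j^2 / sigma^2)."""
    raw = [math.exp(-(d * d) / (sigma * sigma)) for d in dists]
    total = sum(raw)
    return [r / total for r in raw]

def predict(dists, values, sigma):
    """Estimate a target node as the kernel-weighted mean of its neighbors."""
    return sum(w * v for w, v in zip(kernel_weights(dists, sigma), values))

def loss(samples, log_sigma):
    """Mean squared error over (distances, neighbor values, target) triples."""
    sigma = math.exp(log_sigma)  # optimizing log(sigma) keeps sigma positive
    return sum((predict(d, v, sigma) - y) ** 2 for d, v, y in samples) / len(samples)

def fit_log_sigma(samples, log_sigma, lr=0.5, steps=300, eps=1e-5):
    """Gradient descent on log(sigma) using a central finite-difference gradient.

    A step is kept only if it lowers the loss; otherwise the step size is halved.
    """
    current = loss(samples, log_sigma)
    for _ in range(steps):
        grad = (loss(samples, log_sigma + eps) - loss(samples, log_sigma - eps)) / (2 * eps)
        candidate = log_sigma - lr * grad
        candidate_loss = loss(samples, candidate)
        if candidate_loss < current:
            log_sigma, current = candidate, candidate_loss
        else:
            lr *= 0.5
    return log_sigma

# Synthetic data (assumed for illustration): targets are generated with sigma = 1.0.
TRUE_SIGMA = 1.0
neighbors = [
    ([0.5, 1.0, 2.0], [1.0, 2.0, 4.0]),
    ([1.5, 0.3, 2.5], [3.0, 0.5, 1.0]),
    ([1.0, 1.2, 0.4], [2.0, 2.5, 0.0]),
]
samples = [(d, v, predict(d, v, TRUE_SIGMA)) for d, v in neighbors]

# A misspecified fixed bandwidth versus one learned from the data.
init_log_sigma = math.log(5.0)
init_loss = loss(samples, init_log_sigma)
fitted_log_sigma = fit_log_sigma(samples, init_log_sigma)
final_loss = loss(samples, fitted_log_sigma)

print(f"fixed sigma=5.0, loss={init_loss:.4f}")
print(f"learned sigma={math.exp(fitted_log_sigma):.3f}, loss={final_loss:.6f}")
```

In this toy setting the misspecified bandwidth leaves a residual error that no choice of downstream weights could remove, because the prediction depends on the data only through the kernel weights. Making `sigma` trainable lets the optimizer reduce that error while the distance-based neighborhood structure itself is untouched.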