🤖 AI Summary
Vector quantization (VQ) underperforms in spatio-temporal forecasting for two reasons: its hard assignment is non-differentiable, which impedes end-to-end optimization, and it limits representational capacity, which hurts prediction accuracy. To address this, we propose Differentiable Sparse Soft-Vector Quantization (SVQ), a soft VQ framework grounded in sparse regression and the first VQ method to improve spatio-temporal forecasting. SVQ is fully differentiable and trains end to end while enhancing pattern modeling capability. It couples a two-layer MLP with a large-scale codebook to implement soft assignment, propagates continuous gradients through the quantizer, and adopts lightweight decoding to preserve fine-grained details while suppressing noise. Evaluated on five benchmark datasets, SVQ achieves state-of-the-art results: a 7.9% improvement on the WeatherBench-S temperature dataset, a 9.4% average reduction in mean absolute error (MAE) on video prediction benchmarks (Human3.6M, KTH, and KittiCaltech), and a 17.3% improvement in image quality as measured by LPIPS.
📝 Abstract
Spatio-temporal forecasting is crucial in various fields and requires a careful balance between identifying subtle patterns and filtering out noise. Vector quantization (VQ) appears well-suited for this purpose, as it quantizes input vectors into a set of codebook vectors or patterns. Although VQ has shown promise in various computer vision tasks, it surprisingly falls short in enhancing the accuracy of spatio-temporal forecasting. We attribute this to two main issues: inaccurate optimization due to non-differentiability and limited representation power in hard-VQ. To tackle these challenges, we introduce Differentiable Sparse Soft-Vector Quantization (SVQ), the first VQ method to enhance spatio-temporal forecasting. SVQ balances detail preservation with noise reduction, offering full differentiability and a solid foundation in sparse regression. Our approach employs a two-layer MLP and an extensive codebook to streamline the sparse regression process, significantly cutting computational costs while simplifying training and improving performance. Empirical studies on five spatio-temporal benchmark datasets show SVQ achieves state-of-the-art results, including a 7.9% improvement on the WeatherBench-S temperature dataset and an average mean absolute error reduction of 9.4% in video prediction benchmarks (Human3.6M, KTH, and KittiCaltech), along with a 17.3% enhancement in image quality (LPIPS). Code is publicly available at https://github.com/Pachark/SVQ-Forecasting.
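To make the contrast between hard and soft assignment concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the function names (`hard_vq`, `soft_vq`, `matvec`), the ReLU activation, the absence of any normalization or sparsity constraint on the coefficients, and all sizes are assumptions chosen for illustration. Hard VQ snaps an input to its single nearest codeword, while the soft variant lets a two-layer MLP produce one coefficient per codeword and returns the weighted sum of the codebook.

```python
import random


def hard_vq(x, codebook):
    """Hard VQ: return the single nearest codeword (an argmin, so not differentiable)."""
    def sq_dist(c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return min(codebook, key=sq_dist)


def matvec(W, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def soft_vq(x, W1, W2, codebook):
    """Soft VQ sketch: a two-layer MLP maps x to one coefficient per codeword,
    and the output is the coefficient-weighted sum of all codewords."""
    hidden = [max(0.0, h) for h in matvec(W1, x)]  # layer 1 + ReLU (assumed activation)
    coeffs = matvec(W2, hidden)                    # layer 2: one coefficient per codeword
    dim = len(codebook[0])
    return [sum(a * c[d] for a, c in zip(coeffs, codebook)) for d in range(dim)]


if __name__ == "__main__":
    rng = random.Random(0)
    dim, hidden_dim, num_codes = 4, 8, 64  # illustrative sizes only (assumption)
    codebook = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_codes)]
    W1 = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden_dim)]
    W2 = [[rng.gauss(0, 0.1) for _ in range(hidden_dim)] for _ in range(num_codes)]
    x = [rng.gauss(0, 1) for _ in range(dim)]
    print("hard:", hard_vq(x, codebook))
    print("soft:", soft_vq(x, W1, W2, codebook))
```

Because `soft_vq` is built only from matrix products and a ReLU, its output varies continuously with the input and the weights, so an autodiff framework could pass gradients through it. The `min` in `hard_vq` instead jumps from one codeword to another as the input crosses a decision boundary.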