Calibrating Geophysical Predictions under Constrained Probabilistic Distributions

📅 2025-11-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Machine learning models for geophysical systems often achieve high short-term prediction accuracy but fail to preserve long-term statistical properties—such as marginal distributions of state variables—and struggle to maintain dynamical consistency under sparse-data regimes. Method: This paper proposes a distribution-aware machine learning framework that incorporates known marginal distributions as physical priors. It enforces non-local physics-informed consistency via the kernelized Stein discrepancy (KSD) in a reproducing kernel Hilbert space, coupled with a normalization calibration mechanism to jointly optimize point-wise predictions and long-term attractor fidelity. Results: Evaluated on offline CO₂ flux inversion and online quasi-geostrophic flow simulation, the method significantly improves short-term forecasting accuracy while rigorously preserving long-term marginal distributions—resolving the critical trade-off between “accurate prediction” and “statistical bias” under data sparsity.

📝 Abstract
Machine learning (ML) has shown significant promise in studying complex geophysical dynamical systems, including turbulence and climate processes. Such systems often display sensitive dependence on initial conditions, reflected in positive Lyapunov exponents, where even small perturbations in short-term forecasts can lead to large deviations in long-term outcomes. Meaningful inference therefore requires not only accurate short-term predictions but also consistency with the system's long-term attractor, as captured by the marginal distribution of state variables. Existing approaches attempt to address this challenge by incorporating spatial and temporal dependence, but these strategies become impractical when data are extremely sparse. In this work, we show that prior knowledge of marginal distributions offers valuable complementary information to short-term observations, motivating a distribution-informed learning framework. We introduce a calibration algorithm based on normalization and the Kernelized Stein Discrepancy (KSD) to enhance ML predictions. The method employs KSD within a reproducing kernel Hilbert space to calibrate model outputs, improving their fidelity to known physical distributions. This not only sharpens pointwise predictions but also enforces consistency with non-local statistical structures rooted in physical principles. Through synthetic experiments, spanning offline climatological CO₂ fluxes and online quasi-geostrophic flow simulations, we demonstrate the robustness and broad utility of the proposed framework.
Problem

Research questions and friction points this paper is trying to address.

Calibrating geophysical predictions with constrained distributions
Improving ML model fidelity to known physical distributions
Addressing sparse data challenges in geophysical inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Calibration algorithm using normalization and Kernelized Stein Discrepancy
Distribution-informed learning framework leveraging prior marginal distributions
Enhancing ML predictions with consistency to physical statistical structures
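The paper's full calibration algorithm is not reproduced here, but its core quantity, a sample-based estimate of the squared Kernelized Stein Discrepancy against a known target density, can be sketched as follows. This is a minimal 1-D illustration using an RBF kernel and a standard-normal target; the function name `ksd_rbf` and the fixed bandwidth `h` are illustrative choices, not taken from the paper:

```python
import numpy as np

def ksd_rbf(x, score, h=1.0):
    """V-statistic estimate of the squared KSD for 1-D samples x
    against a target density with score function s(x) = d/dx log p(x),
    using an RBF kernel k(x, y) = exp(-(x - y)^2 / (2 h^2))."""
    x = np.asarray(x, dtype=float)
    sx = score(x)                           # target score at each sample
    d = x[:, None] - x[None, :]             # pairwise differences x_i - x_j
    k = np.exp(-d**2 / (2 * h**2))          # kernel matrix k(x_i, x_j)
    dkx = -d / h**2 * k                     # d k / d x_i
    dky = d / h**2 * k                      # d k / d x_j
    dkxy = (1.0 / h**2 - d**2 / h**4) * k   # d^2 k / (d x_i d x_j)
    # Stein kernel: u(x,y) = s(x)s(y)k + s(x) dk/dy + s(y) dk/dx + d2k/dxdy
    u = (sx[:, None] * sx[None, :] * k
         + sx[:, None] * dky
         + sx[None, :] * dkx
         + dkxy)
    return u.mean()

rng = np.random.default_rng(0)
score = lambda x: -x                        # score of the standard normal
good = ksd_rbf(rng.normal(0, 1, 500), score)  # samples match the target
bad = ksd_rbf(rng.normal(2, 1, 500), score)   # mean-shifted samples
```

A calibration scheme in the spirit of the paper would penalize (or minimize) such a discrepancy between model outputs and the known marginal distribution, alongside the usual point-wise prediction loss; here the mean-shifted sample set yields a visibly larger discrepancy than the well-matched one.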
Zhewen Hou
Department of Statistics, Columbia University, NY
Jiajin Sun
Department of Statistics, Florida State University, FL
Subashree Venkatasubramanian
Department of Computer Science, Columbia University, NY
Peter Jin
UC Berkeley
Shuolin Li
NSF Science and Technology Center for Learning the Earth with AI and Physics, Columbia University, NY
Tian Zheng
Department of Statistics, Columbia University, NY; NSF Science and Technology Center for Learning the Earth with AI and Physics, Columbia University, NY