🤖 AI Summary
To address core challenges in discovering physical laws from scarce and noisy data, namely poor noise robustness, low data efficiency, and unreliable uncertainty quantification (UQ), this paper proposes Langevin-Assisted Active Physical Discovery (LAPD), a Bayesian framework that integrates replica-exchange stochastic gradient Langevin Monte Carlo with active learning. The method performs system identification and UQ jointly, and a hybrid uncertainty–space-filling acquisition function sequentially selects informative measurements, enabling robust, interpretable, and data-efficient dynamical modeling. It unifies Bayesian model averaging, uncertainty calibration, and sparse regression priors. Evaluated on the Lotka–Volterra, Lorenz, Burgers, and convection–diffusion equations, the approach is robust to noisy, limited data and better calibrated than existing methods; relative to random sampling, active learning reduces the required measurements by around 60% for the Lotka–Volterra system and around 40% for Burgers' equation. The framework thus advances physics discovery under realistic, low-quality data regimes.
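The hybrid uncertainty–space-filling selection mentioned above could look roughly like the following sketch. The summary does not give the paper's exact scoring rule, so everything here is an assumption made for illustration: the function name `hybrid_acquisition`, the `weight` parameter, the min-max rescaling, and the use of ensemble standard deviation and nearest-neighbor distance as the two terms.

```python
import numpy as np

def hybrid_acquisition(ensemble_preds, candidates, sampled, weight=0.5):
    """Score candidate points by a weighted sum of two terms (hypothetical form).

    ensemble_preds: (n_models, n_candidates) predictions from sampled models
    candidates:     (n_candidates, dim) locations that could be measured next
    sampled:        (n_sampled, dim) locations already measured
    """
    # Uncertainty term: disagreement between ensemble members at each candidate.
    uncertainty = ensemble_preds.std(axis=0)
    # Space-filling term: distance from each candidate to its nearest measured point.
    diffs = candidates[:, None, :] - sampled[None, :, :]
    nearest = np.linalg.norm(diffs, axis=-1).min(axis=1)

    def rescale(v):
        # Min-max rescaling so the two terms are comparable (an assumption).
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    return weight * rescale(uncertainty) + (1.0 - weight) * rescale(nearest)

# Toy usage: three linear models with different slopes on a 1-D grid.
candidates = np.linspace(0.0, 1.0, 11)[:, None]
sampled = np.array([[0.0], [0.1]])
slopes = np.array([0.9, 1.0, 1.1])
ensemble_preds = slopes[:, None] * candidates[:, 0][None, :]
scores = hybrid_acquisition(ensemble_preds, candidates, sampled)
next_idx = int(np.argmax(scores))  # index of the candidate to measure next
```

In this toy case both terms grow toward the right end of the grid, so the far endpoint is selected. A larger `weight` favors model disagreement, while a smaller one favors coverage of unexplored regions.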
📝 Abstract
Discovering physical laws from data is a fundamental challenge in scientific research, particularly when high-quality data are scarce or costly to obtain. Traditional methods for identifying dynamical systems often struggle with noise sensitivity, inefficient data usage, and an inability to quantify uncertainty effectively. To address these challenges, we propose Langevin-Assisted Active Physical Discovery (LAPD), a Bayesian framework that integrates replica-exchange stochastic gradient Langevin Monte Carlo to simultaneously enable efficient system identification and robust uncertainty quantification (UQ). By balancing gradient-driven exploration of the coefficient space with exploitation that generates an ensemble of candidate models, LAPD achieves reliable, uncertainty-aware identification from noisy data. When data are scarce, the probabilistic foundation of LAPD further supports the integration of active learning (AL) via a hybrid uncertainty-space-filling acquisition function. This strategy sequentially selects informative data to reduce data collection costs while maintaining accuracy. We evaluate LAPD on diverse nonlinear systems, including the Lotka-Volterra, Lorenz, Burgers, and Convection-Diffusion equations, demonstrating its robustness with noisy and limited data as well as superior uncertainty calibration compared to existing methods. The AL extension reduces the required measurements by around 60% for the Lotka-Volterra system and by around 40% for Burgers' equation compared to random data sampling, highlighting its potential for resource-constrained experiments. Our framework establishes a scalable, uncertainty-aware methodology for data-efficient discovery of dynamical systems, with broad applicability to problems where high-fidelity data acquisition is prohibitively expensive.
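To make the replica-exchange idea concrete, here is a minimal sketch of two Langevin chains at different temperatures sampling the coefficients of a sparse linear model over a library of candidate terms. It is not the LAPD implementation: it uses full-batch gradients rather than stochastic ones, a Laplace prior as a stand-in for a sparsity prior, fixed temperatures, and a plain Metropolis-style swap test. The names `energy_and_grad` and `replica_exchange_langevin`, and all default values, are hypothetical.

```python
import numpy as np

def energy_and_grad(theta, X, y, noise_var, prior_scale):
    """Negative log posterior (up to a constant) and its (sub)gradient.

    Gaussian likelihood plus a Laplace prior; both are assumptions of this sketch.
    """
    resid = X @ theta - y
    energy = 0.5 * resid @ resid / noise_var + np.abs(theta).sum() / prior_scale
    grad = X.T @ resid / noise_var + np.sign(theta) / prior_scale
    return energy, grad

def replica_exchange_langevin(X, y, n_steps=2000, step=1e-5, temps=(1.0, 10.0),
                              noise_var=0.0025, prior_scale=1.0, seed=0):
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    chains = [np.zeros(d), np.zeros(d)]  # chains[0] is cold, chains[1] is hot
    kept = []
    for it in range(n_steps):
        # Langevin update: gradient drift plus temperature-scaled Gaussian noise.
        for k, temp in enumerate(temps):
            _, grad = energy_and_grad(chains[k], X, y, noise_var, prior_scale)
            noise = rng.standard_normal(d)
            chains[k] = chains[k] - step * grad + np.sqrt(2.0 * step * temp) * noise
        # Swap test: always accepted when the hot chain has found lower energy.
        e_cold, _ = energy_and_grad(chains[0], X, y, noise_var, prior_scale)
        e_hot, _ = energy_and_grad(chains[1], X, y, noise_var, prior_scale)
        log_accept = (1.0 / temps[0] - 1.0 / temps[1]) * (e_cold - e_hot)
        if np.log(rng.uniform()) < log_accept:
            chains[0], chains[1] = chains[1], chains[0]
        # Keep cold-chain states from the second half as the model ensemble.
        if it >= n_steps // 2:
            kept.append(chains[0].copy())
    return np.array(kept)

# Toy usage: recover sparse coefficients from a synthetic library with noise.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 4))
true_theta = np.array([1.0, 0.0, -0.5, 0.0])
y = X @ true_theta + 0.05 * rng.standard_normal(100)
samples = replica_exchange_langevin(X, y)
post_mean = samples.mean(axis=0)  # point estimate of the coefficients
post_std = samples.std(axis=0)    # spread of the ensemble, a simple UQ proxy
```

The hot chain explores the coefficient space more freely, while the cold chain's retained states form the ensemble whose spread serves as an uncertainty estimate. The step size here is tuned to this toy problem and would need adjusting for other libraries or noise levels.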