Towards a data-scale independent regulariser for robust sparse identification of non-linear dynamics

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a critical limitation of magnitude-threshold-based sparse regression methods such as SINDy: when noisy data are normalized, the sparsity assumption breaks down, yielding dense, uninterpretable models that can be physically incorrect. To overcome this, the authors propose the Sequential Thresholding of Coefficient of Variation (STCV) algorithm, which uses statistical significance rather than coefficient magnitude as the sparsity criterion. By introducing a dimensionless metric termed “Coefficient Presence” (CP) and integrating sequential thresholding with the coefficient of variation, STCV identifies the true dynamical terms irrespective of data scaling. This makes the discovery process invariant to arbitrary data scaling and more reliable on normalized, noisy data. Experiments show that STCV outperforms STLSQ and E-SINDy on normalized, noisy datasets from several canonical systems and from a physical mass-spring-damper experiment, recovering sparse, physically consistent governing equations.

📝 Abstract
Data normalisation, a common and often necessary preprocessing step in engineering and scientific applications, can severely distort the discovery of governing equations by magnitude-based sparse regression methods. This issue is particularly acute for the Sparse Identification of Nonlinear Dynamics (SINDy) framework, where the core assumption of sparsity is undermined by the interaction between data scaling and measurement noise. The resulting discovered models can be dense, uninterpretable, and physically incorrect. To address this critical vulnerability, we introduce the Sequential Thresholding of Coefficient of Variation (STCV), a novel, computationally efficient sparse regression algorithm that is inherently robust to data scaling. STCV replaces conventional magnitude-based thresholding with a dimensionless statistical metric, the Coefficient Presence (CP), which assesses the statistical validity and consistency of candidate terms in the model library. This shift from magnitude to statistical significance makes the discovery process invariant to arbitrary data scaling. Through comprehensive benchmarking on canonical dynamical systems and practical engineering problems, including a physical mass-spring-damper experiment, we demonstrate that STCV consistently and significantly outperforms standard Sequential Thresholding Least Squares (STLSQ) and Ensemble-SINDy (E-SINDy) on normalised, noisy datasets. The results show that STCV-based methods can successfully identify the correct, sparse physical laws even when other methods fail. By mitigating the distorting effects of normalisation, STCV makes sparse system identification a more reliable and automated tool for real-world applications, thereby enhancing model interpretability and trustworthiness.
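The scaling argument in the abstract can be illustrated with a small numerical sketch. This is not the authors' STCV algorithm or their CP metric, whose definitions are not given on this page. It only shows, under stated assumptions, why a dimensionless ratio is unaffected by rescaling the data while coefficient magnitudes are not. The toy system dx/dt = -0.5x - 2x³, the monomial library, the scale factor `s`, and the helper names `bootstrap_coefficients`, `coefficient_of_variation`, and `library` are all hypothetical choices made for illustration.

```python
import numpy as np

def bootstrap_coefficients(theta, dxdt, n_boot=200, seed=0):
    """Least-squares coefficients refitted on bootstrap resamples of the rows."""
    rng = np.random.default_rng(seed)
    n_samples, n_terms = theta.shape
    coefs = np.empty((n_boot, n_terms))
    for b in range(n_boot):
        rows = rng.integers(0, n_samples, size=n_samples)
        coefs[b] = np.linalg.lstsq(theta[rows], dxdt[rows], rcond=None)[0]
    return coefs

def coefficient_of_variation(coefs):
    """Dimensionless spread of each coefficient: std / |mean| over resamples."""
    return coefs.std(axis=0) / np.abs(coefs.mean(axis=0))

def library(x):
    """Hypothetical candidate library with columns [x, x^2, x^3]."""
    return np.column_stack([x, x**2, x**3])

# Assumed toy system: dx/dt = -0.5 x - 2 x^3, with additive measurement noise.
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=400)
dxdt = -0.5 * x - 2.0 * x**3 + 0.05 * rng.standard_normal(x.size)

# Rescale the state by an assumed factor s, so z = x / s and dz/dt = (dx/dt) / s.
s = 10.0
z, dzdt = x / s, dxdt / s

# The same seed gives the same resampled rows for both fits.
coefs_raw = bootstrap_coefficients(library(x), dxdt)
coefs_scaled = bootstrap_coefficients(library(z), dzdt)

mean_raw, mean_scaled = coefs_raw.mean(axis=0), coefs_scaled.mean(axis=0)
cv_raw = coefficient_of_variation(coefs_raw)
cv_scaled = coefficient_of_variation(coefs_scaled)

print("mean coefficients, raw data:   ", mean_raw)
print("mean coefficients, scaled data:", mean_scaled)
print("coefficient of variation, raw:   ", cv_raw)
print("coefficient of variation, scaled:", cv_scaled)
```

In the scaled variables the cubic coefficient is multiplied by s², so a fixed magnitude threshold treats the same physical term differently before and after rescaling. The coefficient of variation is unchanged because rescaling a library column multiplies every resampled estimate of that coefficient by the same constant, which cancels in the ratio. This cancellation applies to pure rescaling; normalisation that also subtracts a mean mixes the monomial columns and is not covered by this sketch.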
Problem

Research questions and friction points this paper is trying to address.

data normalisation
sparse identification
nonlinear dynamics
SINDy
measurement noise
Innovation

Methods, ideas, or system contributions that make the work stand out.

STCV
SINDy
data-scale invariance
sparse regression
coefficient of variation
Jay Raut
Department of Mechanical and Aeronautical Engineering, University of Pretoria, Pretoria, South Africa
Daniel N. Wilke
Department of Mechanical and Aeronautical Engineering, University of Pretoria, Pretoria, South Africa; School of Mechanical, Industrial and Aeronautical Engineering, University of the Witwatersrand, Johannesburg, South Africa
Stephan Schmidt
University of Pretoria
Vibration-based condition monitoring, Diagnostics and Prognostics