Interpretable model-free inference of parametric variation across time-series data through large-scale feature extraction

📅 2026-06-10

🤖 AI Summary

This work proposes a data-driven approach for inferring the dimensionality and nature of parametric variation in a generative process directly from time-series data, without specifying or fitting a model. It builds a library of over 7,000 interpretable time-series features and applies unsupervised dimensionality reduction to expose low-dimensional structure in feature space, from which it estimates the latent parameter degrees of freedom that underlie differences between individual time series, independent of the specific system equations. On thirteen simulated systems with known parametric variation, spanning linear stochastic processes, nonlinear oscillators, and chaotic dynamics, the approach often reconstructs the ground-truth variation. Applied to locomotor data from 1,143 Drosophila melanogaster, it extracts biologically interpretable components corresponding to sex and circadian rhythmicity.

📝 Abstract

Here we address the problem of estimating the dimensionality and nature of parametric variation in an unknown generative process directly from time-series data, without specifying or fitting a model. In particular, we suppose that inter-instance variation in collections of time series is caused by parametric variation in the generating model. We hypothesize that, given a sufficiently large library of time-series features, low-dimensional parametric variation will manifest as low-dimensional structure in feature space, enabling interpretable estimators of the underlying degrees of freedom to be constructed. We test our hypothesis using a library of over 7,000 diverse and interpretable time-series statistics and thirteen simulated systems with known parametric variation, spanning linear stochastic processes, nonlinear oscillators, and chaotic dynamics. Our unsupervised, data-driven approach often reconstructs the underlying parametric variation across this extensive range of simulated dynamical systems while also yielding interpretable estimators for each underlying dimension. Applied to the movement dynamics of 1,143 fruit flies, the method extracts biologically meaningful components corresponding to sex and circadian rhythmicity. Our results pave the way for much-needed data-driven methods to bridge the gap between interpretable theoretical understanding of dynamics and the large and complex datasets that characterize modern scientific problems.
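The hypothesis in the abstract can be illustrated with a toy example. The sketch below is not the authors' pipeline: it assumes a single AR(1) process whose coefficient `phi` varies across instances, and it replaces the library of over 7,000 features with five hand-written statistics (all function and variable names here are hypothetical). If one-dimensional parametric variation shows up as one-dimensional structure in feature space, the leading principal component of the standardized feature matrix should capture most of the variance and track `phi`.

```python
import numpy as np

def simulate_ar1(phi, n=2000, rng=None):
    """Simulate x[t] = phi * x[t-1] + Gaussian noise (illustrative system)."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return x

def autocorr(x, lag):
    """Sample autocorrelation of x at a positive integer lag."""
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def features(x):
    """A tiny stand-in for a large feature library (assumption, 5 features)."""
    dx = np.diff(x)
    return np.array([
        autocorr(x, 1),
        autocorr(x, 2),
        autocorr(x, 5),
        np.std(dx) / np.std(x),                        # relative roughness
        np.mean(np.sign(dx[:-1]) != np.sign(dx[1:])),  # sign changes in increments
    ])

rng = np.random.default_rng(0)

# One hidden parameter varies across the collection of time series.
phis = rng.uniform(0.1, 0.9, size=200)
F = np.vstack([features(simulate_ar1(p, rng=rng)) for p in phis])

# Standardize each feature, then do PCA via the singular value decomposition.
Z = (F - F.mean(axis=0)) / F.std(axis=0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)   # variance fraction per principal component
pc1 = U[:, 0] * s[0]              # instance scores on the leading component

# Compare the unsupervised component with the known parameter.
r = np.corrcoef(pc1, phis)[0, 1]
print("variance explained by PC1:", round(float(explained[0]), 3))
print("|correlation(PC1, phi)|:", round(abs(float(r)), 3))
print("PC1 loadings per feature:", np.round(Vt[0], 2))
```

The loadings in `Vt[0]` show which features drive the recovered component, which is the sense in which such an estimator stays interpretable. Plain PCA is used here only for brevity; nonlinear or multi-parameter variation would likely need a larger feature set and a more flexible dimensionality-reduction step.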

Problem

Research questions and friction points this paper is trying to address.

parametric variation

time-series data

model-free inference

dimensionality estimation

interpretable inference

Innovation

Methods, ideas, or system contributions that make the work stand out.

model-free inference

time-series features

parametric variation