🤖 AI Summary
Predicting the human burden of vector-borne diseases is difficult because surveillance data are limited and transmission involves nonlinear dynamics and delayed (memory) effects. This study proposes a framework that extends sparse identification of nonlinear dynamics (SINDy) to dynamical systems with distributed memory and applies it to vector-borne disease modeling. Using only time series of human incidence and local temperature, the method discovers sparse, interpretable, integral-form representations of transmission directly from data. In a case study of severe fever with thrombocytopenia syndrome (SFTS), predictive performance improves substantially when the data-driven model is coupled with mechanistic representations of tick-host transmission pathways. The framework also supports systematic sensitivity analysis of memory kernels and behavioral parameters, giving public health authorities a scalable, interpretable forecasting tool for data-limited settings.
📝 Abstract
Predicting the human burden of vector-borne diseases from limited surveillance data remains a major challenge, particularly in the presence of nonlinear transmission dynamics and delayed effects arising from vector ecology and human behavior. We develop a data-driven framework based on an extension of Sparse Identification of Nonlinear Dynamics (SINDy) to systems with distributed memory, enabling discovery of transmission mechanisms directly from time series data. Using severe fever with thrombocytopenia syndrome (SFTS) as a case study, we show that this approach can uncover key features of tick-borne disease dynamics using only human incidence and local temperature data, without imposing predefined assumptions on human case reporting. We further demonstrate that predictive performance is substantially enhanced when the data-driven model is coupled with mechanistic representations of tick-host transmission pathways informed by empirical studies. The framework supports systematic sensitivity analysis of memory kernels and behavioral parameters, identifying those most influential for prediction accuracy. Although the approach prioritizes predictive accuracy over mechanistic transparency, it yields sparse, interpretable integral representations suitable for epidemiological forecasting. This hybrid methodology provides a scalable strategy for forecasting vector-borne disease risk and informing public health decision-making under data limitations.
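To make the idea of sparse identification with distributed memory concrete, here is a minimal sketch. It is not the authors' implementation. It assumes exponential memory kernels, a noise-free synthetic temperature and incidence series, and plain sequentially thresholded least squares as the sparse regression step. The helper names (`exp_memory`, `stlsq`), the candidate time scales, and all coefficients are hypothetical choices made for illustration only.

```python
import numpy as np

def exp_memory(signal, tau, dt):
    """Causal convolution of `signal` with the kernel exp(-s/tau)/tau.

    Assumption: an exponential kernel stands in for whatever memory
    kernels the paper actually uses.
    """
    n = len(signal)
    kernel = np.exp(-np.arange(n) * dt / tau) / tau
    return np.convolve(signal, kernel)[:n] * dt

def stlsq(theta, target, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares, a common SINDy optimizer."""
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold
        xi[~keep] = 0.0                      # prune small coefficients
        if not keep.any():
            break
        # refit using only the surviving library terms
        xi[keep] = np.linalg.lstsq(theta[:, keep], target, rcond=None)[0]
    return xi

# Synthetic temperature series (hypothetical, not real surveillance data).
rng = np.random.default_rng(0)
dt, n = 0.1, 2000
t = np.arange(n) * dt
temp = (1.0 + 0.5 * np.sin(2 * np.pi * t / 50)
        + 0.3 * np.sin(2 * np.pi * t / 7)
        + 0.3 * rng.standard_normal(n))

# Candidate memory features: temperature filtered at several time scales.
taus = [1.0, 5.0, 20.0]
memory = {tau: exp_memory(temp, tau, dt) for tau in taus}

# Hypothetical ground truth: dI/dt = 0.8 * M_5(t) - 0.5 * I(t),
# integrated with forward Euler.
incidence = np.zeros(n)
for k in range(n - 1):
    incidence[k + 1] = incidence[k] + dt * (0.8 * memory[5.0][k] - 0.5 * incidence[k])

# Candidate library: constant, incidence, instantaneous temperature,
# and the memory terms. The last row is dropped to match the derivative.
names = ["1", "I", "T"] + [f"M_tau={tau}" for tau in taus]
theta = np.column_stack(
    [np.ones(n), incidence, temp] + [memory[tau] for tau in taus]
)[:-1]
dI_dt = np.diff(incidence) / dt              # forward-difference derivative

xi = stlsq(theta, dI_dt)
for name, coef in zip(names, xi):
    if coef != 0.0:
        print(f"{name}: {coef:+.3f}")
```

In this toy setting the regression should retain only the incidence term and the memory term with the matching time scale. Real surveillance data are noisy and sparsely sampled, so derivative estimation, kernel choice, and threshold selection would need far more care than this sketch shows.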