🤖 AI Summary
This study addresses inference for stochastic compartmental models of epidemics when the marginal likelihood becomes intractable because surveillance data are only partially observed. Within the dynamical survival analysis (DSA) framework, the authors derive a closed-form likelihood for discretely observed incidence count data. The approach approximates the stochastic population-level hazard by its large-population limit while retaining a count-valued stochastic model, which keeps inference computationally efficient and flexible to model generalizations. It carries over to models with individual heterogeneity, such as frailty models. In simulations, parameter estimates are competitive with recent exact but computationally expensive likelihood-based methods in partially observed settings. Case studies on Ebola and COVID-19 data apply variants of the model, including a network-based epidemic model and a model with distributions over susceptibility.
📝 Abstract
Stochastic compartmental models are prevalent tools for describing disease spread, but inference under these models is challenging for many types of surveillance data when the marginal likelihood function becomes intractable due to missing information. To address this, we develop a closed-form likelihood for discretely observed incidence count data under the dynamical survival analysis (DSA) paradigm. The method approximates the stochastic population-level hazard by a large population limit while retaining a count-valued stochastic model, and leads to survival analytic inferential strategies that are both computationally efficient and flexible to model generalizations. Through simulation, we show that parameter estimation is competitive with recent exact but computationally expensive likelihood-based methods in partially observed settings. Previous work has shown that the DSA approximation is generalizable, and we show that the inferential developments here also carry over to models featuring individual heterogeneity, such as frailty models. We consider case studies of both Ebola and COVID-19 data on variants of the model, including a network-based epidemic model and a model with distributions over susceptibility, demonstrating its flexibility and practical utility on real, partially observed datasets.
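To make the idea of a survival-analytic likelihood for incidence counts concrete, here is a minimal sketch. It is not the paper's derivation. It assumes a standard mass-action SIR model, treats the large-population susceptible curve S(t) (with S(0) = 1) as the survival function of an individual's infection time, and assumes that counts of new infections in successive observation intervals are multinomial with cell probabilities S(t_{k-1}) - S(t_k). The function names `survival_curve` and `log_likelihood`, and the parameters `beta`, `gamma`, `rho` (initial infected fraction relative to susceptibles), and `n` (number initially susceptible), are hypothetical choices made for illustration.

```python
import math


def _deriv(s, i, beta, gamma):
    # Mass-action SIR limit in proportions (assumed model):
    # s' = -beta*s*i,  i' = beta*s*i - gamma*i
    return -beta * s * i, beta * s * i - gamma * i


def _rk4_step(s, i, h, beta, gamma):
    # One classical fourth-order Runge-Kutta step of size h.
    k1s, k1i = _deriv(s, i, beta, gamma)
    k2s, k2i = _deriv(s + 0.5 * h * k1s, i + 0.5 * h * k1i, beta, gamma)
    k3s, k3i = _deriv(s + 0.5 * h * k2s, i + 0.5 * h * k2i, beta, gamma)
    k4s, k4i = _deriv(s + h * k3s, i + h * k3i, beta, gamma)
    s_new = s + h * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
    i_new = i + h * (k1i + 2 * k2i + 2 * k3i + k4i) / 6
    return s_new, i_new


def survival_curve(beta, gamma, rho, times, dt=0.01):
    """Return S(t) at each (increasing, positive) time, with S(0)=1, I(0)=rho."""
    s, i, t = 1.0, rho, 0.0
    out = []
    for target in times:
        while t < target - 1e-12:
            h = min(dt, target - t)
            s, i = _rk4_step(s, i, h, beta, gamma)
            t += h
        out.append(s)
    return out


def log_likelihood(counts, times, n, beta, gamma, rho):
    """Multinomial log-likelihood of interval incidence counts (assumed form).

    counts[k] is the number of new infections in (times[k-1], times[k]],
    with times[-1] for k = 0 taken to be 0. The n - sum(counts) individuals
    not yet infected by the last time contribute the survival term S(T).
    """
    rest = n - sum(counts)
    if rest < 0:
        raise ValueError("counts exceed the number of initial susceptibles")
    surv = [1.0] + survival_curve(beta, gamma, rho, times)
    ll = math.lgamma(n + 1)  # log n!
    for k, y in enumerate(counts):
        p = surv[k] - surv[k + 1]  # P(infected in interval k)
        ll += y * math.log(p) - math.lgamma(y + 1)
    ll += rest * math.log(surv[-1]) - math.lgamma(rest + 1)
    return ll
```

Under these assumptions, each likelihood evaluation costs one ODE solve, so `log_likelihood` can be passed to any generic optimizer or sampler over `beta`, `gamma`, and `rho`. A frailty or network variant would change only the ODE system inside `survival_curve`, which is the kind of flexibility the abstract describes.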