Inference of epidemic networks: the effect of different data types

📅 2025-09-01

🤖 AI Summary

This study investigates how sensitive the inference of infectious disease transmission networks is to the type of observational data available, such as case counts, spatiotemporal locations, and viral genomic sequences. It proposes a framework that combines generative models with probabilistic estimation of transmission trees. The central methodological contribution is a Markov chain Monte Carlo (MCMC) algorithm that samples transmission trees according to their probability, which allows statistical inference of latent quantities including the number of unobserved hosts, infection times, and tree depth. The approach is validated by comparing numerical results with analytically solvable examples and is then applied to COVID-19 data from Australia. The results show that network properties relevant to outbreak management depend sensitively on which data types are used in the inference.

📝 Abstract

We investigate how the properties of epidemic networks change depending on the availability of different types of data on a disease outbreak. To do so, we introduce mathematical and computational methods that estimate the probability of transmission trees by combining generative models that jointly determine the number of infected hosts, the probability of infection between them given location and genetic information, and their times of infection and sampling. We introduce a suitable Markov chain Monte Carlo method and show that it samples trees according to their probability. Statistics computed over the sampled trees yield probabilistic estimates of network properties and other quantities of interest, such as the number of unobserved hosts and the depth of the infection tree. We confirm the validity of our approach by comparing the numerical results with analytically solvable examples. Finally, we apply our methodology to data from COVID-19 in Australia. We find that network properties that are important for the management of the outbreak depend sensitively on the type of data used in the inference.
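As a rough illustration of the tree-sampling idea described in the abstract, the Python sketch below runs a Metropolis-Hastings sampler over transmission trees for a toy outbreak. It is not the authors' algorithm: the exponential edge weight, the fixed and distinct infection times, the single combined distance matrix, and all names (`edge_weight`, `sample_trees`, `times`, `dist`) are assumptions made for the example. Because each host's infector is independent under this toy model, the sampled frequencies can be checked against an exact formula, in the spirit of the comparison with analytically solvable examples.

```python
import math
import random


def edge_weight(distance, scale=1.0):
    """Hypothetical transmission weight: decays with a combined
    spatial/genetic distance between infector and infectee."""
    return math.exp(-distance / scale)


def sample_trees(times, dist, n_steps=20000, burn_in=2000, seed=0):
    """Metropolis-Hastings over parent assignments.

    A tree is a dict {host: infector}. Requiring the infector to have an
    earlier infection time keeps every state a valid (acyclic) tree.
    Assumes all infection times are distinct.
    """
    rng = random.Random(seed)
    n = len(times)
    root = min(range(n), key=lambda i: times[i])
    # Admissible infectors of i: hosts infected strictly before i.
    candidates = {i: [j for j in range(n) if times[j] < times[i]]
                  for i in range(n)}
    non_root = [i for i in range(n) if i != root]
    parent = {i: candidates[i][0] for i in non_root}  # arbitrary start

    samples = []
    for step in range(n_steps):
        i = rng.choice(non_root)
        proposed = rng.choice(candidates[i])  # symmetric proposal
        ratio = (edge_weight(dist[proposed][i])
                 / edge_weight(dist[parent[i]][i]))
        if rng.random() < min(1.0, ratio):
            parent[i] = proposed
        if step >= burn_in:
            samples.append(dict(parent))
    return samples


# Toy data (invented for illustration): four hosts on a line.
times = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(a - b) for b in range(4)] for a in range(4)]

samples = sample_trees(times, dist)

# Fraction of sampled trees in which host 2 infected host 3 ...
empirical = sum(1 for s in samples if s[3] == 2) / len(samples)
# ... versus the exact probability under this toy model.
weights = [edge_weight(dist[j][3]) for j in (0, 1, 2)]
exact = weights[2] / sum(weights)
print(f"empirical={empirical:.3f}  exact={exact:.3f}")
```

The target distribution here is the product of edge weights over the tree, so the acceptance ratio only involves the one edge that changes. A model with unobserved hosts or inferred infection times would need additional proposal moves that this sketch does not attempt.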

Problem

Research questions and friction points this paper is trying to address.

Infer epidemic networks using different data types

Estimate transmission tree probabilities with generative models

Assess network properties' sensitivity to data availability

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combining generative models with location and genetic data

Using Markov chain Monte Carlo to sample transmission trees

Estimating network properties and unobserved hosts probabilistically
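The last point, turning sampled trees into probabilistic estimates, can be sketched in a few lines of Python. The example assumes each sampled tree is stored as a dict mapping a host to its infector, with host `0` as the root; the four trees in `sampled` are invented placeholders, not results from the paper, and the helper names are hypothetical.

```python
from collections import Counter


def tree_depth(parent, root):
    """Length of the longest infection chain starting at the root."""
    def depth_of(node):
        steps = 0
        while node != root:
            node = parent[node]
            steps += 1
        return steps
    return max((depth_of(i) for i in parent), default=0)


def offspring_counts(parent):
    """Number of direct secondary infections caused by each host."""
    return Counter(parent.values())


# Placeholder sample of trees over hosts 0..3 (host 0 is the root).
sampled = [
    {1: 0, 2: 0, 3: 0},  # star: root infects everyone
    {1: 0, 2: 1, 3: 2},  # chain
    {1: 0, 2: 1, 3: 1},  # host 1 infects two others
    {1: 0, 2: 1, 3: 2},  # chain again
]

# Estimated distribution of tree depth across the sample.
depth_counts = Counter(tree_depth(p, root=0) for p in sampled)
depth_prob = {d: c / len(sampled) for d, c in depth_counts.items()}

# Estimated mean number of secondary infections caused by the root.
mean_root_offspring = sum(offspring_counts(p)[0] for p in sampled) / len(sampled)

print(depth_prob, mean_root_offspring)
```

Any other network property can be estimated the same way: compute it per tree, then average or tabulate over the sample.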
