Bayesian Inference for Epidemic Final Size Datasets with Hidden Underlying Household Structure

📅 2026-03-26
🤖 AI Summary
Household secondary attack ratio (SAR) estimates depend on the household size distribution observed during a study, which limits how well they generalise to other populations, and low-resolution epidemiological data make transmission parameters hard to infer. To address both problems, the authors propose a Bayesian inference framework that combines a mechanistic transmission model with household size distributions. A Markov chain Monte Carlo (MCMC) algorithm imputes the unreported household structure underlying the epidemic, and the method can be run on data reported at varying levels of resolution. In synthetic data experiments, the estimates of the transmission rate consistently achieved over 95% coverage, including for a pathological household size distribution when strong information about that distribution was supplied. Applied to an existing COVID-19 household transmission dataset, the method shows that stratifying observed SARs by household size substantially reduces the uncertainty in estimates.

📝 Abstract
Households represent a key unit of interest in infectious disease epidemiology, in both empirical studies and mathematical modelling. The within-household transmission potential of a disease is often summarised by a secondary attack ratio (SAR). Despite its widespread use, the SAR depends on the household size distribution (HHSD) seen during the study period, making it difficult to generalise to new contexts. Extending estimates of transmission potential to new populations instead requires estimates of person-to-person transmission rates, which can be combined with data on population structure to parametrise mechanistic transmission models. In this study we present a new Bayesian inference method which uses an MCMC algorithm to infer the transmission intensity by imputing the unreported household structure underlying the epidemic. This method can be run on household epidemiological data reported at varying levels of resolution. For synthetic data from a realistic underlying HHSD, we consistently achieved over 95% coverage in our estimates of the transmission rate. We also consistently achieved over 95% coverage for data generated with a pathological underlying HHSD, given strong information about the HHSD. Using an existing dataset which recorded micro-scale household epidemiological outcomes during the COVID-19 pandemic, we show that stratifying observed SARs by household size substantially reduces the uncertainty in estimates. Our findings suggest that researchers conducting household epidemiological studies can improve the utility of results for infectious disease modellers by reporting household-stratified estimates. These results are intended to encourage the reporting of higher-resolution outputs in epidemiological field work because, in the absence of strong priors, transmission parameters were not easily identifiable from the low-resolution datasets that are often reported.
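To illustrate why a pooled SAR depends on the household size distribution, the following is a minimal sketch and not the authors' model. It assumes a Reed-Frost chain-binomial epidemic within each household, a single index case, and a fixed pairwise escape probability `q`. The function names, the value of `q`, and both example distributions are hypothetical.

```python
from math import comb


def final_size_pmf(n_susceptible, q, n_initial=1):
    """Return [P(j of n_susceptible are ever infected) for j = 0..n_susceptible].

    Assumption: Reed-Frost model, where each susceptible escapes each
    infective independently with probability q.
    """
    full = []  # full[j] = P(all j infected | household with j susceptibles)
    pmf = []
    for n in range(n_susceptible + 1):
        # A given set of j members is infected and the other n - j escape
        # all n_initial + j infectives.
        pmf = [
            comb(n, j) * full[j] * q ** ((n_initial + j) * (n - j))
            for j in range(n)
        ]
        pmf.append(1.0 - sum(pmf))  # remaining mass: everyone infected
        full.append(pmf[n])
    return pmf


def household_sar(size, q):
    """Expected secondary cases per susceptible in a household of `size`."""
    pmf = final_size_pmf(size - 1, q)
    mean_cases = sum(j * p for j, p in enumerate(pmf))
    return mean_cases / (size - 1)


def pooled_sar(hhsd, q):
    """Pooled SAR: total expected secondary cases over total susceptibles.

    hhsd maps household size (>= 2) to the proportion of studied households
    (those containing an index case) of that size.
    """
    cases = sum(w * household_sar(m, q) * (m - 1) for m, w in hhsd.items())
    contacts = sum(w * (m - 1) for m, w in hhsd.items())
    return cases / contacts


q = 0.8  # hypothetical pairwise escape probability
small_households = {2: 0.7, 3: 0.2, 4: 0.1}          # hypothetical HHSD
large_households = {2: 0.1, 3: 0.2, 4: 0.3, 5: 0.4}  # hypothetical HHSD

sar_small = pooled_sar(small_households, q)
sar_large = pooled_sar(large_households, q)
print(f"pooled SAR, small households: {sar_small:.3f}")
print(f"pooled SAR, large households: {sar_large:.3f}")
```

Under these assumptions the per-household SAR rises with household size for a fixed `q`, so the same person-to-person transmission probability gives a higher pooled SAR in the population with larger households. This is the sense in which an unstratified SAR is tied to the HHSD of the study that produced it.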
Problem

Research questions and friction points this paper is trying to address.

household structure
secondary attack ratio
transmission rate
epidemic final size
data resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian inference
household structure imputation
transmission rate estimation
secondary attack ratio
MCMC algorithm
Joseph Brooks
Department of Mathematics, University of Manchester, Manchester, United Kingdom
Thomas House
Department of Mathematics, University of Manchester
Mathematics, Statistics, Health
Lorenzo Pellis
Sir Henry Dale Fellow, The University of Manchester
Epidemic modelling
Joe Hilton
Department of Mathematics, University of Manchester, Manchester, United Kingdom; Manchester Centre for Health Economics, University of Manchester, Manchester, United Kingdom