🤖 AI Summary
Limited genomic sequencing coverage of pathogens constrains spatiotemporal transmission modeling. Method: We propose a time-aware probabilistic framework that infers genetic distances between unsequenced cases and sequenced samples without sequence alignment or known transmission chains. The method integrates sampling-time information with a molecular clock model, enabling Bayesian inference of evolutionary distances and incorporating an uncertainty-aware missing-data imputation mechanism. Contribution/Results: To our knowledge, this is the first approach to embed evolutionary divergence patterns directly into spatiotemporal modeling, providing probabilistic imputation of genetic distances. Applied to highly pathogenic avian influenza A/H5 cases in wild birds in the United States, the method supports scalable, uncertainty-aware augmentation of genomic datasets and enhances the integration of evolutionary signals into spatiotemporal models.
📝 Abstract
Pathogen genome data offers valuable structure for spatial models, but its utility is limited by incomplete sequencing coverage. We propose a probabilistic framework for inferring genetic distances between unsequenced cases and known sequences within defined transmission chains, using time-aware evolutionary distance modeling. The method estimates pairwise divergence from collection dates and observed genetic distances, enabling biologically plausible imputation grounded in observed divergence patterns, without requiring sequence alignment or known transmission chains. Applied to highly pathogenic avian influenza A/H5 cases in wild birds in the United States, this approach supports scalable, uncertainty-aware augmentation of genomic datasets and enhances the integration of evolutionary information into spatiotemporal modeling workflows.
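The idea of estimating pairwise divergence from collection dates and observed genetic distances can be illustrated with a minimal sketch. This is not the authors' model: it assumes a strict linear clock (distance proportional to the time gap between collection dates) with Gaussian noise, fitted by least squares through the origin. The function names `fit_clock` and `impute_distance` and the toy numbers are hypothetical, chosen only for illustration.

```python
import math
import random

def fit_clock(pairs):
    """Fit distance = rate * dt through the origin by least squares.

    pairs: iterable of (dt_days, distance) for sequenced pairs, where
    dt_days is the absolute gap between collection dates.
    Returns (rate, residual_sd). Assumes a strict linear clock.
    """
    pairs = [(abs(dt), d) for dt, d in pairs]
    sxx = sum(dt * dt for dt, _ in pairs)
    if len(pairs) < 2 or sxx == 0:
        raise ValueError("need at least two pairs with a nonzero time gap")
    rate = sum(dt * d for dt, d in pairs) / sxx
    # Residual spread around the fitted line (n - 1 degrees of freedom).
    ss = sum((d - rate * dt) ** 2 for dt, d in pairs)
    return rate, math.sqrt(ss / (len(pairs) - 1))

def impute_distance(dt_days, rate, sd, n_draws=1000, seed=0):
    """Draw plausible distances for an unsequenced case.

    Samples from a Gaussian centred on rate * dt and truncates at zero,
    since a genetic distance cannot be negative.
    """
    rng = random.Random(seed)
    mean = rate * abs(dt_days)
    return [max(0.0, rng.gauss(mean, sd)) for _ in range(n_draws)]

# Toy values (made up): (days between collection dates, observed distance).
observed = [(10, 0.0021), (30, 0.0058), (60, 0.0125), (90, 0.0179)]
rate, sd = fit_clock(observed)

# Unsequenced case collected 45 days from a sequenced sample.
draws = impute_distance(45, rate, sd)
```

The list `draws` stands in for the uncertainty attached to one imputed distance; a downstream spatial model could use its mean, its quantiles, or the draws themselves. A fuller treatment would replace the Gaussian noise and strict clock with whatever divergence model the data support.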