🤖 AI Summary
Existing methods lack a unified framework for measuring distances between unlabeled, time-stamped phylogenetic networks, that is, ancestral histories shaped by reticulate events such as hybridization or recombination. This work proposes a family of distance metrics built on a bijective triangular matrix representation of such networks, bringing matrix norms such as the Frobenius norm to phylogenetic network comparison. The approach captures topological structure, temporal information, and reticulation events in a single representation, and it accommodates both contemporaneous (isochronous) and heterochronous sampling as well as networks with differing numbers of hybridizations. Evaluated on simulated data and on empirical posterior distributions of viral phylogenetic networks, the metrics capture biologically meaningful differences among evolutionary histories, enabling efficient quantitative comparisons and addressing a methodological gap in the analysis of ranked, unlabeled networks, including ancestral recombination graphs.
📝 Abstract
Phylogenetic networks are graphs inferred from molecular sequence data that represent ancestral histories shaped by reticulate processes such as recombination, hybridization, and horizontal gene transfer. We introduce a family of distance metrics for rooted, ranked, unlabeled phylogenetic networks, extending a previously developed distance for ranked trees. Our approach relies on a bijective triangular matrix representation of phylogenetic networks that captures the temporal order of internal events, speciations, and hybridizations. Our metrics, defined as standard matrix norms, allow efficient quantitative comparisons of network topologies, timed networks, and networks with differing numbers of hybridizations. The metrics apply both to isochronous networks, in which all tips are sampled at a single time point, and to heterochronous networks, in which tips may be sampled at different time points. We show that our metrics capture biologically meaningful differences among evolutionary histories in both simulations and empirical posterior distributions of viral phylogenetic networks. These tools fill a methodological gap, enabling principled comparisons of ranked, unlabeled phylogenetic networks, including ancestral recombination graphs.
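To make the idea of a distance "defined as a standard matrix norm" concrete, the following is a minimal sketch and not the authors' implementation. It assumes two networks have already been encoded as lower-triangular integer matrices of the same size; the encoding itself is not specified in this abstract, so the matrices `m1` and `m2` below are invented purely for illustration, and the function names `frobenius_distance` and `l1_distance` are hypothetical. The sketch does not address how networks with differing numbers of hybridizations (and hence possibly different matrix sizes) would be aligned.

```python
import math


def _check_same_shape(a, b):
    # Both inputs are lists of rows; reject mismatched dimensions.
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must have the same shape")


def frobenius_distance(a, b):
    """Frobenius norm of the entrywise difference A - B."""
    _check_same_shape(a, b)
    return math.sqrt(
        sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    )


def l1_distance(a, b):
    """Entrywise L1 norm (sum of absolute differences) of A - B."""
    _check_same_shape(a, b)
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical lower-triangular matrices standing in for two encoded networks.
# These values are made up and need not correspond to any valid network.
m1 = [
    [2, 0, 0],
    [1, 3, 0],
    [0, 2, 4],
]
m2 = [
    [2, 0, 0],
    [1, 2, 0],
    [1, 1, 4],
]

d_l1 = l1_distance(m1, m2)          # three entries differ, each by 1
d_fro = frobenius_distance(m1, m2)  # square root of the sum of squared differences
print(d_l1, d_fro)
```

Because each distance is a norm of the difference between two matrices, it is symmetric and zero for identical matrices; combined with a bijective encoding, as the abstract describes, distinct networks would map to distinct matrices and therefore to a nonzero distance.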