🤖 AI Summary
This work addresses the challenge of efficiently approximating posterior distributions in Bayesian inference of ultrametric phylogenetic trees. The authors propose a differentiable variational inference method that, unlike existing methods for ultrametric trees, performs inference over all of tree space. The core contribution is a novel variational family based on the coalescent times of a single-linkage clustering, for which a closed-form density over trees is derived. The method requires no Markov chain Monte Carlo (MCMC) subroutines, so the variational objective can be optimized directly with gradients. It combines coalescent population genetic models, a bipartite product structure over tree partitions, and gradient-based optimization. On benchmark genomic datasets and a SARS-CoV-2 application, it achieves accuracy competitive with existing state-of-the-art techniques while reducing gradient evaluations by approximately 50%.
📝 Abstract
Bayesian phylogenetics requires accurate and efficient approximation of posterior distributions over trees. In this work, we develop a variational Bayesian approach for ultrametric phylogenetic trees. We present a novel variational family based on coalescent times of a single-linkage clustering and derive a closed-form density of the resulting distribution over trees. Unlike existing methods for ultrametric trees, our method performs inference over all of tree space, it does not require any Markov chain Monte Carlo subroutines, and our variational family is differentiable. Through experiments on benchmark genomic datasets and an application to SARS-CoV-2, we demonstrate that our method achieves competitive accuracy while requiring significantly fewer gradient evaluations than existing state-of-the-art techniques.
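To make the central construction concrete, the following is a minimal sketch of how a matrix of pairwise times can be turned into an ultrametric tree by single-linkage clustering. It is not the authors' implementation: the function names, the naive cubic-time clustering loop, and the i.i.d. log-normal draw in `sample_pairwise_times` are all assumptions made for illustration, and the closed-form density from the paper is not reproduced here.

```python
import itertools
import random


def single_linkage(times):
    """Naive single-linkage clustering on a symmetric matrix of pairwise times.

    Returns a list of merges (cluster_a, cluster_b, height) in merge order.
    """
    clusters = [frozenset([i]) for i in range(len(times))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in itertools.combinations(clusters, 2):
            # Single linkage: cluster distance is the minimum over cross pairs.
            d = min(times[i][j] for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        height, a, b = best
        merges.append((a, b, height))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges


def coalescent_times(merges, n):
    """Pairwise coalescent times implied by the tree.

    The time for taxa i and j is the height of the merge that first joins them.
    """
    t = [[0.0] * n for _ in range(n)]
    for a, b, height in merges:
        for i in a:
            for j in b:
                t[i][j] = t[j][i] = height
    return t


def is_ultrametric(t, tol=1e-12):
    """Check the three-point condition t[i][j] <= max(t[i][k], t[k][j])."""
    n = len(t)
    return all(
        t[i][j] <= max(t[i][k], t[k][j]) + tol
        for i, j, k in itertools.permutations(range(n), 3)
    )


def sample_pairwise_times(n, rng):
    """Hypothetical stand-in for a variational draw: i.i.d. log-normal times."""
    m = [[0.0] * n for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        m[i][j] = m[j][i] = rng.lognormvariate(0.0, 0.5)
    return m


# Small deterministic example with three taxa.
toy = [[0.0, 1.0, 3.0],
       [1.0, 0.0, 2.0],
       [3.0, 2.0, 0.0]]
toy_merges = single_linkage(toy)
toy_tree = coalescent_times(toy_merges, 3)

# Random example with five taxa.
rng = random.Random(0)
raw = sample_pairwise_times(5, rng)
merges = single_linkage(raw)
tree_times = coalescent_times(merges, 5)
```

In the toy example, taxa 0 and 1 merge at height 1, and the resulting cluster joins taxon 2 at height 2 (the smaller of the two cross-pair times), so the raw entry of 3 between taxa 0 and 2 is replaced by 2 in the tree. Any symmetric matrix of positive times maps to some ultrametric tree in this way, which gives an intuition for how a distribution over such matrices can induce a distribution over all of ultrametric tree space.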