Inferring DAGs and Phylogenetic Networks from Least Common Ancestors

📅 2025-11-11
🤖 AI Summary
This paper addresses the realizability problem for least common ancestor (LCA) constraint sets: given a collection $R$ of pairwise LCA constraints over a leaf set $X$, determine whether $R$ can be realized by a rooted tree or, more generally, by a directed acyclic graph (DAG) or a phylogenetic network. The authors extend the classical LCA constraint framework of Aho et al. from trees to DAGs by introducing the $+$-closure $R^+$ and an associated canonical DAG $G_R$, and they prove that $R$ is DAG-realizable if and only if $G_R$ realizes it. They adapt the construction to a canonical phylogenetic network $N_R$ and prove that it is regular. They also show that, for any DAG-realizable $R$, the classical closure coincides with the $+$-closure. All constructions are computable in polynomial time, and explicit algorithms are given for each.

📝 Abstract
A least common ancestor (LCA) of two leaves in a directed acyclic graph (DAG) is a vertex that is an ancestor of both leaves and has no proper descendant that is also their common ancestor. LCAs capture hierarchical relationships in rooted trees and, more generally, in DAGs. In 1981, Aho et al. introduced the problem of determining whether a set of pairwise LCA constraints on a set $X$, of the form $(i,j)<(k,l)$ with $i,j,k,l \in X$, can be realized by a rooted tree whose leaf set is $X$, such that whenever $(i,j)<(k,l)$, the LCA of $i,j$ is a descendant of that of $k,l$. They also presented a polynomial-time algorithm, BUILD, to solve this problem. However, many such constraint systems cannot be realized by any tree, prompting the question of whether they can be realized by a more general DAG. We extend Aho et al.'s framework from trees to DAGs, providing both theoretical and algorithmic foundations for reasoning about LCA constraints in this broader setting. Given a collection $R$ of LCA constraints, we define its $+$-closure $R^+$, capturing additional LCA relations implied by $R$. Using $R^+$, we construct a canonical DAG $G_R$ and prove that $R$ is DAG-realizable if and only if it is realized by $G_R$. We further adapt this construction to phylogenetic networks, defining a canonical network $N_R$ and proving that it is regular, i.e., it coincides with the Hasse diagram of its underlying set system. Finally, we show that for any DAG-realizable $R$, its classical closure, comprising all LCA constraints that hold in every DAG realizing $R$, coincides with its $+$-closure. All constructions are computable in polynomial time, and we provide explicit algorithms for each.
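To make the definitions in the abstract concrete, the following minimal Python sketch computes the LCAs of two leaves in a small DAG and checks whether a set of constraints $(i,j)<(k,l)$ holds in it. It is an illustration only, not the paper's algorithm: the function names and the example network are hypothetical, and the sketch assumes that every pair mentioned in a constraint must have a unique LCA and that "descendant" is meant strictly.

```python
def descendants(dag, v):
    """Return every vertex reachable from v, including v itself."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in dag.get(u, ()):  # leaves need not appear as keys
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def lcas(dag, x, y):
    """Return the set of least common ancestors of x and y.

    dag maps each vertex to a list of its children.
    """
    vertices = set(dag) | {w for children in dag.values() for w in children}
    desc = {v: descendants(dag, v) for v in vertices}
    common = {v for v in vertices if x in desc[v] and y in desc[v]}
    # Keep common ancestors that have no proper descendant in `common`.
    return {v for v in common
            if not any(u != v and u in common for u in desc[v])}


def realizes(dag, constraints):
    """Check constraints given as ((i, j), (k, l)), read as (i,j) < (k,l)."""
    for (i, j), (k, l) in constraints:
        low, high = lcas(dag, i, j), lcas(dag, k, l)
        if len(low) != 1 or len(high) != 1:
            return False  # assumption: both LCAs must be unique
        u, v = next(iter(low)), next(iter(high))
        if u == v or u not in descendants(dag, v):
            return False  # assumption: strict descendant required
    return True


# Hypothetical network on leaves x, y, z; vertex h has two parents.
network = {"r": ["p", "q"], "p": ["x", "h"], "q": ["h", "z"], "h": ["y"]}

# These two constraints correspond to the conflicting triples xy|z and yz|x.
constraints = [(("x", "y"), ("x", "z")), (("y", "z"), ("x", "z"))]
print(realizes(network, constraints))
```

The two constraints in the example cannot both hold in a rooted tree, since one groups $x$ with $y$ away from $z$ and the other groups $y$ with $z$ away from $x$. The example network satisfies both because $y$ sits below a vertex with two parents, which illustrates why moving from trees to DAGs enlarges the class of realizable constraint sets.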
Problem

Research questions and friction points this paper is trying to address.

Extending LCA constraint realization from trees to DAGs
Developing canonical DAG construction for LCA constraints
Adapting LCA framework to phylogenetic network structures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends LCA constraints from trees to DAGs
Constructs canonical DAG using closure of constraints
Provides polynomial-time algorithms for network construction
Anna Lindeberg
Department of Mathematics, Faculty of Science, Stockholm University, SE-10691 Stockholm, Sweden
Anton Alfonsson
Department of Mathematics, Faculty of Science, Stockholm University, SE-10691 Stockholm, Sweden
Vincent Moulton
School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, United Kingdom
G. Scholz
Independent Researcher, Leipzig, DE-04229, Germany
Marc Hellmuth
Associate Professor, Stockholm University
discrete mathematics, algorithms, computational biology, biomathematics, data science