🤖 AI Summary
This paper addresses the realizability problem for least common ancestor (LCA) constraint sets: given a collection of pairwise LCA constraints over a leaf set $X$, determine whether they can be realized by a rooted tree, a directed acyclic graph (DAG), or a regular phylogenetic network. We generalize Aho et al.’s classical LCA constraint framework from trees to DAGs and phylogenetic networks, introducing the *+-closure* $R^+$ of a constraint set $R$ and an associated canonical DAG $G_R$. We prove that $R$ is DAG-realizable if and only if it is realized by $G_R$, and that for any DAG-realizable $R$ the classical closure (all LCA constraints that hold in every DAG realizing $R$) coincides with the +-closure. We give polynomial-time algorithms for (i) deciding DAG realizability of an arbitrary LCA constraint set, (ii) constructing the canonical DAG $G_R$, and (iii) constructing a canonical phylogenetic network $N_R$, which is shown to be regular. These results extend the tree-based theory of LCA constraints to non-treelike evolutionary relationships.
📝 Abstract
A least common ancestor (LCA) of two leaves in a directed acyclic graph (DAG) is a vertex that is an ancestor of both leaves and has no proper descendant that is also their common ancestor. LCAs capture hierarchical relationships in rooted trees and, more generally, in DAGs. In 1981, Aho et al. introduced the problem of determining whether a set of pairwise LCA constraints on a set $X$, of the form $(i,j)<(k,l)$ with $i,j,k,l\in X$, can be realized by a rooted tree whose leaf set is $X$, such that whenever $(i,j)<(k,l)$, the LCA of $i,j$ is a descendant of that of $k,l$. They also presented a polynomial-time algorithm, BUILD, to solve this problem. However, many such constraint systems cannot be realized by any tree, prompting the question of whether they can be realized by a more general DAG. We extend Aho et al.'s framework from trees to DAGs, providing both theoretical and algorithmic foundations for reasoning about LCA constraints in this broader setting. Given a collection $R$ of LCA constraints, we define its $+$-closure $R^+$, capturing additional LCA relations implied by $R$. Using $R^+$, we construct a canonical DAG $G_R$ and prove that $R$ is DAG-realizable if and only if it is realized by $G_R$. We further adapt this construction to phylogenetic networks, defining a canonical network $N_R$ and proving that it is regular, i.e., it coincides with the Hasse diagram of its underlying set system. Finally, we show that for any DAG-realizable $R$, its classical closure (comprising all LCA constraints that hold in every DAG realizing $R$) coincides with its $+$-closure. All constructions are computable in polynomial time, and we provide explicit algorithms for each.
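To make the LCA definition and the meaning of a constraint $(i,j)<(k,l)$ concrete, here is a minimal Python sketch. It is an illustration, not the paper's algorithm: it does not compute $R^+$, $G_R$, or $N_R$. The function names (`descendants`, `lcas`, `satisfies`) and the child-list encoding of the DAG are hypothetical. The check assumes that each pair must have a unique LCA and that "descendant" means proper descendant; the paper's formal definition of realization may differ on both points.

```python
def descendants(dag, v):
    """Return all vertices reachable from v, including v itself."""
    seen = {v}
    stack = [v]
    while stack:
        u = stack.pop()
        for w in dag.get(u, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def lcas(dag, x, y):
    """Return the set of least common ancestors of x and y.

    A vertex is an LCA if it is an ancestor of both x and y and no
    proper descendant of it is also a common ancestor.
    """
    vertices = set(dag) | {w for children in dag.values() for w in children}
    desc = {v: descendants(dag, v) for v in vertices}
    common = {v for v in vertices if x in desc[v] and y in desc[v]}
    return {v for v in common
            if not any(u != v and u in desc[v] for u in common)}


def satisfies(dag, i, j, k, l):
    """Check the constraint (i,j) < (k,l) on this DAG.

    Assumption (not taken from the paper): both pairs need a unique LCA,
    and lca(i,j) must be a proper descendant of lca(k,l).
    """
    a, b = lcas(dag, i, j), lcas(dag, k, l)
    if len(a) != 1 or len(b) != 1:
        return False
    (u,), (v,) = a, b
    return u != v and u in descendants(dag, v)


# Hypothetical example: a DAG given as vertex -> list of children.
# Leaf "b" has two parents, so this graph is not a tree.
example = {"r": ["u", "v"], "u": ["a", "b"], "v": ["b", "c"]}

print(lcas(example, "a", "b"))                  # LCA of a and b
print(satisfies(example, "a", "b", "a", "c"))   # is (a,b) < (a,c) satisfied?
```

In a DAG a pair of leaves can have several LCAs (for instance, when two incomparable vertices are both parents of the same two leaves), which is why `lcas` returns a set rather than a single vertex.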