🤖 AI Summary
This paper addresses the problem of diverse enumeration of conjunctive query (CQ) answers: given a CQ result set, an integer k, and a diversity threshold d, select k answers such that the pairwise distance between any two is at least d. We formally define the Diverse-CQ decision and construction problems, unifying diversity constraints under graph-based distance measures, including Jaccard and Hamming distances. Methodologically, we develop a systematic framework grounded in database theory and parameterized complexity analysis. We prove fixed-parameter tractability (FPT) of Diverse-CQ under multiple distance metrics and design a polynomial-delay enumeration algorithm. Our contributions include tight computational complexity bounds, necessary and sufficient conditions for tractability, and practical algorithms that avoid exhaustive enumeration. The approach significantly reduces enumeration overhead and provides both theoretical foundations and implementable tools for controllable diversity selection over large-scale CQ answer sets.
📝 Abstract
Enumeration problems aim at outputting, without repetition, the set of solutions to a given problem instance. However, outputting the entire solution set may be prohibitively expensive if the set is too large. In this case, outputting a small, sufficiently diverse subset of the solutions would be preferable. This leads to the Diverse-version of the original enumeration problem, where the goal is to achieve a certain level d of diversity by selecting k solutions. In this paper, we look at the Diverse-version of the query answering problem for Conjunctive Queries and extensions thereof. That is, we study whether it is possible to achieve a certain level d of diversity by selecting k answers to the given query and, if so, how to actually compute such k answers.
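To make the problem statement concrete, the following is a minimal brute-force sketch of the decision and construction task on a tiny, already materialized answer set. It is not the paper's algorithm: it assumes that answers are equal-length tuples, that diversity is measured as the minimum pairwise Hamming distance, and that the names `hamming` and `diverse_subset` and the sample data are purely illustrative. It also enumerates all k-subsets, which is exactly the exhaustive work an efficient method would try to avoid.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length answer tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def diverse_subset(answers, k, d):
    """Return k answers with pairwise Hamming distance >= d, or None.

    Assumption: diversity of a set is its minimum pairwise distance.
    Brute force over all k-subsets; only feasible for tiny inputs.
    """
    for candidate in combinations(answers, k):
        if all(hamming(a, b) >= d for a, b in combinations(candidate, 2)):
            return candidate
    return None


# Hypothetical answer tuples of some query over three output variables.
answers = [(1, "a", "x"), (1, "a", "y"), (2, "b", "y"), (3, "c", "z")]

print(diverse_subset(answers, 3, 2))  # a witness set, if one exists
print(diverse_subset(answers, 3, 4))  # None: tuples have only 3 positions
```

A non-`None` result answers the decision question positively and doubles as the constructed witness; other ways of aggregating pairwise distances (for example, their sum) would only change the check inside the loop.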