Extremal Fitting Problems for Conjunctive Queries

📅 2022-06-10
🏛️ ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems
📈 Citations: 14
Influential: 1
🤖 AI Summary
This paper investigates the fitting problem for conjunctive queries (CQs) and two variants, tree CQs and unions of CQs (UCQs): given data examples labeled as positive or negative, construct a query that classifies them correctly. Because a fitting query is in general not unique, the paper defines and characterizes three extremal kinds of fitting query: most-general, most-specific, and unique. It relates these extremal fittings to homomorphism dualities, frontiers, and direct products, treating CQs, tree CQs, and UCQs in a common framework. It characterizes when each extremal fitting exists, pinpoints the complexity of the associated existence and verification problems, and determines the size of fitting CQs.
📝 Abstract
The fitting problem for conjunctive queries (CQs) is the problem of constructing a CQ that fits a given set of labeled data examples. When a fitting CQ exists, it is in general not unique. This leads us to propose natural refinements of the notion of a fitting CQ, such as most-general fitting CQ, most-specific fitting CQ, and unique fitting CQ. We give structural characterizations of these notions in terms of (suitable refinements of) homomorphism dualities, frontiers, and direct products, which enable the construction of the refined fitting CQs when they exist. We also pinpoint the complexity of the associated existence and verification problems, and determine the size of fitting CQs. We study the same problems for UCQs and for the more restricted class of tree CQs.
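To make the notion of "fitting" concrete, here is a minimal sketch in Python. It assumes Boolean CQs (no answer variables), represents both queries and data examples as sets of facts, and uses the standard fact that a Boolean CQ is true in an instance exactly when its canonical instance maps homomorphically into it. The function names `has_homomorphism` and `fits` and the toy examples are illustrative assumptions, not taken from the paper, and the brute-force search is only practical for very small inputs.

```python
from itertools import product

def has_homomorphism(src, dst):
    """Brute-force test for a homomorphism from instance src to instance dst.

    An instance is a set of facts (relation_name, arg1, ..., argn).
    Exponential in the number of values of src; for illustration only.
    """
    src_vals = sorted({a for f in src for a in f[1:]}, key=repr)
    dst_vals = sorted({a for f in dst for a in f[1:]}, key=repr)
    for image in product(dst_vals, repeat=len(src_vals)):
        h = dict(zip(src_vals, image))
        # Every fact of src must be mapped onto a fact of dst.
        if all((f[0], *(h[a] for a in f[1:])) in dst for f in src):
            return True
    return False

def fits(query, positives, negatives):
    """A Boolean CQ fits if it holds in every positive and in no negative example."""
    return (all(has_homomorphism(query, e) for e in positives)
            and not any(has_homomorphism(query, e) for e in negatives))

# Hypothetical toy data over one binary relation E.
triangle_query = {("E", "x", "y"), ("E", "y", "z"), ("E", "z", "x")}
edge_query = {("E", "x", "y")}
positive = {("E", "a", "b"), ("E", "b", "c"), ("E", "c", "a")}  # directed 3-cycle
negative = {("E", "u", "v"), ("E", "v", "u")}                   # directed 2-cycle

print(fits(triangle_query, [positive], [negative]))
print(fits(edge_query, [positive], [negative]))
```

The triangle query fits these examples, while the single-edge query does not, because it also holds in the negative example.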
Problem

Research questions and friction points this paper is trying to address.

Constructing conjunctive queries that fit labeled data examples
Refining fitting queries using homomorphism dualities and frontiers
Analyzing complexity and size for UCQs and tree CQs
Innovation

Methods, ideas, or system contributions that make the work stand out.

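Refining fitting CQs into most-general, most-specific, and unique fitting CQs
Characterizing these refinements through homomorphism dualities, frontiers, and direct products
Pinpointing the complexity of existence and verification, and the size of fitting CQs, also for UCQs and tree CQs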
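The role of direct products can be illustrated with a small Python sketch. It relies on the classical observation, assumed here for Boolean CQs only, that a fitting CQ exists exactly when the direct product of the positive examples has no homomorphism into any negative example, and that the product then serves as the canonical instance of a most-specific fitting CQ. All names (`direct_product`, `most_specific_candidate`) and the toy data are hypothetical; this is not the paper's algorithm, and the paper's refined characterizations are more involved.

```python
from itertools import product

def has_homomorphism(src, dst):
    """Brute-force homomorphism test between sets of facts (name, args...)."""
    src_vals = sorted({a for f in src for a in f[1:]}, key=repr)
    dst_vals = sorted({a for f in dst for a in f[1:]}, key=repr)
    for image in product(dst_vals, repeat=len(src_vals)):
        h = dict(zip(src_vals, image))
        if all((f[0], *(h[a] for a in f[1:])) in dst for f in src):
            return True
    return False

def direct_product(a, b):
    """Facts R((a1,b1),...,(an,bn)) for each R(a1..an) in a and R(b1..bn) in b."""
    return {(f[0], *zip(f[1:], g[1:]))
            for f in a for g in b
            if f[0] == g[0] and len(f) == len(g)}

def most_specific_candidate(positives, negatives):
    """Return the product of the positives if it avoids all negatives, else None.

    Assumes at least one positive example and Boolean CQs.
    """
    prod = positives[0]
    for example in positives[1:]:
        prod = direct_product(prod, example)
    if any(has_homomorphism(prod, n) for n in negatives):
        return None  # no Boolean CQ fits these examples
    return prod

# Hypothetical toy data: a directed 2-cycle and a directed 3-cycle as positives.
c2 = {("E", "a", "b"), ("E", "b", "a")}
c3 = {("E", "p", "q"), ("E", "q", "r"), ("E", "r", "p")}
single_edge = {("E", "u", "v")}
self_loop = {("E", "w", "w")}

candidate = most_specific_candidate([c2, c3], [single_edge])
blocked = most_specific_candidate([c2, c3], [self_loop])
print(candidate is not None, blocked is None)
```

Here the product of the two cycles is a directed 6-cycle, which does not map into the single edge, so it is returned as the candidate. Every instance maps into the self-loop, so with that negative example no fitting query exists.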
B. T. Cate
ILLC, University of Amsterdam, The Netherlands
V. Dalmau
Universitat Pompeu Fabra, Spain
Maurice Funk
Leipzig University
Description Logics, Knowledge Representation, Learning, Logic in Computer Science
C. Lutz
Universität Leipzig and ScaDS.AI Center Dresden/Leipzig, Germany