Bespoke-Card: Why Tune When You Can Generate? Synthesizing Workload-Specific Cardinality Estimators

📅 2026-06-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Generic cardinality estimators rely on generalized statistics, so they often incur substantial errors even when the schema and workload are known, which leads query optimizers to choose suboptimal execution plans. To address this, the paper introduces Bespoke-Card, the first agent-based system that combines code generation with feedback-driven curriculum learning to automatically synthesize high-accuracy, workload-specific cardinality estimators. A three-stage pipeline of planning, coding, and validation produces the estimators, whose estimates are then injected into the query optimizer without manual tuning. Using a multi-agent architecture, structured q-error feedback, identification of outlier subplans, and a staged curriculum, the approach reduces PostgreSQL's total runtime by 33% and median q-error by 41% on the JOB benchmark, and it generates an effective estimator in under an hour at a cost below $10. The authors position this as a new direction for cardinality estimation alongside classical generic estimators and learned estimators.
📝 Abstract
Cardinality estimators are built to support arbitrary schemas and workloads, forcing them to rely on generic statistics even when the schema and workload are known in advance, leaving optimizers prone to large errors and poor plans. We present Bespoke-Card, an agent-driven system that synthesizes workload-specific cardinality estimators as executable code: a planning agent designs the estimator strategies, a coding agent implements them, and a validator scores the estimates against true cardinalities and PostgreSQL estimates, forming a robust and deterministic harness. Going beyond naive prompting, Bespoke-Card uses structured q-error feedback, regression analysis, concrete outlier subplans, a curriculum isolating join-only, filter-only, and full-subplan errors, and archival selection of the best implementation. Injecting its estimates into the optimizer cuts total PostgreSQL runtime on JOB by 33% and reduces median q-error over all JOB subplans by 41%, while synthesizing a strong estimator in under one hour for less than $10. Bespoke-Card opens a new avenue for cardinality estimation next to classical generic estimators and learned estimator architectures.
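The q-error scoring that the validator relies on can be illustrated with a minimal sketch. It assumes the standard definition of q-error (the larger of estimate/truth and truth/estimate) and a common convention of clamping both values to at least one row. The function names, the report layout, and the cardinalities below are hypothetical and made up for illustration; the abstract does not specify the actual feedback format.

```python
from statistics import median


def q_error(estimate: float, truth: float) -> float:
    """Factor by which an estimate deviates from the true cardinality (>= 1)."""
    # Clamp to one row to avoid division by zero (an assumed convention).
    est = max(estimate, 1.0)
    tru = max(truth, 1.0)
    return max(est / tru, tru / est)


def summarize(subplans, top_k=2):
    """Return the median q-error and the worst-estimated subplans.

    `subplans` is a list of (name, estimated_rows, true_rows) tuples.
    """
    scored = sorted(
        ((q_error(est, truth), name) for name, est, truth in subplans),
        reverse=True,
    )
    return {
        "median_q_error": median(q for q, _ in scored),
        "worst": [name for _, name in scored[:top_k]],
    }


# Illustrative numbers only, not results from the paper.
subplans = [
    ("title JOIN cast_info", 1000.0, 4000.0),
    ("title WHERE year > 2000", 500.0, 250.0),
    ("movie_info JOIN title", 90.0, 100.0),
]
report = summarize(subplans)
print(report)
```

A report of this shape, computed once for a candidate estimator and once for the PostgreSQL baseline, is one plausible way to give a coding agent both an aggregate score and concrete outlier subplans to focus on.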
Problem

Research questions and friction points this paper is trying to address.

cardinality estimation
query optimization
workload-specific
optimizer errors
database systems
Innovation

Methods, ideas, or system contributions that make the work stand out.

cardinality estimation
code synthesis
agent-driven system
workload-specific optimization
query optimization