Conformal Prediction for Verifiable Learned Query Optimization

📅 2025-05-04

🤖 AI Summary

This work addresses key trustworthiness bottlenecks (opacity, unpredictable performance, and poor robustness to distributional shift) that hinder the practical deployment of learned query optimizers (LQOs). It is the first to apply conformal prediction (CP) to LQO verification, proposing a CP-driven framework that supports tunable latency upper-bound estimation before execution, runtime violation detection and handling, and adaptive calibration under distributional shift. It also devises a CP-guided heuristic plan search mechanism. Integrated into three LQOs (Balsa, Lero, and RTOS) and evaluated on the JOB and TPC-H workloads, the approach reports tighter latency bounds (32.7% lower average error), a >98.5% violation detection rate, a 41.2% improvement in confidence stability, up to 9.84× higher plan quality, and planning-time reductions of up to 74.4% for a single query and 9.96% across all test queries.

📝 Abstract

Query optimization is critical in relational databases. Recently, numerous Learned Query Optimizers (LQOs) have been proposed, demonstrating superior performance over traditional hand-crafted query optimizers after short training periods. However, the opacity and instability of machine learning models have limited their practical applications. To address this issue, we are the first to formulate LQO verification as a Conformal Prediction (CP) problem. We first construct the CP model and obtain user-controlled bounded ranges for the actual latency of LQO plans before execution. Then, we introduce CP-based runtime verification along with violation handling to ensure performance prior to execution. For both scenarios, we further extend our framework to handle distribution shifts in dynamic environments using adaptive CP approaches. Finally, we present CP-guided plan search, which uses actual latency upper bounds from CP to heuristically guide query plan construction. We integrated our verification framework into three LQOs (Balsa, Lero, and RTOS) and conducted evaluations on the JOB and TPC-H workloads. Experimental results demonstrate that our method is both accurate and efficient. Our CP-based approaches achieve tight upper bounds and reliably detect and handle violations. Adaptive CP maintains accurate confidence levels even in the presence of distribution shifts, and the CP-guided plan search improves both query plan quality (up to 9.84x) and planning time, with a reduction of up to 74.4% for a single query and 9.96% across all test queries from trained LQOs.
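
To make the idea of a user-controlled latency bound concrete, here is a minimal sketch of standard split conformal prediction for a one-sided upper bound. It is not the paper's implementation: the function name, the residual score (actual minus predicted latency), and the sample values are assumptions chosen for illustration. Given calibration queries with predicted and observed latencies, the margin returned below makes `predicted + margin` an upper bound that holds with probability at least 1 - alpha, provided calibration and test queries are exchangeable.

```python
import math


def conformal_upper_margin(predicted, actual, alpha):
    """Split-conformal margin for a one-sided latency upper bound.

    predicted, actual: calibration latencies (same length, same units).
    alpha: user-chosen miscoverage level, e.g. 0.1 for a 90% bound.
    Returns q such that actual <= predicted + q holds with probability
    at least 1 - alpha for a new exchangeable query.
    """
    # Nonconformity score (an illustrative choice): signed residual.
    scores = sorted(a - p for p, a in zip(predicted, actual))
    n = len(scores)
    # Rank of the finite-sample-corrected quantile.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: no finite bound.
        return math.inf
    return scores[k - 1]


# Hypothetical calibration data: the model under-predicts by 1..9 ms.
predicted = [10.0] * 9
actual = [10.0 + i for i in range(1, 10)]

margin = conformal_upper_margin(predicted, actual, alpha=0.5)
new_prediction = 20.0
upper_bound = new_prediction + margin  # bound reported before execution
```

A smaller alpha gives a more conservative (larger) margin, and with too few calibration queries the function returns infinity rather than an unsupported bound.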

Problem

Research questions and friction points this paper is trying to address.

Ensuring verifiable performance bounds for Learned Query Optimizers (LQOs).

Addressing opacity and instability of machine learning models in LQOs.

Handling distribution shifts in dynamic environments for LQO verification.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conformal Prediction for LQO verification

CP-based runtime violation handling

Adaptive CP for distribution shifts
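
The source does not say which adaptive CP method is used, so the following is only a sketch of one well-known option, the adaptive conformal inference update alpha_{t+1} = alpha_t + gamma * (alpha - err_t), where err_t is 1 when the bound was violated at step t. All names, the step size `gamma`, and the sample data are hypothetical.

```python
import math


def quantile_margin(scores, alpha):
    """Conformal quantile of calibration scores at miscoverage level alpha."""
    if alpha <= 0:
        return math.inf    # demand for full coverage: no finite margin
    if alpha >= 1:
        return -math.inf   # no coverage requested
    ordered = sorted(scores)
    n = len(ordered)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else ordered[k - 1]


def adaptive_margins(calib_scores, stream_scores, target_alpha, gamma=0.2):
    """Return the margin used at each step of a query stream.

    stream_scores: observed residuals (actual - predicted latency),
    revealed one at a time after each query runs.
    """
    alpha_t = target_alpha
    margins = []
    for score in stream_scores:
        q = quantile_margin(calib_scores, alpha_t)
        margins.append(q)
        err = 1.0 if score > q else 0.0  # 1 means the bound was violated
        # Violations lower alpha_t (wider bounds); successes raise it.
        alpha_t += gamma * (target_alpha - err)
    return margins


calib = [float(i) for i in range(1, 10)]
# Hypothetical shift: residuals far larger than anything seen in calibration.
shifted = adaptive_margins(calib, [100.0, 100.0, 100.0], target_alpha=0.5)
# No shift: small residuals let the bound tighten again.
stable = adaptive_margins(calib, [0.0, 0.0, 0.0], target_alpha=0.5)
```

Under persistent violations the margins in `shifted` grow step by step, which is the behavior that lets coverage track the target level after a workload change.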
