MICRO: A Lightweight Middleware for Optimizing Cross-store Cross-model Graph-Relation Joins [Technical Report]

📅 2026-03-14
🤖 AI Summary
This work addresses the inefficiency of cross-model join queries between graph and relational databases in heterogeneous environments by proposing a unified algebra that formally expresses graph-relational queries. It introduces MICRO, a lightweight middleware that executes such queries natively across the federated stores, without converting or materializing all data into a single model. The key contributions are a unified algebra for graph-relational cross-model querying; CMLero, a learning-to-rank optimizer that selects efficient execution plans without requiring exact cost estimation; and a benchmark suite of one real-world and four semi-synthetic benchmarks. Experiments show that MICRO outperforms the federated relational system XDB by up to 2.1× in total runtime across the full test set. On the 93 test queries of the real-world benchmark, 14 achieve over 100× speedup (including 4 with more than 100×), while 4 slow down by more than 5 seconds. CMLero also consistently outperforms both rule-based and regression-based optimizers.
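To make the notion of a graph-relation join concrete, here is a toy sketch in Python. It is not MICRO's algebra or execution engine: the data, the `FOLLOWS` pattern, the `users` table, and both plan functions are hypothetical, and the two "stores" are plain in-memory structures. It only illustrates that the same cross-model join can be evaluated in different orders, which is the kind of choice a plan optimizer has to make.

```python
# Toy "graph store": directed edges (src, label, dst). Hypothetical data.
EDGES = [
    ("u1", "FOLLOWS", "u2"),
    ("u1", "FOLLOWS", "u3"),
    ("u2", "FOLLOWS", "u3"),
    ("u4", "FOLLOWS", "u1"),
]

# Toy "relational store": rows of a hypothetical users table.
USERS = [
    {"user_id": "u1", "country": "US"},
    {"user_id": "u2", "country": "DE"},
    {"user_id": "u3", "country": "US"},
    {"user_id": "u4", "country": "FR"},
]


def match_follows(edges, allowed_dst=None):
    """Graph side: (src, dst) bindings for the pattern (a)-[:FOLLOWS]->(b).

    If allowed_dst is given, only bindings whose dst is in that set are kept,
    mimicking keys pushed from the relational side into the graph query.
    """
    return [
        (src, dst)
        for src, label, dst in edges
        if label == "FOLLOWS" and (allowed_dst is None or dst in allowed_dst)
    ]


def plan_graph_first(edges, users, country):
    """Match the whole pattern, then hash-join the bindings with the table."""
    by_id = {row["user_id"]: row for row in users}
    bindings = match_follows(edges)
    return sorted(
        (src, dst)
        for src, dst in bindings
        if dst in by_id and by_id[dst]["country"] == country
    )


def plan_relation_first(edges, users, country):
    """Filter the table first, then send the surviving keys to the graph side."""
    keys = {row["user_id"] for row in users if row["country"] == country}
    return sorted(match_follows(edges, allowed_dst=keys))


# Both plans answer: "who follows a user located in the US?"
result_a = plan_graph_first(EDGES, USERS, "US")
result_b = plan_relation_first(EDGES, USERS, "US")
```

Both functions return the same bindings; they differ in how much intermediate data each side produces and ships, which is what makes plan selection matter once the two stores are separate systems.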

📝 Abstract
Modern data applications increasingly involve heterogeneous data managed in different models and stored across disparate database engines, often deployed as separate installations. Limited research has addressed cross-model query processing in federated environments. This paper takes a step toward bridging this gap by: (1) formally defining a class of cross-model join queries between a graph store and a relational store by proposing a unified algebra; (2) introducing one real-world benchmark and four semi-synthetic benchmarks to evaluate such queries; and (3) proposing a lightweight middleware, MICRO, for efficient query execution. At the core of MICRO is CMLero, a learning-to-rank-based query optimizer that selects efficient execution plans without requiring exact cost estimation. By avoiding the need to materialize or convert all data into a single model, which is often infeasible due to third-party data control or cost, MICRO enables native querying across heterogeneous systems. Experimental results on the benchmark workloads demonstrate that MICRO outperforms the state-of-the-art federated relational system XDB by up to 2.1x in total runtime across the full test set. On the 93 test queries of the real-world benchmark, 14 queries achieve over 100x speedup, including 4 queries with more than 100x speedup. However, 4 queries experience slowdowns of over 5 seconds, highlighting opportunities for future improvement of MICRO. Further comparisons show that CMLero consistently outperforms rule-based and regression-based optimizers, highlighting the advantage of learning-to-rank in complex cross-model optimization.
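The idea of choosing a plan by ranking rather than by exact cost estimation can be sketched as follows. This is a generic pairwise logistic ranker (in the style of RankNet) with a linear scoring function, written in plain Python; it is not CMLero's actual model. The two plan features (normalized rows shipped across stores, normalized number of cross-store round trips), the training runtimes, and all function names are assumptions made up for illustration.

```python
import math
from itertools import combinations


def train_pairwise_ranker(queries, epochs=200, lr=0.1):
    """Learn weights w so that faster plans get higher scores w . features.

    queries: list of candidate-plan lists; each plan is (features, runtime).
    Only the relative order of runtimes within one query is used, so the
    model never has to predict an absolute cost.
    """
    dim = len(queries[0][0][0])
    w = [0.0] * dim
    for _ in range(epochs):
        for plans in queries:
            for (fa, ra), (fb, rb) in combinations(plans, 2):
                if ra == rb:
                    continue  # no preference to learn from a tie
                better, worse = (fa, fb) if ra < rb else (fb, fa)
                diff = [x - y for x, y in zip(better, worse)]
                margin = sum(wi * di for wi, di in zip(w, diff))
                margin = max(-30.0, min(30.0, margin))  # avoid exp overflow
                # Gradient step on the pairwise loss log(1 + exp(-margin)).
                g = 1.0 / (1.0 + math.exp(margin))
                w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w


def choose_plan(w, candidates):
    """Return the index of the highest-scoring candidate plan."""
    scores = [sum(wi * fi for wi, fi in zip(w, f)) for f in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Hypothetical training data: (features, observed runtime in seconds).
TRAINING = [
    [((0.9, 0.2), 12.0), ((0.1, 0.4), 1.5), ((0.5, 0.9), 7.0)],
    [((0.3, 0.1), 2.0), ((0.8, 0.8), 15.0)],
    [((0.2, 0.6), 3.0), ((0.7, 0.3), 9.0)],
]

weights = train_pairwise_ranker(TRAINING)

# Candidate plans for a new query; they differ only in rows shipped.
candidates = [(0.6, 0.5), (0.2, 0.5), (0.9, 0.5)]
chosen = choose_plan(weights, candidates)
```

In this toy data every faster plan ships fewer rows than the plan it beats, so the learned weight on that feature is negative and the ranker prefers the candidate that ships the least. A rule-based optimizer would hard-code such a preference, and a regression-based one would try to predict the runtime itself; the ranking formulation only needs to get the order right.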
Problem

Research questions and friction points this paper is trying to address.

cross-model query
graph-relational join
federated database
heterogeneous data
query optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

cross-model query
graph-relational join
learning-to-rank optimizer
federated database middleware
heterogeneous data integration
Xiuwen Zheng
University of California, San Diego
Arun Kumar
University of California, San Diego
ML Systems, Data Analytics Systems, Databases
Amarnath Gupta
University of California, San Diego