🤖 AI Summary
Auto-tuning heterogeneous storage systems is difficult: configuration spaces are vast, workloads and deployment environments change, existing methods generalize poorly, and tuning still relies heavily on manual intervention. To address these challenges, we propose StorageXTuner, the first LLM-based agent framework for cross-storage-system tuning. Our method introduces a decoupled “Execute–Extract–Search–Reflect” architecture built from four agents, and combines insight-driven tree search with a hierarchical memory mechanism to enable cross-system knowledge reuse and safety-aware validation. The framework incorporates sandboxed benchmarking, performance summary extraction, and lightweight safety checkers. Evaluated on RocksDB, LevelDB, CacheLib, and MySQL InnoDB with YCSB, MixGraph, and TPC-H/C, our approach achieves up to 575% higher throughput and up to 88% lower p99 latency than default configurations, and up to 111% higher throughput and up to 56% lower p99 latency than ELMo-Tune, while converging in fewer trials.
📝 Abstract
Automatically configuring storage systems is hard: parameter spaces are large and conditions vary across workloads, deployments, and versions. Heuristic and ML tuners are often system-specific, require manual glue, and degrade when conditions change. Recent LLM-based approaches help but usually treat tuning as a single-shot, system-specific task, which limits cross-system reuse, constrains exploration, and weakens validation. We present StorageXTuner, an LLM-agent-driven auto-tuning framework for heterogeneous storage engines. StorageXTuner separates concerns across four agents: Executor (sandboxed benchmarking), Extractor (performance digest), Searcher (insight-guided configuration exploration), and Reflector (insight generation and management). The design couples an insight-driven tree search with a layered memory that promotes empirically validated insights, and it employs lightweight checkers to guard against unsafe actions. We implement a prototype and evaluate it on RocksDB, LevelDB, CacheLib, and MySQL InnoDB with YCSB, MixGraph, and TPC-H/C. Relative to out-of-the-box settings and to ELMo-Tune, StorageXTuner reaches up to 575% and 111% higher throughput, respectively, reduces p99 latency by as much as 88% and 56%, respectively, and converges with fewer trials.
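To make the division of labor among the four agents concrete, here is a minimal Python sketch of one way such a loop could be wired together. It is not the StorageXTuner implementation: there are no LLM calls, the insights are hard-coded rather than generated, the search is a simple greedy expansion rather than a full tree search, and every name (`Insight`, `Memory`, `tune`, the `write_buffer_mb` and `threads` knobs, the synthetic `toy_benchmark`) is a hypothetical stand-in chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]


@dataclass(eq=False)
class Insight:
    """A tuning hint (hypothetical): scale `param` by `factor`."""
    param: str
    factor: float
    wins: int = 0  # number of times applying this hint improved throughput


@dataclass
class Memory:
    """Two assumed layers: unverified candidates and promoted insights."""
    candidate: List[Insight] = field(default_factory=list)
    validated: List[Insight] = field(default_factory=list)


def executor(config: Config, benchmark: Callable[[Config], dict]) -> dict:
    """Run the benchmark (a real system would sandbox this) and return raw metrics."""
    return benchmark(config)


def extractor(raw: dict) -> dict:
    """Reduce raw benchmark output to a small performance digest."""
    return {"throughput": raw["ops"] / raw["seconds"]}


def is_safe(config: Config, limits: Dict[str, Tuple[int, int]]) -> bool:
    """Lightweight checker: reject configs outside per-parameter bounds."""
    return all(limits[k][0] <= v <= limits[k][1] for k, v in config.items())


def searcher(config: Config, memory: Memory, limits) -> List[Tuple[Insight, Config]]:
    """Expand one node: apply each insight (validated first), keep safe children."""
    children = []
    for insight in memory.validated + memory.candidate:
        child = dict(config)
        child[insight.param] = int(child[insight.param] * insight.factor)
        if is_safe(child, limits):
            children.append((insight, child))
    return children


def reflector(memory: Memory, insight: Insight, improved: bool, promote_after: int = 2) -> None:
    """Credit insights that helped; promote them once they have helped often enough."""
    if improved:
        insight.wins += 1
        if insight in memory.candidate and insight.wins >= promote_after:
            memory.candidate.remove(insight)
            memory.validated.append(insight)


def tune(start: Config, benchmark, memory: Memory, limits, rounds: int = 5):
    """Greedy loop: expand the best config, benchmark children, reflect on results."""
    best = start
    best_score = extractor(executor(start, benchmark))["throughput"]
    for _ in range(rounds):
        improved_any = False
        for insight, child in searcher(best, memory, limits):
            score = extractor(executor(child, benchmark))["throughput"]
            improved = score > best_score
            reflector(memory, insight, improved)
            if improved:
                best, best_score, improved_any = child, score, True
        if not improved_any:
            break  # no child beat the current best; stop early
    return best, best_score


def toy_benchmark(config: Config) -> dict:
    """Synthetic stand-in for a real benchmark; peaks at 256 MB buffer and 8 threads."""
    ops = 1000 - abs(config["write_buffer_mb"] - 256) - 50 * abs(config["threads"] - 8)
    return {"ops": max(ops, 1), "seconds": 1.0}


LIMITS = {"write_buffer_mb": (16, 1024), "threads": (1, 16)}
memory = Memory(candidate=[Insight("write_buffer_mb", 2.0), Insight("threads", 2.0)])
best, score = tune({"write_buffer_mb": 64, "threads": 2}, toy_benchmark, memory, LIMITS)
print(best, score, [i.param for i in memory.validated])
```

Keeping the four roles as separate functions means the benchmark, the digest format, the search strategy, and the promotion rule can each be swapped without touching the others, which is the kind of decoupling the abstract attributes to the agent design.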