🤖 AI Summary
Evaluating the suitability of reinforcement learning (RL) algorithms for quantum architecture search (QAS) remains challenging due to the lack of standardized, task-diverse benchmarks across noise regimes and qubit scales.
Method: We introduce BenchRL-QAS, a unified benchmark framework for QAS, covering variational quantum algorithms, quantum classification, and state preparation on 2–8 qubits under both noisy and noiseless conditions. We propose a weighted multi-objective ranking metric that jointly weighs accuracy, circuit depth, gate count, and computational efficiency. Experiments span nine representative RL agents, including value-based and policy-gradient methods, integrated with realistic noise simulation and differentiable quantum circuit design.
Contribution/Results: Our empirical analysis supports the “no free lunch” principle in QAS: no single RL algorithm dominates across all tasks and settings. An RL-based quantum classifier outperforms baseline variational classifiers, yet performance is highly context-dependent, varying with task structure, qubit count, and noise. BenchRL-QAS and all experimental data are publicly released to enable reproducible quantum machine learning research.
📝 Abstract
We introduce BenchRL-QAS, a unified benchmarking framework for systematically evaluating reinforcement learning (RL) algorithms in quantum architecture search (QAS) across diverse variational quantum algorithm tasks and system sizes ranging from 2 to 8 qubits. Our study benchmarks nine RL agents, including both value-based and policy-gradient methods, on representative quantum problems such as the variational quantum eigensolver, variational quantum state diagonalization, quantum classification, and state preparation, in both noiseless and realistic noisy regimes. We propose a weighted ranking metric that balances accuracy, circuit depth, gate count, and computational efficiency, enabling fair and comprehensive comparison. Our results first show that an RL-based quantum classifier outperforms baseline variational classifiers. We then find that no single RL algorithm is universally optimal across a set of QAS tasks; algorithmic performance is highly context-dependent, varying with task structure, qubit count, and noise. This empirical finding provides strong evidence for the "no free lunch" principle in RL-based quantum circuit design and highlights the necessity of tailored algorithm selection and systematic benchmarking for advancing quantum circuit synthesis. This work represents the most comprehensive RL-QAS benchmarking effort to date, and BenchRL-QAS, along with all experimental data, is publicly available to support reproducibility and future research at https://github.com/azhar-ikhtiarudin/bench-rlqas.
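To make the idea of a weighted ranking over accuracy, circuit depth, gate count, and computational efficiency concrete, here is a minimal Python sketch. It is not the BenchRL-QAS implementation: the abstract does not specify the metric's formula or weights, so the min-max normalization, the weight values, the function name `weighted_ranking`, the agent labels, and every number below are assumptions chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical weights (assumption: the source does not state the actual values).
# Every criterion is expressed as a cost, so lower is better.
WEIGHTS: Dict[str, float] = {"error": 0.4, "depth": 0.2, "gates": 0.2, "time": 0.2}


def weighted_ranking(
    results: Dict[str, Dict[str, float]],
    weights: Dict[str, float] = WEIGHTS,
) -> List[Tuple[str, float]]:
    """Rank agents by a weighted sum of min-max-normalized costs.

    Returns (agent, score) pairs sorted from best (lowest score) to worst.
    """
    scores = {agent: 0.0 for agent in results}
    for metric, weight in weights.items():
        values = [results[agent][metric] for agent in results]
        lo, hi = min(values), max(values)
        span = hi - lo
        for agent in results:
            # Normalize to [0, 1]; if all agents tie on a metric it contributes 0.
            norm = (results[agent][metric] - lo) / span if span > 0 else 0.0
            scores[agent] += weight * norm
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder measurements for three made-up agents on one made-up task.
example_results = {
    "agent_a": {"error": 0.01, "depth": 12, "gates": 30, "time": 100.0},
    "agent_b": {"error": 0.05, "depth": 6, "gates": 15, "time": 50.0},
    "agent_c": {"error": 0.10, "depth": 20, "gates": 40, "time": 200.0},
}

ranking = weighted_ranking(example_results)
for position, (agent, score) in enumerate(ranking, start=1):
    print(f"{position}. {agent}: {score:.3f}")
```

In this toy data the most accurate agent does not rank first, because its circuits are deeper and slower to find. Changing the weights or the task can reorder the agents, which is the kind of context dependence the abstract describes.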