🤖 AI Summary
This work addresses the lack of systematic design principles in quantum machine learning, which often forces extensive trial and error to identify effective model configurations. To this end, the authors propose the Quantum Bias–Expressivity Toolbox (QBET), which extends the Simplicity Bias (SB) metric to generative and multi-class classification settings. By combining SB with an Expressivity (EXP) metric, QBET provides a cross-architecture evaluation framework for pre-screening quantum, classical, and hybrid Transformer models without requiring full training. In evaluations on Transformer-based classification and generative tasks that use 18 qubits for embeddings (6 each for query, key, and value), ranking models by SB identifies scenarios in which quantum self-attention variants outperform their classical counterparts.
📝 Abstract
Quantum machine learning models generally lack principled design guidelines, often requiring full, resource-intensive training across numerous choices of encodings, quantum circuit designs, and initialization strategies to find an effective configuration. To address this challenge, we develop the Quantum Bias-Expressivity Toolbox ($\texttt{QBET}$), a framework for evaluating quantum, classical, and hybrid transformer architectures. In this toolbox, we introduce lean metrics for Simplicity Bias ($\texttt{SB}$) and Expressivity ($\texttt{EXP}$) that allow comparison across models, and we extend the analysis of $\texttt{SB}$ to generative and multiclass-classification tasks. We show that $\texttt{QBET}$ enables efficient pre-screening of promising model variants, obviating the need to execute complete training pipelines. In evaluations on transformer-based classification and generative tasks, we employ a total of $18$ qubits for embeddings ($6$ qubits each for query, key, and value). We identify scenarios in which quantum self-attention variants surpass their classical counterparts by ranking the respective models according to the $\texttt{SB}$ metric and comparing their relative performance.