Quantum Occam Learning: Sample-Supported Expressibility for Circuit-Based Quantum Learning

📅 2026-06-10

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work asks how much circuit expressivity a limited number of copies of an unknown quantum state can statistically support, so that a circuit model does not overfit the data. It introduces a quantum Occam learning framework for circuit-constrained quantum learning, treating circuit complexity as a statistical resource that is tuned adaptively from the data. Using metric-entropy analysis, trace-distance approximation, and oracle inequalities in an agnostic learning setting, the study derives generalization bounds that relate sample size to representational capacity: at trace-distance accuracy ε, M samples can support only about Mε² two-qubit gates, up to logarithmic factors and saturation at the tomography limit 2^n. Matching upper and lower bounds make this trade-off between model complexity and available data tight in the circuit-limited regime.

📝 Abstract

A central principle in quantum machine learning is that an ansatz should be expressive enough to represent the quantum data of interest. Yet, the expressibility is statistically meaningful only insofar as it can be learned from finitely many copies of an unknown quantum state. In this work, we develop an information-theoretic Occam theory for quantum data generated by finite-size quantum circuits. For the class $S_{n,G}$ of $n$-qubit pure states preparable with at most $G$ two-qubit gates, a metric-entropy argument gives the realizable sample law $\widetilde{\Theta}(G/\varepsilon^2)$ in the circuit-limited regime. For an arbitrary source $\hat{\rho}$, we introduce the best $G$-gate approximation error $d_G(\hat{\rho})$ and the approximate circuit complexity $C_\eta(\hat{\rho})$. We prove an agnostic quantum Occam theorem: with $M$ copies, one can learn up to the best $G$-gate approximation error plus a statistical penalty $\widetilde{O}(\sqrt{G/M})$. We then remove the need to know $G$ in advance through an adaptive model-selection theorem whose oracle inequality selects the circuit complexity justified by the data. Matching lower bounds yield a sample-supported expressibility law: at trace-distance accuracy $\varepsilon$, $M$ samples can support only $G_{\rm supported} \simeq M\varepsilon^2$ gates, up to logarithmic factors and tomography saturation at $2^n$. Thus, the circuit complexity becomes an adaptive statistical resource rather than a static promise. Our framework turns bounded circuit complexity into a model-selection principle for quantum machine learning.
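The scaling stated in the abstract can be illustrated numerically. The sketch below is not the paper's algorithm: it drops all constants and logarithmic factors, the function names `supported_gates` and `select_gate_budget` are hypothetical, the penalty constant `c` is an assumption, and the approximation errors passed in are made-up placeholders for $d_G(\hat{\rho})$, which is unknown in practice. It only shows how requiring the penalty $\sqrt{G/M}$ to stay below $\varepsilon$ gives $G \lesssim M\varepsilon^2$, and how an oracle-style rule would trade approximation error against that penalty.

```python
import math
from typing import Mapping


def supported_gates(num_copies: int, eps: float, num_qubits: int) -> float:
    """Heuristic gate budget G ~ M * eps^2, capped at 2^n.

    Assumption: constants and logarithmic factors are set to 1.
    """
    if num_copies < 0 or not (0.0 < eps <= 1.0) or num_qubits < 1:
        raise ValueError("need num_copies >= 0, 0 < eps <= 1, num_qubits >= 1")
    return min(num_copies * eps**2, 2.0**num_qubits)


def select_gate_budget(
    approx_error_by_gates: Mapping[int, float],
    num_copies: int,
    c: float = 1.0,  # hypothetical penalty constant, not taken from the paper
) -> int:
    """Return the G minimizing (approximation error) + c * sqrt(G / M).

    The error values stand in for d_G, which a real procedure would
    have to estimate from data.
    """
    if num_copies <= 0:
        raise ValueError("num_copies must be positive")
    return min(
        approx_error_by_gates,
        key=lambda g: approx_error_by_gates[g] + c * math.sqrt(g / num_copies),
    )


if __name__ == "__main__":
    # Made-up approximation errors that decrease as more gates are allowed.
    errors = {10: 0.5, 100: 0.2, 1000: 0.05}
    for copies in (10_000, 1_000_000):
        print(copies, supported_gates(copies, 0.1, 20), select_gate_budget(errors, copies))
```

With the placeholder errors above, the selected gate count grows as the number of copies grows, which is the qualitative behavior the abstract describes: more data justifies a more complex circuit.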

Problem

Research questions and friction points this paper is trying to address.

quantum machine learning

expressibility

circuit complexity

sample complexity

model selection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Quantum Occam Learning

circuit complexity

sample complexity