🤖 AI Summary
This work addresses the high measurement cost of gradient estimation when training parameterized quantum circuits on quantum hardware, where the parameter-shift rule incurs a cost that scales linearly with the number of trainable parameters and limits scalability. The authors propose a framework of forward gradient estimators, based on forward-mode automatic differentiation, that yields an unbiased gradient estimate by averaging a freely tunable number of random directional derivatives and requires neither ancilla qubits nor controlled gates. The framework recovers SPSA, random coordinate descent, and the parameter-shift rule as limiting cases. Within it, the authors derive QUIVER, an adaptive optimizer whose update rule follows from a closed-form minimum measurement-cost allocation. Theoretical analysis establishes convergence of stochastic quantum forward gradient descent under standard assumptions. Numerical experiments train Hamming-weight-preserving orthogonal quantum neural networks with up to 60 qubits and 1,770 parameters on the ECG5000 and MNIST datasets orders of magnitude more efficiently than the parameter-shift rule; QUIVER can also outperform the measurement-frugal optimizers iCANS and gCANS on QAOA and VQE tasks.
📝 Abstract
Training parameterised quantum circuits (PQCs) on quantum hardware is bottlenecked by the measurement cost of gradient estimation, which under the parameter-shift rule scales linearly in the number of trainable parameters and dominates the total shot budget of training at scale. In this work, we propose a framework of forward gradient estimators for PQCs, based on the forward mode of automatic differentiation, that yields an unbiased estimator of the gradient by averaging a freely tunable number of random directional derivatives and recovers SPSA, random coordinate descent, and the parameter-shift rule as limiting cases, with no ancilla qubits or controlled-gate overhead. We prove that stochastic quantum forward gradient descent converges under standard assumptions, with an explicit second-moment expansion that interpolates between the single-direction extreme of SPSA and the full-gradient extreme of parameter-shift. Within this framework we derive QUIVER (Quantum Iterative V-adaptive Estimator Rule), an adaptive optimiser for parameterised circuits whose update rule follows from a closed-form minimum measurement-cost allocation. We show numerically that forward gradients train Hamming-weight-preserving orthogonal quantum neural networks with up to 60 qubits and 1770 parameters on the ECG5000 and MNIST datasets orders of magnitude more efficiently than the parameter-shift rule. We also demonstrate that our proposed QUIVER optimiser can outperform iCANS and gCANS measurement-frugal optimisers on optimisation problems using the quantum approximate optimisation algorithm and quantum simulation with the variational quantum eigensolver.
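The estimator described in the abstract can be illustrated with a minimal classical sketch. It is not the authors' implementation: the cost `sum(cos(theta))` is a toy stand-in for a circuit expectation value, the directional derivative is computed exactly rather than estimated from measurement shots, and all function names are hypothetical. The sketch averages `(grad f · v) v` over directions `v` with `E[v vᵀ] = I`, which makes the estimate unbiased, and shows that using every coordinate direction once (scaled by `sqrt(d)`) reproduces the full gradient.

```python
import math
import random


def directional_derivative(theta, v):
    # Exact directional derivative of the toy cost f(theta) = sum(cos(theta_i)).
    # Assumption: on hardware this scalar would be estimated from measurements.
    return sum(-math.sin(t) * vi for t, vi in zip(theta, v))


def forward_gradient(theta, directions):
    # Average of (grad f . v) * v over the supplied directions.
    # Unbiased for grad f whenever the directions satisfy E[v v^T] = I.
    d = len(theta)
    acc = [0.0] * d
    for v in directions:
        c = directional_derivative(theta, v)
        for i in range(d):
            acc[i] += c * v[i]
    return [a / len(directions) for a in acc]


def gaussian_directions(k, d, rng):
    # k random directions with independent standard normal entries.
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]


def scaled_coordinate_directions(d):
    # All d coordinate axes, scaled by sqrt(d) so their average outer product is I.
    s = math.sqrt(d)
    return [[s if i == j else 0.0 for j in range(d)] for i in range(d)]


theta = [0.1, 0.7, -1.2, 2.0, 0.4]
exact = [-math.sin(t) for t in theta]

# Few random directions: a cheap but noisy estimate of the gradient.
rng = random.Random(0)
noisy = forward_gradient(theta, gaussian_directions(4, len(theta), rng))

# Every coordinate direction once: the full gradient, one term per parameter.
full = forward_gradient(theta, scaled_coordinate_directions(len(theta)))

print("exact:", exact)
print("noisy:", noisy)
print("full: ", full)
```

The number of directions is the knob that trades estimator variance against cost per step: a single direction corresponds to the SPSA-like extreme, while one direction per parameter corresponds to the full-gradient extreme.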