AI Summary
Existing quantum autoencoders lack a masking learning mechanism, limiting their ability to effectively reconstruct representations under local image occlusions. To address this, we propose the Quantum Masked Autoencoder (QMAE), the first quantum adaptation of the classical masked autoencoding paradigm. QMAE employs a differentiable quantum circuit to jointly learn masked reconstruction and discriminative representation in quantum state space. Leveraging quantum superposition and entanglement, it enhances feature robustness against partial information loss. Evaluated on MNIST, QMAE achieves an average 12.86% improvement in downstream classification accuracy over state-of-the-art quantum autoencoders, along with a 9.3% gain in peak signal-to-noise ratio (PSNR) for reconstructed images. These results demonstrate substantial improvements in visual fidelity and generalization capability across downstream tasks. This work establishes a novel paradigm for quantum vision representation learning, bridging foundational principles of self-supervised learning with quantum computation.
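As a rough illustration of the two classical ingredients mentioned above (masking part of an image and scoring a reconstruction with PSNR), the following NumPy sketch may help. The patch size, mask ratio, and the helper names `mask_patches` and `psnr` are assumptions made for this example; the summary does not state how QMAE masks its inputs or which PSNR convention it uses.

```python
import numpy as np


def mask_patches(image, patch=7, mask_ratio=0.25, seed=0):
    """Zero out a random subset of non-overlapping patch x patch blocks.

    Hypothetical helper: the masking scheme and its parameters are
    assumptions, not taken from the paper.
    Returns the masked image and a boolean array that is True where
    pixels were kept.
    """
    rng = np.random.default_rng(seed)
    h, w = image.shape
    rows, cols = h // patch, w // patch
    n_patches = rows * cols
    n_masked = int(round(mask_ratio * n_patches))
    chosen = rng.choice(n_patches, size=n_masked, replace=False)

    masked = image.copy()
    keep = np.ones((h, w), dtype=bool)
    for idx in chosen:
        r, c = divmod(int(idx), cols)
        block = (slice(r * patch, (r + 1) * patch),
                 slice(c * patch, (c + 1) * patch))
        masked[block] = 0.0
        keep[block] = False
    return masked, keep


def psnr(reference, reconstruction, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_value]."""
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)


# Stand-in for an MNIST digit: a random 28 x 28 image with values in [0, 1).
image = np.random.default_rng(1).random((28, 28))
masked, keep = mask_patches(image)
print("masked pixels:", int((~keep).sum()))
print("PSNR of masked vs. original (dB):", psnr(image, masked))
```

With a 28 x 28 input and 7 x 7 patches, a mask ratio of 0.25 hides 4 of the 16 patches. A reconstruction model would be scored by comparing its output, rather than the masked input, against the original.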
Abstract
Classical autoencoders are widely used to learn features of input data. To improve feature learning, classical masked autoencoders extend classical autoencoders to learn the features of the original input sample when part of the data is masked out. While quantum autoencoders exist, no quantum masked autoencoder has yet been designed and implemented that can leverage the benefits of quantum computing and quantum autoencoders. In this paper, we propose quantum masked autoencoders (QMAEs), which can effectively learn the missing features of a data sample within quantum states instead of classical embeddings. We show that our QMAE architecture can learn the masked features of an image and can reconstruct masked MNIST input images with improved visual fidelity. Experimental evaluation shows that, in the presence of masks, QMAE outperforms state-of-the-art quantum autoencoders in classification accuracy by 12.86% on average.
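To make "within quantum states instead of classical embeddings" more concrete, here is a minimal sketch of one common way to load an image into a quantum state, amplitude encoding, simulated classically with NumPy. The abstract does not say which encoding QMAE uses, so the choice of amplitude encoding, the zero padding, and the helper names `amplitude_encode` and `fidelity` are all assumptions for illustration only.

```python
import numpy as np


def amplitude_encode(image):
    """Map pixel values to the amplitudes of an n-qubit state vector.

    Hypothetical helper (the encoding is an assumption, not the paper's
    stated method). The flattened image is zero-padded to the next power
    of two and normalised to unit length.
    """
    flat = np.asarray(image, dtype=float).ravel()
    n_qubits = max(1, (flat.size - 1).bit_length())
    padded = np.zeros(2 ** n_qubits)
    padded[:flat.size] = flat
    norm = np.linalg.norm(padded)
    if norm == 0:
        raise ValueError("cannot encode an all-zero image")
    return padded / norm, n_qubits


def fidelity(state_a, state_b):
    """Squared overlap |<a|b>|^2 between two pure states."""
    return float(np.abs(np.vdot(state_a, state_b)) ** 2)


# Stand-in for an MNIST digit and a copy with its top half masked out.
image = np.random.default_rng(0).random((28, 28))
masked = image.copy()
masked[:14, :] = 0.0

state, n_qubits = amplitude_encode(image)
masked_state, _ = amplitude_encode(masked)
print("qubits needed:", n_qubits)
print("fidelity(original, masked):", fidelity(state, masked_state))
```

Under this assumed encoding a 28 x 28 image has 784 pixels and therefore needs 10 qubits (1024 amplitudes). The fidelity between the states of the original and masked images gives one simple measure of how much information masking removes before any learning takes place.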