Multimodal Quantum Vision Transformer for Enzyme Commission Classification from Biochemical Representations

📅 2025-08-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Enzyme function prediction remains a fundamental challenge in computational biology, particularly for low-homology or structurally unannotated enzymes. To address this, we propose a multimodal quantum machine learning framework built on a Quantum Vision Transformer (QVT) that integrates four complementary biochemical modalities: protein sequences, quantum-derived electronic descriptors, molecular graph structures, and 2D molecular images. Methodologically, QVT employs modality-specific encoders coupled with a cross-modal attention fusion mechanism, enabling joint representation learning of multiscale biochemical and quantum features within a unified architecture. Key technical components include quantum descriptor extraction, graph neural network–based molecular encoding, convolutional image feature learning, and multi-head cross-modal attention integration. On the EC number classification task, QVT achieves 85.1% top-1 accuracy, outperforming sequence-only baselines by a substantial margin and also outperforming other quantum machine learning models. This work presents a novel, high-accuracy approach to enzyme functional annotation.
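The fusion step named in the summary can be pictured with a small classical sketch. It is not the authors' implementation: it is single-head rather than multi-head, it has no learned projections and no quantum layers, and the modality names, the embedding values, and the `cross_modal_attention` function are all hypothetical. It only shows the general idea of letting each modality embedding attend over all four embeddings with scaled dot-product attention and then averaging the results.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_modal_attention(embeddings):
    """Fuse per-modality vectors with single-head scaled dot-product attention.

    Assumption: raw embeddings serve as queries, keys, and values.
    Returns the fused vector and the attention weights per query modality.
    """
    names = list(embeddings)
    dim = len(embeddings[names[0]])
    if any(len(vec) != dim for vec in embeddings.values()):
        raise ValueError("all modality embeddings must share one dimension")
    scale = math.sqrt(dim)

    weights = {}
    attended = []
    for query_name in names:
        query = embeddings[query_name]
        # Similarity of this modality to every modality (including itself).
        scores = [
            sum(q * k for q, k in zip(query, embeddings[key_name])) / scale
            for key_name in names
        ]
        row = softmax(scores)
        weights[query_name] = dict(zip(names, row))
        # Attention-weighted mix of all modality vectors.
        attended.append([
            sum(w * embeddings[name][d] for w, name in zip(row, names))
            for d in range(dim)
        ])

    # Mean-pool the attended vectors into one fused representation.
    fused = [sum(vec[d] for vec in attended) / len(attended) for d in range(dim)]
    return fused, weights


# Toy 4-dimensional embeddings, invented purely for illustration.
toy_embeddings = {
    "sequence": [0.9, 0.1, 0.0, 0.2],
    "quantum_descriptors": [0.1, 0.8, 0.3, 0.0],
    "molecular_graph": [0.2, 0.7, 0.4, 0.1],
    "molecular_image": [0.0, 0.2, 0.9, 0.3],
}
fused, weights = cross_modal_attention(toy_embeddings)
print(fused)
print(weights["sequence"])
```

In a trained model the queries, keys, and values would come from learned projections, and the fused vector would feed a classification head; here the weights simply reflect how similar the toy vectors are to one another.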

📝 Abstract
Accurately predicting enzyme functionality remains one of the major challenges in computational biology, particularly for enzymes with limited structural annotations or sequence homology. We present a novel multimodal Quantum Machine Learning (QML) framework that enhances Enzyme Commission (EC) classification by integrating four complementary biochemical modalities: protein sequence embeddings, quantum-derived electronic descriptors, molecular graph structures, and 2D molecular image representations. The framework uses a Quantum Vision Transformer (QVT) backbone equipped with modality-specific encoders and a unified cross-attention fusion module. By integrating graph features and spatial patterns, our method captures key stereoelectronic interactions behind enzyme function. Experimental results demonstrate that our multimodal QVT model achieves a top-1 accuracy of 85.1%, outperforming sequence-only baselines by a substantial margin and performing better than other QML models.
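For readers unfamiliar with the metric, top-1 accuracy is the fraction of samples whose highest-scoring predicted class equals the true class. The sketch below illustrates that definition only; the `top1_accuracy` helper, the class labels, and the scores are made up for the example and are unrelated to the paper's data or evaluation code.

```python
def top1_accuracy(class_scores, true_labels):
    """Fraction of samples whose highest-scoring class matches the true label.

    class_scores: list of dicts mapping class label -> score, one per sample.
    true_labels:  list of the correct class labels, in the same order.
    """
    if len(class_scores) != len(true_labels):
        raise ValueError("scores and labels must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for scores, label in zip(class_scores, true_labels):
        predicted = max(scores, key=scores.get)  # arg-max over classes
        if predicted == label:
            correct += 1
    return correct / len(true_labels)


# Hypothetical scores over three classes for four samples.
toy_scores = [
    {"1": 0.7, "2": 0.2, "3": 0.1},
    {"1": 0.1, "2": 0.3, "3": 0.6},
    {"1": 0.5, "2": 0.4, "3": 0.1},
    {"1": 0.2, "2": 0.6, "3": 0.2},
]
toy_labels = ["1", "3", "2", "2"]
print(top1_accuracy(toy_scores, toy_labels))  # 0.75 (3 of 4 correct)
```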
Problem

Research questions and friction points this paper is trying to address.

Accurately predicting enzyme functionality from limited structural annotations
Enhancing Enzyme Commission classification using multimodal biochemical representations
Integrating quantum-derived descriptors with molecular structures to predict enzyme function
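As background on the classification target, an EC number has four dot-separated fields (for example EC 3.4.21.4), and the first field names one of seven top-level enzyme classes. The sketch below shows one way such numbers could be turned into class labels at a chosen depth. The `ec_label` helper is hypothetical, and the source does not state at which EC level the model is evaluated, so the `level` parameter is an assumption.

```python
# The seven top-level EC classes defined by the enzyme nomenclature.
EC_TOP_LEVEL = {
    "1": "Oxidoreductases",
    "2": "Transferases",
    "3": "Hydrolases",
    "4": "Lyases",
    "5": "Isomerases",
    "6": "Ligases",
    "7": "Translocases",
}


def ec_label(ec_number, level=1):
    """Truncate a four-field EC number to its first `level` fields.

    Accepts strings such as "EC 3.4.21.4" or "3.4.21.4".
    """
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    fields = text.split(".")
    if len(fields) != 4:
        raise ValueError("expected four dot-separated fields: " + ec_number)
    if not 1 <= level <= 4:
        raise ValueError("level must be between 1 and 4")
    return ".".join(fields[:level])


trypsin = "EC 3.4.21.4"
print(ec_label(trypsin, level=1), EC_TOP_LEVEL[ec_label(trypsin, level=1)])
print(ec_label(trypsin, level=3))
```

Coarser levels give fewer, larger classes, while the full four-field number gives the most specific label; partially specified numbers such as "3.4.-.-" are not treated specially in this sketch.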
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Quantum Machine Learning framework
Quantum Vision Transformer with cross-attention fusion
Integrates sequence, electronic, graph, and image representations
Authors
Murat Isik
Purdue University
Mandeep Kaur Saggi
NC State University
Humaira Gowher
Purdue University
Sabre Kais
Goodnight Distinguished Chair in Quantum Computing, ECE at NC State
Quantum Information and Quantum Computation