Improving Enzyme Prediction with Chemical Reaction Equations by Hypergraph-Enhanced Knowledge Graph Embeddings

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods for predicting enzyme–substrate interactions rely on expert-curated enzyme–substrate pairs, which are sparse and labor-intensive to maintain, and this limits model generalization. To address this, the work proposes Hyper-Enz, which draws on more abundant chemical reaction equations to construct (educt, enzyme, product) triples in a knowledge graph and enriches them with hypergraph representations. By integrating a hypergraph transformer with a knowledge graph embedding (KGE) model and a multi-expert learning paradigm, Hyper-Enz captures the relationships among multiple educts, products, and enzymes. The reported gains over traditional models are up to an 88% relative improvement in average enzyme retrieval accuracy and a 30% improvement in pair-level prediction.
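To make the triple construction concrete, here is a minimal sketch of how one reaction equation could be turned into knowledge-graph triples and a single hyperedge. The `Reaction` type, both function names, the choice to pair every reactant (educt) with every product, and the set-based hyperedge are assumptions made for illustration; the summary does not specify how Hyper-Enz actually encodes reactions.

```python
from itertools import product
from typing import NamedTuple


class Reaction(NamedTuple):
    """Hypothetical container for one enzyme-catalysed reaction equation."""
    educts: tuple[str, ...]
    enzyme: str
    products: tuple[str, ...]


def reaction_to_triples(rxn: Reaction) -> list[tuple[str, str, str]]:
    # Assumption: every educt is paired with every product of the reaction,
    # with the enzyme acting as the relation between them.
    return [(e, rxn.enzyme, p) for e, p in product(rxn.educts, rxn.products)]


def reaction_to_hyperedge(rxn: Reaction) -> frozenset[str]:
    # Assumption: a hyperedge is the set of all entities taking part in
    # the reaction, so multi-compound context is kept together.
    return frozenset(rxn.educts) | {rxn.enzyme} | frozenset(rxn.products)


# Hexokinase reaction: glucose + ATP -> glucose-6-phosphate + ADP
rxn = Reaction(
    educts=("glucose", "ATP"),
    enzyme="hexokinase",
    products=("glucose-6-phosphate", "ADP"),
)
triples = reaction_to_triples(rxn)
hyperedge = reaction_to_hyperedge(rxn)
```

Pairwise triples lose the information that all four compounds belong to the same reaction, which is the gap a hyperedge over the whole reaction is meant to close.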

📝 Abstract
Predicting enzyme-substrate interactions has long been a fundamental problem in biochemistry and metabolic engineering. While existing methods can leverage databases of expert-curated enzyme-substrate pairs so that models learn from known pair interactions, these databases are often sparse, i.e., they contain only limited and incomplete examples of such pairs, and they are labor-intensive to maintain. This lack of sufficient training data significantly hinders the ability of traditional enzyme prediction models to generalize to unseen interactions. In this work, we exploit chemical reaction equations from domain-specific databases, given their easier accessibility and denser, more abundant data. However, interactions of multiple compounds, e.g., educts and products, with the same enzymes create complex relational data patterns that traditional models cannot easily capture. To tackle this, we represent chemical reaction equations as triples of (educt, enzyme, product) within a knowledge graph, so that we can take advantage of knowledge graph embedding (KGE) to infer missing enzyme-substrate pairs as a graph completion task. In particular, to capture intricate relationships among compounds, we propose a knowledge-enhanced hypergraph model for enzyme prediction, Hyper-Enz, which integrates a hypergraph transformer with a KGE model to learn representations of the hyper-edges that involve multiple educts and products. In addition, a multi-expert paradigm is introduced to guide the learning of enzyme-substrate interactions with both the proposed model and chemical reaction equations. Experimental results show a significant improvement, with up to an 88% relative improvement in average enzyme retrieval accuracy and a 30% improvement in pair-level prediction compared to traditional models, demonstrating the effectiveness of our approach.
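As an illustration of how a KGE model can rank candidate enzymes for an (educt, ?, product) query, the sketch below uses TransE-style scoring, where a triple is plausible when the educt embedding plus the enzyme embedding lands near the product embedding. TransE is swapped in here as a well-known example; the abstract does not say which KGE model Hyper-Enz uses. The embeddings are hand-set toy values rather than learned ones, and the names `transe_score` and `rank_enzymes` are hypothetical.

```python
import math


def transe_score(h, r, t):
    """TransE-style plausibility: negative L2 distance of h + r from t."""
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


# Toy 2-D embeddings, set by hand for illustration only (not learned).
compound_emb = {
    "glucose": (0.0, 0.0),
    "glucose-6-phosphate": (1.0, 0.0),
}
enzyme_emb = {
    "hexokinase": (1.0, 0.1),
    "lactase": (-0.5, 0.8),
}


def rank_enzymes(educt, product):
    """Rank all candidate enzymes for the query (educt, ?, product)."""
    h, t = compound_emb[educt], compound_emb[product]
    scored = [(name, transe_score(h, r, t)) for name, r in enzyme_emb.items()]
    # Higher score means a more plausible triple, so sort in descending order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_enzymes("glucose", "glucose-6-phosphate")
```

With these toy vectors, `hexokinase` ranks above `lactase` because its embedding almost exactly translates `glucose` onto `glucose-6-phosphate`. In a trained model the embeddings would come from optimizing such scores over the observed triples.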
Problem

Research questions and friction points this paper is trying to address.

enzyme-substrate interaction
data sparsity
chemical reaction equations
knowledge graph
hypergraph
Innovation

Methods, ideas, or system contributions that make the work stand out.

hypergraph
knowledge graph embedding
enzyme prediction
chemical reaction equations
multi-expert paradigm