🤖 AI Summary
Predicting gene knockout strategies for growth-coupled production (GCP) in genome-scale metabolic models (GEMs) remains challenging due to computational complexity and reliance on iterative optimization.
Method: We propose the first end-to-end deep learning framework that jointly encodes genes, metabolites, and pathways, integrating prior biological knowledge with stoichiometric and thermodynamic constraints. It uses a hybrid architecture that combines graph neural networks (GNNs) with sequence modeling, which enables multi-scale feature learning without explicit mathematical programming.
Contribution/Results: Unlike solver-dependent or rule-based approaches, our framework enables fully automated, optimization-free strategy generation. Evaluated on three GEMs of different scales, it improves overall accuracy over the baseline method by 17.64%, 27.15%, and 18.07%, respectively, while maintaining balanced precision and recall. The implementation is publicly available.
📝 Abstract
In genome-scale constraint-based metabolic models, gene deletion strategies are crucial for achieving growth-coupled production, in which cell growth and target metabolite production are achieved simultaneously. Computational methods for calculating gene deletions have been widely explored and have contributed to the development of gene deletion strategy databases, but current approaches make limited use of data-driven paradigms such as machine learning for more efficient strain design. A foundational framework for this purpose is therefore needed. In this study, we first formulate the problem of gene deletion strategy prediction and then propose a framework for predicting gene deletion strategies for growth-coupled production in genome-scale metabolic models. The proposed framework uses deep learning to learn and integrate sequential gene and metabolite data representations, enabling automatic prediction of gene deletion strategies. Computational experiments demonstrate the feasibility of the proposed framework and show substantial improvements over the baseline method. Specifically, the proposed framework increases overall accuracy by 17.64%, 27.15%, and 18.07% on the three metabolic models of different scales studied, while maintaining balanced precision and recall in predicting gene deletion statuses. The source code and examples for the framework are publicly available at https://github.com/MetNetComp/DeepGDel.
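To make the reported metrics concrete, the following minimal sketch shows how overall accuracy, precision, and recall could be computed when each gene's deletion status is treated as a binary label. The label encoding (1 = deleted, 0 = retained), the function name `deletion_metrics`, and the toy label vectors are assumptions made for illustration only. They are not taken from the DeepGDel code base, and the paper's exact metric definitions may differ.

```python
from typing import Dict, Sequence


def deletion_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall for per-gene binary deletion labels.

    Assumed encoding (illustrative): 1 = gene deleted, 0 = gene retained.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one gene label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted deletions
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # retained genes predicted as deleted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # deleted genes predicted as retained
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly predicted retentions

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Report 0.0 when a denominator is empty instead of dividing by zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical reference strategy and prediction for one target metabolite (8 genes).
reference = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]
print(deletion_metrics(reference, predicted))
```

In this toy case precision and recall are equal, which is one simple reading of "balanced": the predictor neither over-predicts nor under-predicts deletions relative to the reference strategy.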