Transformers in Protein: A Survey

📅 2025-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
The protein informatics community lacks a systematic review of Transformer applications in protein science.
Method: This paper introduces the first comprehensive taxonomy of Transformers tailored for protein science, systematically surveying over 100 studies across five core tasks: structure prediction, function prediction, protein-protein interaction analysis, functional annotation, and drug discovery/target identification. It proposes the concept of "protein-specialized Transformers" to unify architectural variants, and it consolidates domain-specific pretraining strategies (e.g., domain adaptation, protein language modeling), benchmark datasets, open-source implementations, and standardized evaluation protocols.
Contribution/Results: The work identifies critical challenges, including data scarcity, biological interpretability, and cross-task generalization, and outlines reproducible methodological pathways. It delivers a foundational theoretical framework, practical implementation guidelines, and a curated resource index, thereby fostering closer integration of AI models and biological problem-solving in protein research.

📝 Abstract
As protein informatics advances rapidly, the demand for enhanced predictive accuracy, structural analysis, and functional understanding has intensified. Transformer models, as powerful deep learning architectures, have demonstrated unprecedented potential in addressing diverse challenges across protein research. However, a comprehensive review of Transformer applications in this field remains lacking. This paper bridges this gap by surveying over 100 studies, offering an in-depth analysis of practical implementations and research progress of Transformers in protein-related tasks. Our review systematically covers critical domains, including protein structure prediction, function prediction, protein-protein interaction analysis, functional annotation, and drug discovery/target identification. To contextualize these advancements across various protein domains, we adopt a domain-oriented classification system. We first introduce foundational concepts: the Transformer architecture and attention mechanisms, categorize Transformer variants tailored for protein science, and summarize essential protein knowledge. For each research domain, we outline its objectives and background, critically evaluate prior methods and their limitations, and highlight transformative contributions enabled by Transformer models. We also curate and summarize pivotal datasets and open-source code resources to facilitate reproducibility and benchmarking. Finally, we discuss persistent challenges in applying Transformers to protein informatics and propose future research directions. This review aims to provide a consolidated foundation for the synergistic integration of Transformer models and protein informatics, fostering further innovation and expanded applications in the field.
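
To make the attention mechanism named among the abstract's foundational concepts concrete, the following is a minimal illustrative sketch (not taken from the paper) of scaled dot-product self-attention over a toy sequence of residue embeddings; the dimensions, random weight matrices, and the self_attention helper are hypothetical stand-ins for learned model components.

import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) residue embeddings.
    # Project to queries, keys, and values, then mix values by attention weights.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise residue affinities
    weights = softmax(scores, axis=-1)        # one attention distribution per residue
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16                      # e.g., an 8-residue fragment
X = rng.normal(size=(seq_len, d_model))       # stand-in for learned residue embeddings
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
print(out.shape, attn.shape)                  # (8, 16) (8, 8)

In protein-specialized Transformers surveyed here, the attention matrix (attn above) is often inspected as a proxy for residue-residue contacts or functional-site interactions.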
Problem

Research questions and friction points this paper is trying to address.

Surveying Transformer applications in protein research
Addressing gaps in protein structure and function prediction
Providing datasets and resources for reproducibility
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer models enhance protein structure prediction
Attention mechanisms improve protein function analysis
Domain-oriented classification for protein research
Xiaowen Ling
Department of Electronics and Computer Engineering, Shenzhen MSU-BIT University, Shenzhen, China
Zhiqiang Li
University of Nebraska-Lincoln
Yanbin Wang
Department of Electronics and Computer Engineering, Shenzhen MSU-BIT University, Shenzhen, China
Zhuhong You
School of Computer Science and Technology, Northwestern Polytechnical University, Xi’an, China