Fast and Interpretable Protein Substructure Alignment via Optimal Transport

📅 2025-10-12
🤖 AI Summary
Existing methods struggle to identify and align local protein structural motifs, such as active sites, which hinders functional interpretation and downstream applications. To address this, we propose PLASMA, the first efficient and interpretable deep learning framework for residue-level protein substructure alignment. PLASMA formulates the task as a regularized optimal transport problem and solves it with differentiable Sinkhorn iterations, allowing end-to-end optimization. It combines geometric feature encoding with attention mechanisms to jointly predict a global similarity score and an interpretable residue-wise alignment matrix. We also introduce PLASMA-PF, a lightweight, training-free variant. Evaluated across diverse biological benchmarks, PLASMA improves accuracy in local functional site detection while remaining computationally efficient, compact in parameter count, and interpretable, supporting functional annotation, evolutionary analysis, and structure-guided drug design.

📝 Abstract
Proteins are essential biological macromolecules that execute life functions. Local motifs within protein structures, such as active sites, are the most critical components for linking structure to function and are key to understanding protein evolution and enabling protein engineering. Existing computational methods struggle to identify and compare these local structures, which leaves a significant gap in understanding protein structures and harnessing their functions. This study presents PLASMA, the first deep learning framework for efficient and interpretable residue-level protein substructure alignment. We reformulate the problem as a regularized optimal transport task and leverage differentiable Sinkhorn iterations. For a pair of input protein structures, PLASMA outputs a clear alignment matrix with an interpretable overall similarity score. Through extensive quantitative evaluations and three biological case studies, we demonstrate that PLASMA achieves accurate, lightweight, and interpretable residue-level alignment. Additionally, we introduce PLASMA-PF, a training-free variant that provides a practical alternative when training data are unavailable. Our method addresses a critical gap in protein structure analysis tools and offers new opportunities for functional annotation, evolutionary studies, and structure-based drug design. Reproducibility is ensured via our official implementation at https://github.com/ZW471/PLASMA-Protein-Local-Alignment.git.
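
For readers unfamiliar with the terminology, the generic textbook form of entropy-regularized optimal transport and its Sinkhorn updates is written out here. The symbols are assumptions introduced for illustration and are not taken from the paper: $C$ is a residue-to-residue cost matrix, $a$ and $b$ are marginal weights over the residues of the two proteins, and $\varepsilon > 0$ is the regularization strength. PLASMA's exact objective, marginals, and constraints may differ from this sketch.

```latex
% Entropy-regularized optimal transport (generic form, not necessarily PLASMA's)
\begin{aligned}
P^{\star} &= \arg\min_{P \in U(a,b)} \; \langle P, C \rangle - \varepsilon H(P),
  \qquad H(P) = -\sum_{i,j} P_{ij}\,(\log P_{ij} - 1), \\
U(a,b) &= \{\, P \in \mathbb{R}_{\ge 0}^{n \times m} : P\mathbf{1}_m = a,\; P^{\top}\mathbf{1}_n = b \,\}. \\[4pt]
% Sinkhorn iterations on the Gibbs kernel K
K &= \exp(-C/\varepsilon), \qquad
u \leftarrow a \oslash (K v), \qquad
v \leftarrow b \oslash (K^{\top} u), \\
P &= \operatorname{diag}(u)\, K \,\operatorname{diag}(v).
\end{aligned}
```

Every step in the iteration is a matrix product or an elementwise division, so gradients can flow through it, which is what makes the procedure usable inside an end-to-end trained model.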

Problem

Research questions and friction points this paper is trying to address.

Identifying and comparing local protein structural motifs, such as active sites, which existing computational methods handle poorly
Aligning protein substructures at the residue level efficiently and interpretably
Providing a practical alternative when training data are unavailable

Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep learning framework for protein substructure alignment
Reformulates alignment as regularized optimal transport task
Uses differentiable Sinkhorn iterations to produce an interpretable residue-level alignment matrix and an overall similarity score
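
As a rough illustration of the optimal transport idea listed above, the following is a minimal NumPy sketch of plain Sinkhorn iterations on a cosine-distance cost between two sets of residue embeddings. It is not the authors' implementation: the function names `sinkhorn_plan` and `cosine_cost`, the uniform marginals, the cosine cost, and the default `epsilon` are all assumptions made for this example, and PLASMA's learned encoder and scoring are not reproduced.

```python
import numpy as np


def cosine_cost(emb_a, emb_b):
    """Pairwise cosine distance between two sets of residue embeddings.

    emb_a: (n, d) array, emb_b: (m, d) array. Hypothetical stand-in for
    whatever learned cost a real model would use.
    """
    a_unit = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b_unit = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    return 1.0 - a_unit @ b_unit.T


def sinkhorn_plan(cost, epsilon=0.1, n_iters=200):
    """Entropy-regularized transport plan with uniform marginals (assumed)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # mass on each residue of protein A
    b = np.full(m, 1.0 / m)  # mass on each residue of protein B
    kernel = np.exp(-cost / epsilon)  # Gibbs kernel K = exp(-C / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows toward marginal a
        v = b / (kernel.T @ u)  # rescale columns toward marginal b
    # Plan P = diag(u) K diag(v); entry (i, j) is the soft match weight.
    return u[:, None] * kernel * v[None, :]


if __name__ == "__main__":
    # Toy check: aligning four mutually orthogonal embeddings to themselves
    # should put the largest weight of each row on the diagonal.
    emb = np.eye(4)
    plan = sinkhorn_plan(cosine_cost(emb, emb))
    print(plan.argmax(axis=1))  # [0 1 2 3]
```

The returned matrix plays the role of a soft residue-to-residue alignment: large entries mark residue pairs that the transport plan matches. A smaller `epsilon` gives a sharper, more permutation-like plan at the cost of slower convergence and possible numerical underflow in the kernel.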
Zhiyu Wang
Shanghai Jiao Tong University
Bingxin Zhou
Shanghai Jiao Tong University
Graph Neural Networks, Protein Representation Learning, AI4Biology
Jing Wang
Shanghai Jiao Tong University
Yang Tan
Shanghai Jiao Tong University
Weishu Zhao
Shanghai Jiao Tong University
Pietro Liò
Professor, University of Cambridge
AI & Comp Biology -> Medicine
Liang Hong
Shanghai Jiao Tong University