🤖 AI Summary
Existing methods struggle to identify and align local protein structural motifs, such as active sites, which hinders functional interpretation and downstream applications. To address this, the authors propose PLASMA, the first deep learning framework for efficient and interpretable residue-level protein substructure alignment. PLASMA formulates the task as a regularized optimal transport problem and uses differentiable Sinkhorn iterations for end-to-end optimization. It integrates geometric feature encoding with attention mechanisms to jointly predict an overall similarity score and an interpretable residue-wise alignment matrix. The authors also introduce PLASMA-PF, a training-free, lightweight variant for settings where training data are unavailable. In quantitative evaluations and three biological case studies, PLASMA aligns local functional sites accurately while remaining computationally lightweight and interpretable, which supports functional annotation, evolutionary analysis, and structure-based drug design.
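For orientation, the textbook entropy-regularized optimal transport problem that Sinkhorn iterations solve can be written as below. This is the generic form and not necessarily PLASMA's exact objective; the cost matrix $C$, the marginals $a$ and $b$, and the regularization strength $\varepsilon$ are illustrative symbols that do not appear in the summary.

```latex
\begin{aligned}
P^{\star} &= \arg\min_{P \in U(a,b)} \; \langle P, C \rangle - \varepsilon H(P), \\
U(a,b) &= \{\, P \in \mathbb{R}_{\ge 0}^{n \times m} : P\mathbf{1}_m = a,\; P^{\top}\mathbf{1}_n = b \,\}, \\
H(P) &= -\sum_{i,j} P_{ij}\,(\log P_{ij} - 1), \\
P^{\star} &= \operatorname{diag}(u)\, K \,\operatorname{diag}(v), \qquad K_{ij} = e^{-C_{ij}/\varepsilon}.
\end{aligned}
```

Here $P_{ij}$ can be read as the soft correspondence between residue $i$ of one protein and residue $j$ of the other, and the scaling vectors $u$ and $v$ are what the Sinkhorn iterations update.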
📝 Abstract
Proteins are essential biological macromolecules that carry out the functions of life. Local motifs within protein structures, such as active sites, are the most critical components for linking structure to function and are key to understanding protein evolution and enabling protein engineering. Existing computational methods struggle to identify and compare these local structures, which leaves a significant gap in understanding protein structures and harnessing their functions. This study presents PLASMA, the first deep learning framework for efficient and interpretable residue-level protein substructure alignment. We reformulate the problem as a regularized optimal transport task and leverage differentiable Sinkhorn iterations. For a pair of input protein structures, PLASMA outputs a residue-level alignment matrix together with an interpretable overall similarity score. Through extensive quantitative evaluations and three biological case studies, we demonstrate that PLASMA achieves accurate, lightweight, and interpretable residue-level alignment. Additionally, we introduce PLASMA-PF, a training-free variant that provides a practical alternative when training data are unavailable. Our method addresses a critical gap in protein structure analysis tools and offers new opportunities for functional annotation, evolutionary studies, and structure-based drug design. The official implementation is available at https://github.com/ZW471/PLASMA-Protein-Local-Alignment.git.
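To make the optimal transport formulation concrete, the following is a minimal NumPy sketch of plain Sinkhorn iterations applied to two sets of per-residue embeddings. It is not the PLASMA implementation: the function name `sinkhorn_plan`, the random embeddings, the cosine-based cost, the uniform marginals, and the values of `epsilon` and `n_iters` are all assumptions chosen for illustration, and the final score is just one simple way to summarize a transport plan.

```python
import numpy as np


def sinkhorn_plan(cost, epsilon=0.5, n_iters=500):
    """Entropy-regularized OT plan between uniform marginals (illustrative)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # assumed uniform mass over residues of protein A
    b = np.full(m, 1.0 / m)  # assumed uniform mass over residues of protein B
    kernel = np.exp(-cost / epsilon)  # Gibbs kernel K = exp(-C / epsilon)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows to match marginal a
        v = b / (kernel.T @ u)  # rescale columns to match marginal b
    # Transport plan P = diag(u) K diag(v)
    return u[:, None] * kernel * v[None, :]


# Hypothetical per-residue embeddings standing in for learned features.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(5, 8))  # 5 residues, 8 feature dimensions
emb_b = rng.normal(size=(7, 8))  # 7 residues, 8 feature dimensions

# Cosine similarity between every residue pair, turned into a cost.
unit_a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
unit_b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
similarity = unit_a @ unit_b.T
cost = 1.0 - similarity

plan = sinkhorn_plan(cost)

# One simple scalar summary: plan-weighted average similarity.
score = float((plan * similarity).sum())
print(plan.round(3))
print("illustrative similarity score:", round(score, 3))
```

Every step in the loop is a differentiable array operation, which is what allows this kind of iteration to sit inside an end-to-end trained model when written in an autodiff framework. Aligning a small motif against a full structure would typically need relaxed or partial marginals rather than the balanced, uniform ones assumed here.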