🤖 AI Summary
This work addresses three weaknesses of existing protein stability prediction models: limited out-of-distribution (OOD) generalization, inconsistent predictions for forward and reverse mutations, and under-representation of rare stabilizing mutations. The authors propose a constraint-aware optimization framework that trains end to end and requires no changes to the SPURS backbone architecture; it targets multimodal predictors that combine protein language model embeddings with inverse-folding representations. The framework combines a Balanced Mean Squared Error loss, a Siamese anti-symmetric regularization term, and a novel OOD-margin consistency loss. Evaluated across 11 benchmarks, it raises the Spearman correlation from 0.486 to 0.540 on S669 and from 0.653 to 0.711 on S461, with consistent smaller gains on five additional OOD datasets.
📝 Abstract
Multimodal $ΔΔG$ predictors integrating protein language models with inverse-folding representations achieve strong in-distribution accuracy on the Megascale dataset but exhibit limited robustness on out-of-distribution (OOD) proteins, persistent forward-reverse bias on paired-mutation benchmarks, and under-representation of rare stabilizing mutations. Existing approaches address these limitations primarily through additional architectural components, leaving optimization-level intervention comparatively underexplored. We introduce a constraint-aware optimization framework combining Balanced Mean Squared Error, a Siamese anti-symmetric regularizer, and a novel OOD-margin consistency loss on the per-position feature representation, requiring no architectural changes to the SPURS backbone. Across eleven benchmarks and three random seeds, the framework improves Spearman correlation on S669 from 0.486 to 0.540 ($σ=0.002$ across seeds), matching the published SPURS baseline (0.50) without architectural modification, and on S461 from 0.653 to 0.711, with consistent smaller gains on five additional OOD datasets. A controlled diagnostic on Ssym reveals that anti-symmetric training does not eliminate systematic forward-reverse bias, indicating that gains arise through implicit regularization rather than exact thermodynamic constraint enforcement.
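The anti-symmetry property mentioned in the abstract comes from thermodynamics: reverting a mutation should produce the opposite stability change, so the forward and reverse $ΔΔG$ values of a mutation pair should sum to zero. The Python sketch below shows one way a penalty on that sum, and a separate forward-reverse bias diagnostic, could be computed from paired predictions. It is not the authors' implementation: the function names, the squared-sum form of the penalty, and the example values are assumptions made for illustration.

```python
from statistics import fmean


def _check_paired(forward_preds, reverse_preds):
    """Reject unpaired or empty inputs (hypothetical helper for this sketch)."""
    if len(forward_preds) != len(reverse_preds):
        raise ValueError("forward and reverse predictions must be paired")
    if not forward_preds:
        raise ValueError("at least one mutation pair is required")


def antisymmetry_penalty(forward_preds, reverse_preds):
    """Mean squared violation of ddG(forward) + ddG(reverse) = 0.

    Assumed form of an anti-symmetric regularizer; the paper's exact
    formulation is not given in the abstract.
    """
    _check_paired(forward_preds, reverse_preds)
    return fmean([(f + r) ** 2 for f, r in zip(forward_preds, reverse_preds)])


def forward_reverse_bias(forward_preds, reverse_preds):
    """Mean signed violation; zero for a perfectly anti-symmetric predictor."""
    _check_paired(forward_preds, reverse_preds)
    return fmean([f + r for f, r in zip(forward_preds, reverse_preds)])


# Illustrative predictions for three forward/reverse mutation pairs (made up).
forward = [1.0, -0.5, 2.0]
reverse = [-0.75, 0.75, -1.5]

penalty = antisymmetry_penalty(forward, reverse)
bias = forward_reverse_bias(forward, reverse)
print(f"penalty={penalty:.3f} bias={bias:.3f}")
```

Because the penalty squares each violation while the bias averages the signed violations, a systematic offset in one direction remains visible in the bias even when the penalty is small.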