🤖 AI Summary
Existing protein structure reliability metrics such as pLDDT capture energetic stability but often miss subtle errors, including atomic clashes and conformational traps, that arise from topological frustration in the folding energy landscape. To address this, the authors introduce CODE (Chain of Diffusion Embeddings), a topology-aware metric that quantifies topological frustration in an unsupervised manner from the latent diffusion embeddings of AlphaFold3-series structure predictors. CONFIDE then integrates CODE with pLDDT into a unified evaluation framework covering both dimensions (energy and topology). CODE achieves a correlation of 0.82 with experimental protein folding rates, compared to 0.33 for pLDDT (a 148% relative improvement). On molecular glue structure prediction benchmarks, CONFIDE attains a Spearman correlation of 0.73 with RMSD, compared to 0.42 for pLDDT (a 73.8% relative improvement). The approach also applies to diverse drug design tasks, including all-atom binder design, enzymatic active site mapping, and mutation-induced binding affinity prediction.
📝 Abstract
Reliable evaluation of protein structure predictions remains challenging: metrics like pLDDT capture energetic stability but often miss subtle errors, such as atomic clashes or conformational traps, that reflect topological frustration within the protein folding energy landscape. We present CODE (Chain of Diffusion Embeddings), a self-evaluating metric empirically found to quantify topological frustration directly from the latent diffusion embeddings of the AlphaFold3 series of structure predictors in a fully unsupervised manner. Integrating CODE with pLDDT, we propose CONFIDE, a unified evaluation framework that combines energetic and topological perspectives to improve the reliability of AlphaFold3 and related models. CODE correlates strongly with protein folding rates driven by topological frustration, achieving a correlation of 0.82 compared to pLDDT's 0.33 (a relative improvement of 148%). CONFIDE significantly improves the reliability of quality evaluation on molecular glue structure prediction benchmarks, achieving a Spearman correlation of 0.73 with RMSD compared to pLDDT's 0.42 (a relative improvement of 73.8%). Beyond quality assessment, our approach applies to diverse drug design tasks, including all-atom binder design, enzymatic active site mapping, mutation-induced binding affinity prediction, nucleic acid aptamer screening, and flexible protein modeling. By combining data-driven embeddings with theoretical insight, CODE and CONFIDE outperform existing metrics across a wide range of biomolecular systems, offering robust and versatile tools to refine structure predictions, advance structural biology, and accelerate drug discovery.
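
The relative improvements quoted above can be checked from the reported correlations. The snippet below is a minimal sketch of that arithmetic only. The helper name `relative_improvement` is hypothetical and is not taken from the authors' code. It assumes the usual definition of relative improvement, (new − baseline) / |baseline|.

```python
def relative_improvement(new: float, baseline: float) -> float:
    """Return the relative gain of `new` over `baseline`, as a percentage.

    Hypothetical helper for illustration; assumes the standard definition
    (new - baseline) / |baseline| * 100.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / abs(baseline) * 100.0


# Correlations as reported in the abstract.
folding_gain = relative_improvement(0.82, 0.33)  # CODE vs. pLDDT, folding rates
glue_gain = relative_improvement(0.73, 0.42)     # CONFIDE vs. pLDDT, molecular glue RMSD

print(f"Folding rates:   {folding_gain:.1f}%")
print(f"Molecular glues: {glue_gain:.1f}%")
```

Under this definition the two gains come out to roughly 148.5% and 73.8%, consistent with the rounded figures of 148% and 73.8% stated in the abstract.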