🤖 AI Summary
Existing toxicity prediction models for antimicrobial peptides (AMPs) only classify a peptide as “toxic” or “non-toxic”; they cannot regress its actual hemolytic concentration (HC₅₀). To address this, we propose AmpLyze, an interpretable deep learning model that predicts HC₅₀ directly from the amino acid sequence alone. The method combines residue-level embeddings from ProtT5 and ESM2 with sequence-level descriptors, processes them in dual local and global branches aligned by a cross-attention module, and trains with log-cosh loss for robustness to assay noise. Interpretability comes from Expected Gradients attributions, which identify the residues that drive hemolysis and suggest safer substitutions. The best model achieves a Pearson correlation coefficient (PCC) of 0.756 and an MSE of 0.987, outperforming classical regressors and state-of-the-art baselines. Ablation studies show that both branches are essential and that cross-attention adds a further 1% improvement in PCC and 3% in MSE. This work turns hemolysis assessment into a quantitative, sequence-based, and interpretable prediction, supporting the rational design of safe and effective antimicrobial peptides.
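To make the choice of log-cosh loss concrete, the following is a minimal, framework-free sketch of how such a loss can be computed. It is not the authors' training code: the function names `log_cosh` and `log_cosh_loss` are hypothetical, and the sketch only illustrates the general property that the loss is roughly quadratic for small residuals and roughly linear for large ones, so outlying measurements pull on the fit less than they would under squared error.

```python
import math
from typing import Sequence


def log_cosh(x: float) -> float:
    """Numerically stable log(cosh(x)).

    Uses the identity log(cosh(x)) = |x| + log(1 + exp(-2|x|)) - log(2),
    which avoids overflow in cosh for large |x|.
    """
    a = abs(x)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_loss(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean log-cosh of the residuals (hypothetical helper, not from the paper)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(log_cosh(p - t) for t, p in zip(y_true, y_pred))
    return total / len(y_true)


if __name__ == "__main__":
    # Made-up residuals: a small one behaves like r**2 / 2,
    # a large one behaves like |r| - log(2).
    for r in (0.01, 1.0, 10.0):
        print(r, log_cosh(r), 0.5 * r * r, abs(r) - math.log(2.0))
```

In a deep learning framework the same formula would be applied elementwise to a tensor of residuals and averaged over the batch.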
📝 Abstract
Red-blood-cell lysis, quantified by HC50 (the peptide concentration that lyses 50% of red blood cells), is the principal safety barrier for antimicrobial-peptide (AMP) therapeutics, yet existing models only say "toxic" or "non-toxic." AmpLyze closes this gap by predicting the actual HC50 value from sequence alone and explaining the residues that drive toxicity. The model couples residue-level ProtT5/ESM2 embeddings with sequence-level descriptors in dual local and global branches, aligned by a cross-attention module and trained with log-cosh loss for robustness to assay noise. The optimal AmpLyze model reaches a Pearson correlation coefficient (PCC) of 0.756 and a mean squared error (MSE) of 0.987, outperforming classical regressors and state-of-the-art methods. Ablations confirm that both branches are essential, and cross-attention adds a further 1% improvement in PCC and 3% in MSE. Expected-Gradients attributions reveal known toxicity hotspots and suggest safer substitutions. By turning hemolysis assessment into a quantitative, sequence-based, and interpretable prediction, AmpLyze facilitates AMP design and offers a practical tool for early-stage toxicity screening.
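The two reported metrics, PCC and MSE, can be computed as in the sketch below. This is an illustration under stated assumptions rather than the authors' evaluation script: the helper names `mse` and `pearson` are hypothetical, the example values are made up, and the abstract does not say on what scale (raw or transformed HC50) the metrics were computed.

```python
import math
from typing import Sequence


def _check(y_true: Sequence[float], y_pred: Sequence[float]) -> int:
    """Validate that both inputs are non-empty and of equal length."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return len(y_true)


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between measured and predicted values."""
    n = _check(y_true, y_pred)
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n


def pearson(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between measured and predicted values."""
    n = _check(y_true, y_pred)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0.0 or var_p == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


if __name__ == "__main__":
    # Made-up measured and predicted values, for illustration only.
    measured = [1.2, 2.5, 0.7, 3.1, 1.9]
    predicted = [1.0, 2.9, 1.1, 2.6, 2.2]
    print("PCC:", pearson(measured, predicted))
    print("MSE:", mse(measured, predicted))
```

The two metrics are complementary: PCC measures how well predictions track the ranking and linear trend of the measurements, while MSE penalizes the absolute size of the errors, so a model can score well on one and poorly on the other.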