ProDCARL: Reinforcement Learning-Aligned Diffusion Models for De Novo Antimicrobial Peptide Design

📅 2026-01-29

🤖 AI Summary
This work addresses a limitation of conventional generative models for antimicrobial peptide (AMP) design: they do not explicitly optimize for both high antimicrobial activity and low toxicity. It does so by adding a reinforcement learning alignment mechanism to a diffusion model. Building on the EvoDiff OA-DM 38M architecture, the framework uses AMP activity and toxicity predictors to guide sequence generation, and combines top-k policy gradients, entropy regularization, and early stopping to raise predicted quality while preserving sequence diversity. In silico results show the mean predicted AMP score rising from 0.081 after fine-tuning to 0.178. In addition, 6.3% of candidates meet the joint high-quality criterion (pAMP > 0.7 and pTox < 0.3), and sequence diversity (1 minus mean pairwise identity) remains high at 0.929, balancing predicted potency, predicted safety, and novelty.
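The two headline metrics in the summary above, the joint hit rate and the diversity score, can be illustrated with a minimal Python sketch. The function names are hypothetical, and the position-wise identity used here (matching positions divided by the longer sequence length, with no alignment) is an assumption; the source does not state how pairwise identity is computed.

```python
from itertools import combinations


def hit_rate(p_amp, p_tox, amp_min=0.7, tox_max=0.3):
    """Fraction of candidates with pAMP > amp_min and pTox < tox_max."""
    if len(p_amp) != len(p_tox):
        raise ValueError("score lists must have the same length")
    if not p_amp:
        return 0.0
    hits = sum(1 for a, t in zip(p_amp, p_tox) if a > amp_min and t < tox_max)
    return hits / len(p_amp)


def pairwise_identity(a, b):
    """Assumed definition: matching positions / longer length (no alignment)."""
    if not a and not b:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def diversity(seqs):
    """1 - mean pairwise identity over all unordered sequence pairs."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    mean_identity = sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)
    return 1.0 - mean_identity
```

For example, `hit_rate([0.9, 0.8, 0.2, 0.75], [0.1, 0.5, 0.1, 0.29])` counts two of four candidates as hits. An alignment-based identity would give different diversity values for sequences of unequal length, so the numbers from this sketch are not comparable to the reported 0.929.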

📝 Abstract
Antimicrobial resistance threatens healthcare sustainability and motivates low-cost computational discovery of antimicrobial peptides (AMPs). De novo peptide generation must optimize antimicrobial activity and safety through low predicted toxicity, but likelihood-trained generators do not enforce these goals explicitly. We introduce ProDCARL, a reinforcement-learning alignment framework that couples a diffusion-based protein generator (EvoDiff OA-DM 38M) with sequence property predictors for AMP activity and peptide toxicity. We fine-tune the diffusion prior on AMP sequences to obtain a domain-aware generator. Top-k policy-gradient updates use classifier-derived rewards plus entropy regularization and early stopping to preserve diversity and reduce reward hacking. In silico experiments show ProDCARL increases the mean predicted AMP score from 0.081 after fine-tuning to 0.178. The joint high-quality hit rate reaches 6.3% with pAMP > 0.7 and pTox < 0.3. ProDCARL maintains high diversity, with 1 - mean pairwise identity equal to 0.929. Qualitative analyses with AlphaFold3 and ProtBERT embeddings suggest candidates show plausible AMP-like structural and semantic characteristics. ProDCARL serves as a candidate generator that narrows experimental search space, and experimental validation remains future work.
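The abstract's top-k policy-gradient update with entropy regularization can be sketched roughly as follows. This is a plain-Python illustration of the general technique, not the authors' implementation: the reward `p_amp * (1 - p_tox)`, the coefficient `beta`, and all function names are assumptions, and the abstract does not say how the classifier scores are combined. In a real training loop the log-probabilities would be differentiable tensors from the diffusion model so that the loss could be backpropagated.

```python
import math


def reward(p_amp, p_tox):
    """Hypothetical reward: favors high predicted activity and low predicted toxicity."""
    return p_amp * (1.0 - p_tox)


def token_entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution over residues."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def topk_pg_loss(log_probs, rewards, entropies, k, beta=0.01):
    """REINFORCE-style loss restricted to the k highest-reward samples.

    log_probs: log-likelihood of each sampled sequence under the policy
    rewards:   scalar reward of each sampled sequence
    entropies: mean per-token policy entropy of each sampled sequence
    Returns (loss, indices of the selected top-k samples).
    """
    n = len(rewards)
    if len(log_probs) != n or len(entropies) != n:
        raise ValueError("inputs must have the same length")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the batch size")
    # Keep only the k best samples; lower-reward samples contribute no gradient.
    top = sorted(range(n), key=lambda i: rewards[i], reverse=True)[:k]
    # Policy-gradient term: raise the likelihood of high-reward sequences.
    pg_term = -sum(rewards[i] * log_probs[i] for i in top) / k
    # Entropy bonus: subtracting it from the loss rewards a more diverse policy.
    entropy_bonus = sum(entropies) / n
    return pg_term - beta * entropy_bonus, top
```

With `log_probs=[-1.0, -2.0, -3.0]`, `rewards=[0.1, 0.9, 0.5]`, `entropies=[1.0, 1.0, 1.0]`, `k=2`, and `beta=0.1`, the function selects samples 1 and 2. Early stopping, which the abstract lists alongside entropy regularization as a guard against reward hacking, would sit outside this function, for instance by halting training once a held-out diversity measure starts to drop.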
Problem

Research questions and friction points this paper is trying to address.

antimicrobial peptides
de novo design
toxicity prediction
antimicrobial activity
computational discovery
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion models
reinforcement learning alignment
de novo peptide design
antimicrobial peptides
reward hacking mitigation