Genotype-Conditioned Molecular Generation via Evidence-Grounded Multi-Objective Latent Perturbation in Diffusion Models

📅 2026-05-31

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study addresses the challenges posed by tumor heterogeneity and the scarcity of actionable molecular targets by proposing a genotype-informed approach to personalized anticancer drug generation. The method introduces a learnable perturbation in the latent space of a pretrained genotype-to-drug diffusion model and optimizes it by gradient ascent on a multi-objective reward that combines predicted drug sensitivity (AUC), drug-likeness (QED), and synthetic accessibility (SAS). Both the reward design and the evaluation are grounded in experimentally derived cancer cell line data and validated pharmacological signals, and a multi-agent LLM pipeline grounded in the diffusion model's attention mechanism assesses the mechanistic plausibility of the generated molecules. Evaluated on 15 cancer cell lines from three held-out evaluation sets, the approach consistently outperforms competing baselines in sensitivity, drug-likeness, synthetic accessibility, and chemical validity.
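
One way to write the optimization described above is sketched here. The weighted-sum form, the weights w, the step size \eta, and all symbol names are illustrative assumptions, not the paper's notation; each term r is assumed to be oriented so that larger values are better.

```latex
% z: latent produced by the pretrained diffusion model (kept fixed)
% \delta: learnable perturbation, updated by gradient ascent
R(z + \delta) = w_{\mathrm{sens}}\, r_{\mathrm{sens}}(z + \delta)
              + w_{\mathrm{QED}}\, r_{\mathrm{QED}}(z + \delta)
              + w_{\mathrm{SA}}\, r_{\mathrm{SA}}(z + \delta),
\qquad
\delta_{t+1} = \delta_t + \eta\, \nabla_{\delta} R(z + \delta_t)
```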

📝 Abstract

Developing effective anticancer therapeutics remains challenging due to tumor heterogeneity and the absence of well-defined molecular targets across cancer subtypes. Generative models conditioned on cancer genotypes offer a promising avenue for personalized drug discovery, yet existing approaches do not explicitly optimize for sensitivity, synthesizability, and mechanistic binding plausibility at the same time. We present a latent-space optimization approach for a pretrained genotype-to-drug diffusion model: a learnable perturbation over the molecular latent space is optimized via gradient ascent to maximize a composite reward combining predicted drug sensitivity (AUC), drug-likeness (QED), and synthetic accessibility (SAS). Critically, biological realism is enforced by grounding both reward design and evaluation in experimentally derived cancer cell line data and validated pharmacologic signals, anchoring candidate generation in real-world clinical evidence. Mechanistic plausibility is further assessed by a multi-agent LLM pipeline grounded in the diffusion model's attention mechanism. Experiments across 15 cancer cell lines from three held-out evaluation sets demonstrate consistent and noticeable improvements over competing baselines in sensitivity, drug-likeness, synthesizability, and chemical validity.
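
The core loop of gradient ascent on a latent perturbation can be illustrated with a toy example. This is a minimal sketch under strong assumptions, not the paper's implementation: the three objective scores are hypothetical quadratic stand-ins for learned sensitivity, QED, and SAS predictors, the weights are made up, and the latent is a 2-D vector rather than a diffusion-model latent.

```python
import numpy as np

# Hypothetical stand-ins: each objective is a smooth bowl that peaks at its own
# "ideal" latent point. Real sensitivity/QED/SAS predictors would replace these.
CENTERS = {
    "sensitivity": np.array([1.0, 0.0]),
    "qed": np.array([0.0, 1.0]),
    "sas": np.array([1.0, 1.0]),
}
WEIGHTS = {"sensitivity": 0.5, "qed": 0.3, "sas": 0.2}  # assumed weights


def composite_reward(z):
    """Weighted sum of toy objective scores (higher is better)."""
    return sum(-w * np.sum((z - CENTERS[k]) ** 2) for k, w in WEIGHTS.items())


def reward_grad(z):
    """Analytic gradient of composite_reward with respect to z."""
    return sum(-2.0 * w * (z - CENTERS[k]) for k, w in WEIGHTS.items())


def optimize_perturbation(z0, lr=0.1, steps=200):
    """Gradient ascent on a perturbation delta; the base latent z0 stays fixed."""
    delta = np.zeros_like(z0)
    for _ in range(steps):
        delta = delta + lr * reward_grad(z0 + delta)
    return delta


z0 = np.array([-2.0, 3.0])  # toy latent standing in for the generator's output
delta = optimize_perturbation(z0)
z_opt = z0 + delta
```

Because the toy objectives pull toward different points, the optimized latent settles at a compromise determined by the weights, which is the basic trade-off a multi-objective reward encodes. In a real system the gradient would come from automatic differentiation through learned predictors rather than a closed-form expression.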

Problem

Research questions and friction points this paper is trying to address.

molecular generation

cancer heterogeneity

personalized drug discovery

multi-objective optimization

genotype-conditioned

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent-space optimization

multi-objective reward

evidence-grounded generation