🤖 AI Summary
This work addresses the challenge of predicting cellular drug responses for unseen drugs and unseen covariate combinations from single-cell RNA sequencing (scRNA-seq) data, with emphasis on high-fidelity, interpretable modeling of perturbation effects at single-cell resolution. It proposes scPPDM, the first diffusion-based framework for single-cell drug response prediction. The model uses a non-concatenative attention mechanism, GD-Attn, to couple two conditioning channels, the pre-perturbation state and the drug with its dose, in a unified latent space. At inference, a factorized classifier-free guidance strategy exposes separate, interpretable controls for state preservation and drug-response strength, and maps dosage to guidance magnitude. On the Tahoe-100M benchmark, the method achieves state-of-the-art performance on both the unseen-drug and unseen-covariate-combination tasks; in the unseen-drug setting, log-fold-change correlations on differentially expressed genes (DEGs) improve by more than 34% over the second-best method (+36.11% Spearman, +34.21% Pearson), while preserving biological specificity.
📝 Abstract
This paper introduces the Single-Cell Perturbation Prediction Diffusion Model (scPPDM), the first diffusion-based framework for single-cell drug-response prediction from scRNA-seq data. scPPDM couples two condition channels, pre-perturbation state and drug with dose, in a unified latent space via non-concatenative GD-Attn. During inference, factorized classifier-free guidance exposes two interpretable controls for state preservation and drug-response strength and maps dose to guidance magnitude for tunable intensity. Evaluated on the Tahoe-100M benchmark under two stringent regimes, unseen covariate combinations (UC) and unseen drugs (UD), scPPDM sets new state-of-the-art results across log fold-change recovery, delta correlations, explained variance, and DE-overlap. Representative gains include +36.11%/+34.21% on DEG logFC-Spearman/Pearson in UD over the second-best model. This control interface enables transparent what-if analyses and dose tuning, reducing experimental burden while preserving biological specificity.
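To make the two guidance controls concrete, the following is a minimal sketch of how a factorized classifier-free guidance step could combine noise predictions under two conditions. It is not the scPPDM implementation: the function names, the log-scaled dose map, the default `w_max`, and the specific composition rule (a common two-condition form of classifier-free guidance) are all assumptions made for illustration, and the abstract does not specify them.

```python
import numpy as np


def dose_to_guidance(dose, dose_max, w_max=3.0):
    """Hypothetical monotone map from dose to drug-guidance weight.

    Assumption: log-scaled and normalized so that dose 0 gives weight 0
    and dose_max gives w_max. The paper's actual mapping may differ.
    """
    dose = np.clip(dose, 0.0, dose_max)
    return w_max * np.log1p(dose) / np.log1p(dose_max)


def factorized_cfg(eps_uncond, eps_state, eps_full, w_state, w_drug):
    """Combine three noise predictions with separate guidance weights.

    eps_uncond: prediction with both conditions dropped
    eps_state:  prediction conditioned on the pre-perturbation state only
    eps_full:   prediction conditioned on state and drug-with-dose
    w_state:    strength of state preservation
    w_drug:     strength of the drug-response direction
    """
    return (
        eps_uncond
        + w_state * (eps_state - eps_uncond)
        + w_drug * (eps_full - eps_state)
    )


# Usage with placeholder arrays standing in for denoiser outputs.
rng = np.random.default_rng(0)
eps_uncond, eps_state, eps_full = rng.normal(size=(3, 4, 16))

w_drug = dose_to_guidance(dose=0.5, dose_max=5.0)
eps_guided = factorized_cfg(eps_uncond, eps_state, eps_full,
                            w_state=1.0, w_drug=w_drug)
```

In this form, setting both weights to 1 recovers the fully conditioned prediction, and setting `w_drug` to 0 leaves only the state-conditioned prediction, so each weight can be read as an independent dial for what-if analyses.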