🤖 AI Summary
How can single-cell perturbation models generalize to unseen treatment conditions (e.g., novel drugs or dosages) while capturing cell-to-cell heterogeneity in perturbation responses? This paper introduces a Conditional Optimal Transport (COT) framework that defines and optimizes the Conditional Monge Gap to learn Monge maps conditioned on arbitrary covariates, including drug identity, dosage, time, and cell type. By aggregating data across perturbation conditions and learning across tasks, COT improves generalization to unseen drugs and dosages. On both scRNA-seq and multiplexed protein imaging data, COT matches and sometimes surpasses condition-specific state-of-the-art methods. On unseen drugs and dosages, it widely outperforms other conditional models at capturing heterogeneity (i.e., higher moments) in the perturbed population. Scaling to hundreds of conditions also narrows the gap between structure-based and effect-based drug representations.
📝 Abstract
Learning the response of single cells to various treatments offers great potential to enable targeted therapies. In this context, neural optimal transport (OT) has emerged as a principled methodological framework because it inherently accommodates the challenges of unpaired data induced by cell destruction during data acquisition. However, most existing OT approaches are incapable of conditioning on different treatment contexts (e.g., time, drug treatment, drug dosage, or cell type), and we still lack methods that consistently show promising generalization performance to unseen treatments. Here, we propose the Conditional Monge Gap, which learns OT maps conditionally on arbitrary covariates. We demonstrate its value in predicting single-cell perturbation responses conditional on one or multiple drugs, a drug dosage, or combinations thereof. We find that our conditional models achieve results comparable and sometimes even superior to the condition-specific state-of-the-art on scRNA-seq as well as multiplexed protein imaging data. Notably, by aggregating data across conditions, we perform cross-task learning, which unlocks remarkable generalization abilities to unseen drugs or drug dosages, widely outperforming other conditional models in capturing heterogeneity (i.e., higher moments) in the perturbed population. Finally, by scaling to hundreds of conditions and testing on unseen drugs, we narrow the gap between structure-based and effect-based drug representations, suggesting a promising path to the successful prediction of perturbation effects for unseen treatments.
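To make the central quantity concrete, the following toy sketch illustrates one plausible reading of a conditional Monge gap. For a single condition, the Monge gap of a map is the mean cost of moving each point to its image minus the optimal transport cost between the source sample and the mapped sample. It is zero when the map is already optimal and positive otherwise. The sketch rests on several assumptions that do not come from the abstract: the cost is squared Euclidean, the OT cost is found by brute force over pairings (a practical version would likely use a differentiable, entropy-regularized solver), and conditioning is handled by averaging the per-condition gaps. All function and variable names are hypothetical.

```python
from itertools import permutations


def sq_dist(x, y):
    """Squared Euclidean cost between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def exact_ot_cost(xs, ys):
    """Exact OT cost between two uniform samples of equal size.

    Brute force over all pairings, so only feasible for a handful of points.
    """
    n = len(xs)
    return min(
        sum(sq_dist(xs[i], ys[p[i]]) for i in range(n)) / n
        for p in permutations(range(n))
    )


def monge_gap(xs, mapped):
    """Mean displacement cost of the map minus the optimal transport cost."""
    displacement = sum(sq_dist(x, y) for x, y in zip(xs, mapped)) / len(xs)
    return displacement - exact_ot_cost(xs, mapped)


def conditional_monge_gap(samples_by_condition, transport):
    """Average the Monge gap of transport(x, c) over all conditions c.

    Averaging over conditions is an assumption made for this sketch.
    """
    gaps = [
        monge_gap(xs, [transport(x, c) for x in xs])
        for c, xs in samples_by_condition.items()
    ]
    return sum(gaps) / len(gaps)


# Toy "cells" in a 2-D feature space, keyed by a hypothetical dosage.
cells = {
    0.5: [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)],
    2.0: [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)],
}


def shift(x, dose):
    """Dose-dependent translation: an optimal map for squared Euclidean cost."""
    return tuple(xi + dose for xi in x)


def flip(x, dose):
    """Dose-scaled reflection through the origin: not an optimal map here."""
    return tuple(-dose * xi for xi in x)


shift_gap = conditional_monge_gap(cells, shift)  # numerically zero
flip_gap = conditional_monge_gap(cells, flip)    # strictly positive
print(shift_gap, flip_gap)
```

Used as a regularizer, a penalty of this kind pushes a conditional map toward being an optimal transport map for each condition, while a separate fitting term would match the mapped cells to the observed perturbed population.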