🤖 AI Summary
Existing methods lack a unified counterfactual framework for modeling how a cell's gene expression changes when its neighborhood in a tissue graph is perturbed. This work formalizes tissue graph counterfactuals as spatial interventions on nodes (neighbor expression) or edges (cell-cell connections) and introduces Cellina, a framework that uses supervised disentanglement to separate a cell's intrinsic state from its spatial context and then conditions counterfactual expression predictions on that context. On benchmarks comprising over 2.5 million cells from colorectal cancer and mouse brain tissues, Cellina outperforms both spatially informed and non-spatial models in perturbation prediction, disentanglement, and scalability. It also discovers biologically distinct cancer subdomains without supervision.
📝 Abstract
\textit{Tissue graph counterfactuals} ask how a cell's expression would change under altered spatial neighbor contexts. Such queries are central to predicting cell behavior in tissues, but they lack a unified definition: existing methods either target specific intervention types or treat cells as i.i.d. In this work, we first formalize \textit{tissue graph counterfactuals} as a class of spatial interventions that either rewire connections between cells (\textit{edge perturbation}) or modify the expression of their neighbors (\textit{node perturbation}). We then introduce \textit{Cellina}\footnote{https://cellina.readthedocs.io}, a framework that uses supervised disentanglement to separate a cell's intrinsic state from its spatial context, using the latter as a conditioning input for counterfactual predictions. Across benchmarks spanning over 2.5 million spatially resolved cells in colorectal cancer and mouse brain, \textit{Cellina} outperforms spatially informed and non-spatial competitors in tissue perturbations, disentanglement, and scalability. Additionally, we show that \textit{Cellina} reveals biologically distinct cancer subdomains in an unsupervised manner and enables targeted neighbor perturbation simulations.
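To make the two intervention types concrete, the following is a minimal sketch on a toy tissue graph. It is not the Cellina implementation or its API: the function names (`neighbor_mean`, `edge_perturbation`, `node_perturbation`), the dense adjacency representation, and the use of mean neighbor expression as a stand-in for "spatial context" are all assumptions made purely for illustration.

```python
import numpy as np

def neighbor_mean(adj, expr, cell):
    """Toy spatial context (an assumption): mean expression over a cell's neighbors."""
    nbrs = np.flatnonzero(adj[cell])
    if nbrs.size == 0:
        return np.zeros(expr.shape[1])
    return expr[nbrs].mean(axis=0)

def edge_perturbation(adj, cell, remove=(), add=()):
    """Edge perturbation: return a copy of adj with the target cell's edges rewired."""
    new = adj.copy()
    for j in remove:
        new[cell, j] = new[j, cell] = 0  # keep the graph undirected
    for j in add:
        new[cell, j] = new[j, cell] = 1
    return new

def node_perturbation(adj, expr, cell, gene, value):
    """Node perturbation: return a copy of expr with one gene overwritten in all neighbors of cell."""
    new = expr.copy()
    nbrs = np.flatnonzero(adj[cell])
    new[nbrs, gene] = value
    return new

# Hypothetical toy data: 4 cells, 2 genes, undirected adjacency matrix.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 0]])
expr = np.array([[1.0, 0.0],
                 [2.0, 4.0],
                 [4.0, 0.0],
                 [0.0, 8.0]])

# Factual context of cell 0 (neighbors are cells 1 and 2).
ctx_factual = neighbor_mean(adj, expr, 0)

# Counterfactual 1: rewire cell 0 so that it neighbors cells 2 and 3 instead.
adj_cf = edge_perturbation(adj, 0, remove=[1], add=[3])
ctx_edge = neighbor_mean(adj_cf, expr, 0)

# Counterfactual 2: silence gene 1 in the original neighbors of cell 0.
expr_cf = node_perturbation(adj, expr, 0, gene=1, value=0.0)
ctx_node = neighbor_mean(adj, expr_cf, 0)

print(ctx_factual, ctx_edge, ctx_node)
```

In this sketch, both interventions leave the target cell's own expression untouched and change only the context it is exposed to. A model in the spirit of the abstract would take such a perturbed context as its conditioning input and predict the cell's counterfactual expression; that prediction step is not shown here.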