Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images

📅 2025-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high cost and low throughput of spatial transcriptomics (ST) technologies, this paper introduces Stem, described as the first computational framework that uses conditional diffusion models to infer spatially resolved gene expression profiles from routine hematoxylin and eosin (H&E)-stained images. Stem explicitly models the stochasticity and tissue heterogeneity inherent in ST data via a multi-scale feature encoder, spatially aware noise scheduling, and a cross-platform adaptive training strategy, enabling zero-shot, biologically interpretable gene expression generation. Evaluated across multiple tissues and sequencing platforms, Stem achieves state-of-the-art performance: predicted expression profiles exhibit strong concordance with ground-truth ST data (Pearson *r* > 0.87), faithfully preserve gene-level variability, and enable genome-wide, spatially resolved expression inference directly from clinical H&E images.
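The two checks named in the summary, concordance with ground truth (Pearson *r*) and preservation of gene-level variability, could be computed roughly as follows. This is a minimal sketch, not the paper's evaluation code: the function names `per_gene_pearson` and `variance_ratio`, the spots-by-genes matrix layout, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def per_gene_pearson(pred, truth):
    """Pearson r per gene (column), computed across spots (rows).

    Assumed layout: both arrays have shape (num_spots, num_genes).
    Genes with zero variance in either array yield NaN.
    """
    pred_c = pred - pred.mean(axis=0)
    truth_c = truth - truth.mean(axis=0)
    num = (pred_c * truth_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (truth_c ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

def variance_ratio(pred, truth):
    """Per-gene ratio of predicted to measured variance.

    Values near 1 mean the predictions keep about as much gene-level
    variability as the measured data; values near 0 mean they are
    over-smoothed.
    """
    with np.errstate(invalid="ignore", divide="ignore"):
        return pred.var(axis=0) / truth.var(axis=0)

# Synthetic stand-in data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
truth = rng.gamma(shape=2.0, scale=1.0, size=(200, 5))
pred = truth + rng.normal(0.0, 0.5, size=truth.shape)

print("per-gene Pearson r:", np.round(per_gene_pearson(pred, truth), 3))
print("variance ratio:    ", np.round(variance_ratio(pred, truth), 3))
```

Reporting the variance ratio next to the correlation is a deliberate choice here: a predictor that regresses toward the mean can still score a high correlation while flattening the heterogeneity the summary says should be preserved.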

📝 Abstract
Spatial Transcriptomics (ST) allows a high-resolution measurement of RNA sequence abundance by systematically connecting cell morphology depicted in Hematoxylin and Eosin (H&E) stained histology images to spatially resolved gene expressions. ST is a time-consuming, expensive yet powerful experimental technique that provides new opportunities to understand cancer mechanisms at a fine-grained molecular level, which is critical for uncovering new approaches for disease diagnosis and treatments. Here, we present $\textbf{Stem}$ ($\textbf{S}$pa$\textbf{T}$ially resolved gene $\textbf{E}$xpression inference with diffusion $\textbf{M}$odel), a novel computational tool that leverages a conditional diffusion generative model to enable in silico gene expression inference from H&E stained images. Through better capturing the inherent stochasticity and heterogeneity in ST data, $\textbf{Stem}$ achieves state-of-the-art performance on spatial gene expression prediction and generates biologically meaningful gene profiles for new H&E stained images at test time. We evaluate the proposed algorithm on datasets with various tissue sources and sequencing platforms, where it demonstrates clear improvement over existing approaches. $\textbf{Stem}$ generates high-fidelity gene expression predictions that share similar gene variation levels as ground truth data, suggesting that our method preserves the underlying biological heterogeneity. Our proposed pipeline opens up the possibility of analyzing existing, easily accessible H&E stained histology images from a genomics point of view without physically performing gene expression profiling and empowers potential biological discovery from H&E stained histology images.
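
To make the idea of a conditional diffusion generative model concrete, here is a minimal sketch of generic DDPM-style ancestral sampling, in which a gene expression vector is denoised step by step while conditioned on an image embedding. It is not Stem's implementation: the linear noise schedule, the `toy_denoiser` stand-in for a trained noise-prediction network, the embedding size, and every name in the block are assumptions chosen for illustration.

```python
import numpy as np

def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule (an assumed choice) and its derived quantities."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_denoiser(x_t, t, cond, weights):
    """Hypothetical stand-in for a trained network eps_theta(x_t, t, cond).

    A real model would be learned from paired image patches and expression
    measurements; this placeholder only makes the sampling loop runnable.
    """
    return np.tanh(x_t - cond @ weights)

def sample_expression(cond, weights, num_genes, num_samples=4,
                      num_steps=50, seed=0):
    """Draw several expression vectors for one image embedding `cond`."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    # Start every sample from pure Gaussian noise.
    x = rng.standard_normal((num_samples, num_genes))
    for t in range(num_steps - 1, -1, -1):
        eps_hat = toy_denoiser(x, t, cond, weights)
        # Posterior mean of the reverse step given the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) \
            / np.sqrt(alphas[t])
        if t > 0:
            # Inject fresh noise on every step except the last.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x

# Hypothetical 8-dimensional image embedding and 5 genes.
rng = np.random.default_rng(42)
cond = rng.standard_normal(8)
weights = rng.standard_normal((8, 5))
samples = sample_expression(cond, weights, num_genes=5)
print(samples.shape)  # (4, 5): four draws for the same image patch
```

Because each call starts from fresh noise, repeated draws for the same patch give a distribution of expression profiles rather than a single point estimate, which is how a generative model can represent the stochasticity the abstract refers to.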
Problem

Research questions and friction points this paper is trying to address.

Spatial Transcriptomics
Cellular Imaging
Biological Heterogeneity Preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stem
Diffusion Generative Model
Gene Expression Prediction