Auxiliary Gene Learning: Spatial Gene Expression Estimation by Auxiliary Gene Selection

📅 2025-11-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Spatial transcriptomics (ST) data suffer from heavy observational noise, so existing methods train and evaluate models only on a small subset of highly variable genes. This discards the remaining genes, many of them lowly expressed, even though co-expression with the target genes means they may still carry useful signal. To address this, we propose Auxiliary Gene Learning (AGL), which models expression estimation for the excluded genes as auxiliary tasks trained jointly with the primary tasks. Because choosing a helpful subset of auxiliary genes is a large combinatorial problem, we design a differentiable top-k selection mechanism that incorporates gene co-expression priors and is optimized through bi-level optimization, so that primary task performance and auxiliary gene selection are optimized jointly. Experiments confirm the benefit of incorporating auxiliary genes and show that the method outperforms conventional auxiliary task learning approaches.
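As a rough illustration of what a co-expression prior could look like, the sketch below ranks candidate auxiliary genes by their mean absolute Pearson correlation with the target genes. This is an assumption made for illustration: the summary does not state which prior the paper uses, and the function names `pearson` and `rank_auxiliary_genes` and the toy gene names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant gene carries no co-expression signal
    return cov / math.sqrt(vx * vy)

def rank_auxiliary_genes(targets, candidates):
    """Rank candidates by mean |correlation| with the target genes.

    targets, candidates: dict mapping gene name -> per-spot expression list.
    Returns candidate names, most co-expressed first.
    (Hypothetical prior; the paper's actual ranking may differ.)
    """
    scores = {
        name: sum(abs(pearson(expr, t)) for t in targets.values()) / len(targets)
        for name, expr in candidates.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy data: 5 spots, one target gene, three candidate auxiliary genes.
targets = {"T1": [1.0, 2.0, 3.0, 4.0, 5.0]}
candidates = {
    "A_pos": [2.0, 4.1, 6.0, 8.2, 10.0],   # strongly co-expressed with T1
    "A_noise": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related
    "A_const": [0.0, 0.0, 0.0, 0.0, 0.0],  # never expressed
}
ranking = rank_auxiliary_genes(targets, candidates)
print(ranking)
```

A ranking like this only orders the candidates; deciding how many to keep, and which ones actually help the primary task, is the selection problem the method addresses.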

📝 Abstract

Spatial transcriptomics (ST) is a novel technology that enables the observation of gene expression at the resolution of individual spots within pathological tissues. ST quantifies the expression of tens of thousands of genes in a tissue section; however, heavy observational noise is often introduced during measurement. To ensure meaningful assessment, prior studies have restricted both training and evaluation to a small subset of highly variable genes, so genes outside this subset never contribute to training. However, since there are likely co-expression relationships between genes, low-expression genes may still contribute to the estimation of the evaluation targets. In this paper, we propose Auxiliary Gene Learning (AGL), which exploits the ignored genes by reformulating their expression estimation as auxiliary tasks and training them jointly with the primary tasks. To leverage auxiliary genes effectively, we must select a subset of them that positively influences the prediction of the target genes. This is a challenging optimization problem because of the vast number of possible combinations. To overcome this challenge, we propose Prior-Knowledge-Based Differentiable Top-$k$ Gene Selection via Bi-level Optimization (DkGSB), a method that ranks genes by leveraging prior knowledge and relaxes the combinatorial selection problem into a differentiable top-$k$ selection problem. The experiments confirm the effectiveness of incorporating auxiliary genes and show that the proposed method outperforms conventional auxiliary task learning approaches.
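To make the idea of relaxing top-$k$ selection concrete, here is a minimal sketch of one common relaxation: each gene gets a sigmoid gate around a shared threshold, and the threshold is found by bisection so that the gates sum to $k$. This is not necessarily the relaxation used in DkGSB, which the abstract does not specify; the function name `soft_top_k`, the `temperature` parameter, and the toy scores are assumptions.

```python
import math

def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_top_k(scores, k, temperature=0.1, iters=60):
    """Relaxed top-k mask: one value in (0, 1) per score, summing to ~k.

    Illustrative relaxation only (an assumption, not the paper's method).
    The threshold tau is located by bisection; the gate sum decreases
    monotonically as tau grows, so bisection is valid.
    """
    if not 0 < k < len(scores):
        raise ValueError("k must satisfy 0 < k < len(scores)")
    lo = min(scores) - 20.0 * temperature  # here almost every gate is ~1
    hi = max(scores) + 20.0 * temperature  # here almost every gate is ~0
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        total = sum(_sigmoid((s - tau) / temperature) for s in scores)
        if total > k:
            lo = tau  # too many genes switched on: raise the threshold
        else:
            hi = tau
    tau = 0.5 * (lo + hi)
    return [_sigmoid((s - tau) / temperature) for s in scores]

# Toy selection scores for four candidate auxiliary genes.
scores = [2.0, 0.5, 1.5, -1.0]
mask = soft_top_k(scores, k=2)
print([round(m, 3) for m in mask])
```

Lowering `temperature` pushes the mask toward a hard 0/1 top-$k$ choice, while raising it gives smoother gates and therefore more informative gradients with respect to the scores.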

Problem

Research questions and friction points this paper is trying to address.

Estimating spatial gene expression with noisy measurements

Selecting auxiliary genes to improve target gene prediction

Reformulating gene estimation as joint primary and auxiliary tasks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Auxiliary Gene Learning for spatial expression estimation

Differentiable top-k gene selection via bi-level optimization

Joint training of primary and auxiliary gene tasks
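The joint training idea listed above can be sketched as a loss that adds a mask-weighted auxiliary term to the primary term. This is a simplified illustration under stated assumptions: mean squared error as the per-gene loss, a normalized mask-weighted average over auxiliary genes, and the names `joint_loss`, `aux_mask`, and `aux_weight` are all hypothetical. In a bi-level setup, model parameters would typically be trained on a loss of this kind while the selection scores behind the mask are updated to reduce the primary-task loss on held-out data.

```python
def mse(pred, true):
    """Mean squared error over spots for one gene."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)

def joint_loss(pred, true, primary_genes, aux_mask, aux_weight=1.0):
    """Primary loss plus a mask-weighted auxiliary loss.

    pred, true: dict mapping gene name -> per-spot expression list.
    primary_genes: names of the target (evaluated) genes.
    aux_mask: dict mapping auxiliary gene name -> selection weight in [0, 1].
    (Illustrative form only; the paper's exact objective may differ.)
    """
    primary = sum(mse(pred[g], true[g]) for g in primary_genes) / len(primary_genes)
    mask_total = sum(aux_mask.values())
    if mask_total == 0:
        return primary  # no auxiliary gene selected
    aux = sum(w * mse(pred[g], true[g]) for g, w in aux_mask.items()) / mask_total
    return primary + aux_weight * aux

# Toy example: one target gene, two auxiliary genes, two spots.
pred = {"T1": [1.0, 2.0], "A1": [0.0, 0.0], "A2": [1.0, 1.0]}
true = {"T1": [1.0, 4.0], "A1": [1.0, 1.0], "A2": [1.0, 1.0]}

loss_selected = joint_loss(pred, true, ["T1"], {"A1": 1.0, "A2": 0.0})
loss_none = joint_loss(pred, true, ["T1"], {"A1": 0.0, "A2": 0.0})
print(loss_selected, loss_none)
```

Because the mask enters the loss as continuous weights, a relaxed selection mask can be plugged in directly, which is what lets gradients flow back to the selection scores.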
