Explainable AI in Genomics: Transcription Factor Binding Site Prediction with Mixture of Experts

📅 2025-07-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address weak generalization, poor interpretability, and degraded out-of-distribution (OOD) performance in transcription factor binding site (TFBS) prediction, this paper proposes an interpretable deep learning framework based on a Mixture of Experts (MoE). The framework integrates multiple pre-trained convolutional neural network (CNN) experts and adaptively fuses their outputs via learned weights, improving robustness on both in-distribution and OOD data. The paper also introduces ShiftSmooth, an attribution mapping method that reduces the noise sensitivity of conventional gradient-based approaches and yields more stable motif localization and interpretation. In the reported experiments, the MoE model outperforms single-model baselines across diverse TFBS datasets, with an average 4.2% AUC improvement under OOD conditions. ShiftSmooth also surpasses the baseline attribution method in motif detection accuracy and spatial localization consistency, supporting its use for studying gene regulatory mechanisms.
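The fusion step described above can be illustrated with a minimal sketch. It assumes each expert emits a binding probability for a sequence and that the learned weights take the form of gate logits passed through a softmax. The summary does not specify the actual gating design, so the names `softmax`, `moe_predict`, `expert_probs`, and `gate_logits` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def moe_predict(expert_probs, gate_logits):
    """Fuse per-expert binding probabilities into one prediction.

    expert_probs: one probability per expert (assumed CNN outputs).
    gate_logits:  one unnormalized score per expert (assumed learned).
    """
    if len(expert_probs) != len(gate_logits):
        raise ValueError("need exactly one gate logit per expert")
    weights = softmax(gate_logits)  # non-negative, sums to 1
    return sum(w * p for w, p in zip(weights, expert_probs))


# Equal gate logits reduce to a plain average of the experts.
fused = moe_predict([0.2, 0.8], [0.0, 0.0])
```

Because the weights form a convex combination, the fused score always lies between the lowest and highest expert output, and a strongly favored expert dominates the prediction.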

📝 Abstract

Transcription Factor Binding Site (TFBS) prediction is crucial for understanding gene regulation and various biological processes. This study introduces a novel Mixture of Experts (MoE) approach for TFBS prediction, integrating multiple pre-trained Convolutional Neural Network (CNN) models, each specializing in different TFBS patterns. We evaluate the performance of our MoE model against individual expert models on both in-distribution and out-of-distribution (OOD) datasets, using six randomly selected transcription factors (TFs) for OOD testing. Our results demonstrate that the MoE model achieves competitive or superior performance across diverse TF binding sites, particularly excelling in OOD scenarios. The Analysis of Variance (ANOVA) statistical test confirms the significance of these performance differences. Additionally, we introduce ShiftSmooth, a novel attribution mapping technique that provides more robust model interpretability by considering small shifts in input sequences. Through comprehensive explainability analysis, we show that ShiftSmooth offers superior attribution for motif discovery and localization compared to traditional Vanilla Gradient methods. Our work presents an efficient, generalizable, and interpretable solution for TFBS prediction, potentially enabling new discoveries in genome biology and advancing our understanding of transcriptional regulation.
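The abstract describes ShiftSmooth only as an attribution method that considers small shifts in the input sequence. One plausible reading, sketched below, is to compute an attribution map for several shifted copies of the sequence, realign each map to the original coordinates, and average. This is an assumption and not the authors' published algorithm; `shift_seq`, `shift_smooth`, `max_shift`, the `N` padding, and the toy attribution function are all hypothetical.

```python
def shift_seq(seq, k, pad="N"):
    """Shift seq right by k positions (left if k < 0), keeping its length."""
    n = len(seq)
    if k >= 0:
        return (pad * k + seq)[:n]
    return (seq + pad * (-k))[-n:]


def shift_smooth(seq, attribute, max_shift=2):
    """Average attribution maps over shifts in [-max_shift, max_shift].

    attribute: callable mapping a sequence to one score per position
               (for example, a vanilla-gradient saliency map).
    """
    n = len(seq)
    totals = [0.0] * n
    counts = [0] * n
    for k in range(-max_shift, max_shift + 1):
        attr = attribute(shift_seq(seq, k))
        for i in range(n):
            j = i + k  # where original position i lands after the shift
            if 0 <= j < n:
                totals[i] += attr[j]
                counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def toy_attribution(seq):
    """Stand-in attribution for illustration: scores every G as 1.0."""
    return [1.0 if base == "G" else 0.0 for base in seq]


smoothed = shift_smooth("ACGT", toy_attribution, max_shift=2)
```

With a real model, `attribute` would return gradient-based scores. Averaging realigned maps is meant to damp position-specific noise while keeping signal that follows the motif as it moves.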

Problem

Research questions and friction points this paper is trying to address.

Predicting Transcription Factor Binding Sites (TFBS) accurately and interpretably

Enhancing model generalizability for out-of-distribution TFBS scenarios

Improving motif discovery via robust attribution mapping techniques

Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixture of Experts integrates multiple CNN models

ShiftSmooth enhances interpretability with input shifts

ANOVA confirms that the performance differences between the MoE and individual expert models are statistically significant

🔎 Similar Papers

T-Explainer: A Model-Agnostic Explainability Framework Based on Gradients