Sparsity is All You Need: Rethinking Biological Pathway-Informed Approaches in Deep Learning

📅 2025-05-07
🤖 AI Summary
Problem: It remains unclear whether pathway-guided deep learning models improve performance because of biologically meaningful priors or merely because pathway structures induce beneficial sparsity. Method: We systematically constructed 15 pathway-based models and rigorously matched random sparse variants (controlling for sparsity level, number of connections, and parameter initialization) and evaluated them across multiple datasets on both predictive performance and interpretability. Contribution/Results: Performance gains are predominantly attributable to sparsity rather than biological relevance: in 3 of the 15 models the random sparse variant outperformed its pathway-based counterpart, and the randomized models still identified relevant disease biomarkers; pathway models showed no clear interpretability advantage. This work separates the role of biological priors from that of sparsity in pathway modeling and introduces a general randomized benchmarking framework, which can be applied in other domains to test whether performance gains are attributable to domain-specific priors.

📝 Abstract
Biologically-informed neural networks typically leverage pathway annotations to enhance performance in biomedical applications. We hypothesized that the benefits of pathway integration do not arise from its biological relevance, but rather from the sparsity it introduces. We conducted a comprehensive analysis of all relevant pathway-based neural network models for predictive tasks, critically evaluating each study's contributions. From this review, we curated a subset of methods for which the source code was publicly available. The comparison of the biologically informed state-of-the-art deep learning models and their randomized counterparts showed that models based on randomized information performed as well as biologically informed ones across different metrics and datasets. Notably, in 3 out of the 15 analyzed models, the randomized versions even outperformed their biologically informed counterparts. Moreover, pathway-informed models did not show any clear advantage in interpretability, as randomized models were still able to identify relevant disease biomarkers despite lacking explicit pathway information. Our findings suggest that pathway annotations may be too noisy or inadequately explored by current methods. Therefore, we propose a methodology that can be applied to different domains and can serve as a robust benchmark for systematically comparing novel pathway-informed models against their randomized counterparts. This approach enables researchers to rigorously determine whether observed performance improvements can be attributed to biological insights.
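To make the idea of a "randomized counterpart" concrete, the sketch below shuffles a binary gene-to-pathway connectivity mask while keeping its shape and number of connections fixed, so the sparsity level is unchanged. This is only an illustration under stated assumptions: the abstract does not specify the randomization procedure used in the paper, and the function names (`randomize_mask`, `sparsity`) and the toy 4-gene, 3-pathway mask are hypothetical.

```python
import random


def randomize_mask(mask, seed=0):
    """Return a random binary mask with the same shape and the same
    number of ones as `mask` (hypothetical helper, not from the paper)."""
    rng = random.Random(seed)
    n_rows, n_cols = len(mask), len(mask[0])
    n_edges = sum(sum(row) for row in mask)
    # Draw n_edges distinct flat positions uniformly at random.
    chosen = rng.sample(range(n_rows * n_cols), n_edges)
    out = [[0] * n_cols for _ in range(n_rows)]
    for idx in chosen:
        out[idx // n_cols][idx % n_cols] = 1
    return out


def sparsity(mask):
    """Fraction of entries that are zero."""
    total = len(mask) * len(mask[0])
    return 1.0 - sum(sum(row) for row in mask) / total


# Toy mask (made up for illustration): rows are genes, columns are pathways.
pathway_mask = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
random_mask = randomize_mask(pathway_mask, seed=42)
```

Uniform sampling of positions preserves only the total number of connections. It can leave a pathway node with no inputs and does not preserve per-gene or per-pathway degree, so a stricter control might shuffle within rows or columns instead.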
Problem

Research questions and friction points this paper is trying to address.

Tests whether the benefits of biological pathways in neural networks stem from sparsity rather than biological relevance
Compares pathway-informed models with randomized counterparts, which perform similarly
Proposes a benchmark for testing whether performance gains are biologically meaningful
Innovation

Methods, ideas, or system contributions that make the work stand out.

Randomized models match the performance of pathway-informed models
Sparsity, not biological relevance, is the key driver of performance
A benchmark compares pathway-informed models against their randomized counterparts
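The benchmarking idea can be sketched as a per-model comparison of scores from the pathway-informed variant and its randomized counterpart over repeated runs. The snippet below is a minimal sketch, not the paper's protocol: `compare_models`, the fixed `tolerance` threshold, the model names, and all scores are placeholders invented for illustration, and a real comparison would more plausibly rely on a statistical test over many seeds and datasets.

```python
from statistics import mean


def compare_models(results, tolerance=0.01):
    """Label each model by the difference in mean score between its
    randomized and pathway-informed variants (hypothetical helper).

    results: {model_name: {"pathway": [scores], "random": [scores]}}
    """
    verdicts = {}
    for name, runs in results.items():
        diff = mean(runs["random"]) - mean(runs["pathway"])
        if diff > tolerance:
            verdicts[name] = "random better"
        elif diff < -tolerance:
            verdicts[name] = "pathway better"
        else:
            verdicts[name] = "comparable"
    return verdicts


# Placeholder scores (e.g. AUC per run); these are NOT results from the paper.
toy_results = {
    "model_a": {"pathway": [0.80, 0.82, 0.81], "random": [0.81, 0.80, 0.82]},
    "model_b": {"pathway": [0.70, 0.72], "random": [0.75, 0.77]},
    "model_c": {"pathway": [0.90, 0.92], "random": [0.85, 0.87]},
}
verdicts = compare_models(toy_results)
for name, verdict in verdicts.items():
    print(name, verdict)
```

If a pathway-informed model is only "comparable" to its randomized counterpart under such a check, the gain over a dense baseline cannot be attributed to the biological annotations alone.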
Authors

Isabella Caranzano
Computational Biomedicine Unit, Department of Medical Sciences, University of Torino, Torino, Italy

C. Pancotti
Computational Biomedicine Unit, Department of Medical Sciences, University of Torino, Torino, Italy

Cesare Rollo
Post-doctoral Researcher, Department of Medical Sciences, University of Turin
Deep Learning, Survival Analysis, Federated Learning

Flavio Sartori
University of Turin
Deep Learning, AI Pathology

Pietro Lio
Department of Computer Science and Technology, University of Cambridge, Cambridge, UK

Piero Fariselli
Dept. of Medical Sciences, University of Torino, Italy
Bioinformatics, Computational Biophysics, Machine Learning, Nature Photography

T. Sanavia
Computational Biomedicine Unit, Department of Medical Sciences, University of Torino, Torino, Italy