Robust Gene Prioritization via Fast-mRMR Feature Selection in high-dimensional omics data

📅 2025-11-26
🤖 AI Summary
Problem: Gene prioritization on high-dimensional, sparse, and partially labeled omics data suffers from poor robustness. Method: We propose an end-to-end framework that couples Fast-mRMR feature selection with a downstream classifier. Fast-mRMR reduces dimensionality by maximizing feature-target relevance while minimizing inter-feature redundancy, which is intended to preserve biologically salient signals and to improve model interpretability and generalizability. The framework also supports fusing heterogeneous, multi-source feature sets into a parsimonious and robust ranking model. Results: On real-world omics datasets, including dietary restriction studies, the method is reported to significantly outperform existing methods in both ranking accuracy and stability. These results underscore the role of robust feature selection in enabling reliable gene functional inference.
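For reference, the relevance-versus-redundancy trade-off mentioned above is commonly written in the mutual-information "difference" form of mRMR shown below. The notation is an assumption made here for illustration and is not taken from the paper: F is the full feature set, S the set of features selected so far, y the gene label, and I(·;·) mutual information. The paper may use a different variant.

```latex
x^{*} \;=\; \arg\max_{x_j \in F \setminus S}
\left[\, I(x_j; y) \;-\; \frac{1}{|S|} \sum_{x_i \in S} I(x_j; x_i) \,\right]
```

At the first step S is empty, so the redundancy term is dropped and the most relevant feature is chosen. Each later step adds the feature with the best relevance-minus-average-redundancy score until the desired number of features is reached.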

📝 Abstract
Gene prioritization (identifying genes potentially associated with a biological process) is increasingly tackled with Artificial Intelligence. However, existing methods struggle with the high dimensionality and incomplete labelling of biomedical data. This work proposes a more robust and efficient pipeline that leverages Fast-mRMR feature selection to retain only relevant, non-redundant features for classifiers. This enables us to build simpler and more effective models, as well as to combine different biological feature sets. Experiments on Dietary Restriction datasets show significant improvements over existing methods, indicating that feature selection can be critical for reliable gene prioritization.

Problem

Research questions and friction points this paper is trying to address.

Addressing high dimensionality in biomedical data for gene prioritization
Overcoming incomplete labeling challenges in biological datasets
Improving robustness and efficiency of AI-driven gene identification

Innovation

Methods, ideas, or system contributions that make the work stand out.

Fast-mRMR feature selection for high-dimensional omics data
Retains relevant non-redundant features for classifiers
Enables simpler models and biological feature combination
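
The idea of keeping relevant, non-redundant features can be illustrated with a minimal greedy mRMR sketch. This is not the paper's implementation and not the Fast-mRMR library. It assumes discrete (already binned) features, uses the common relevance-minus-mean-redundancy score, and keeps a running redundancy sum so that pairwise terms are not recomputed between rounds. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(columns, target, k):
    """Greedy mRMR (difference form); returns k column indices in selection order.

    columns: list of discrete feature columns, each as long as target.
    """
    relevance = [mutual_information(col, target) for col in columns]
    # Running sum of I(candidate; selected feature), updated once per round.
    redundancy_sum = [0.0] * len(columns)
    selected = []
    remaining = set(range(len(columns)))
    while remaining and len(selected) < k:
        n_sel = len(selected)
        # Score = relevance minus mean redundancy with the selected set.
        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(
            sorted(remaining),
            key=lambda j: relevance[j]
            - (redundancy_sum[j] / n_sel if n_sel else 0.0),
        )
        selected.append(best)
        remaining.remove(best)
        for j in remaining:
            redundancy_sum[j] += mutual_information(columns[j], columns[best])
    return selected


# Hypothetical toy data: 8 genes, binary label, 3 binary features.
target = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 1, 1, 1, 1, 1]  # strongly relevant
f1 = list(f0)                  # exact duplicate of f0 (fully redundant)
f2 = [0, 1, 0, 0, 0, 1, 1, 1]  # weakly relevant, little overlap with f0

chosen = mrmr_select([f0, f1, f2], target, k=2)
print(chosen)  # [0, 2]
```

Ranking by relevance alone would pick features 0 and 1 here, which carry identical information. The redundancy penalty makes the second pick the weaker but complementary feature 2. In a pipeline like the one described, the selected columns would then be passed to a classifier whose scores rank the candidate genes.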