A Hybrid Computational Intelligence Framework for scRNA-seq Imputation: Integrating scRecover and Random Forests

📅 2025-11-20
🤖 AI Summary
Dropout events in single-cell RNA sequencing (scRNA-seq) data severely distort genuine biological signals. To address this, the authors propose a two-stage hybrid computational framework. First, scRecover identifies dropout events through mechanistic modeling. Second, missForest, a nonparametric, iterative, nonlinear imputation method built on random forests, imputes only the values flagged as dropouts. The approach couples mechanism-driven dropout detection with data-driven, nonparametric imputation, aiming to preserve biological fidelity while keeping the model interpretable and the algorithm transparent. Evaluation across multiple public and simulated scRNA-seq datasets indicates that the method achieves imputation accuracy comparable or superior to existing tools in most cases. Its computational cost is moderate, which makes it suited to medium-scale scRNA-seq datasets rather than very large ones. Overall, the framework trades some scalability for accuracy and transparency.

📝 Abstract
Single-cell RNA sequencing (scRNA-seq) enables transcriptomic profiling at cellular resolution but suffers from pervasive dropout events that obscure biological signals. We present SCR-MF, a modular two-stage workflow that combines principled dropout detection using scRecover with robust non-parametric imputation via missForest. Across public and simulated datasets, SCR-MF achieves robust and interpretable performance comparable to or exceeding existing imputation methods in most cases, while preserving biological fidelity and transparency. Runtime analysis demonstrates that SCR-MF provides a competitive balance between accuracy and computational efficiency, making it suitable for mid-scale single-cell datasets.
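The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: scRecover and missForest are R packages, so the detection stage is replaced here by a randomly generated boolean mask (a pure assumption for illustration), and the imputation stage is a simplified missForest-style loop built on scikit-learn's `RandomForestRegressor`. The function name `impute_flagged_dropouts`, its parameters, and the fixed iteration count (instead of missForest's stopping criterion) are all hypothetical choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def impute_flagged_dropouts(log_expr, dropout_mask, n_iter=3, n_trees=50, seed=0):
    """Simplified missForest-style imputation restricted to flagged entries.

    log_expr     : (cells x genes) array, at least two genes
    dropout_mask : boolean array of the same shape; True marks a zero that the
                   detection stage (scRecover in the paper) judged a dropout
    """
    X = np.asarray(log_expr, dtype=float).copy()
    mask = np.asarray(dropout_mask, dtype=bool)

    # Genes that have both flagged and unflagged cells can be modelled.
    usable = [j for j in range(X.shape[1])
              if mask[:, j].any() and (~mask[:, j]).any()]

    # Initial guess: per-gene mean of the unflagged values.
    for j in usable:
        X[mask[:, j], j] = X[~mask[:, j], j].mean()

    # Visit genes from fewest to most flagged entries, as missForest does.
    order = sorted(usable, key=lambda j: mask[:, j].sum())

    for _ in range(n_iter):  # fixed count: an assumption, not missForest's rule
        for j in order:
            miss = mask[:, j]
            others = np.delete(X, j, axis=1)  # all other genes as predictors
            forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
            forest.fit(others[~miss], X[~miss, j])
            X[miss, j] = forest.predict(others[miss])
    return X


# Toy data: 60 cells x 5 genes, with gene 1 correlated with gene 0.
rng = np.random.default_rng(0)
true_expr = rng.gamma(shape=2.0, scale=1.0, size=(60, 5))
true_expr[:, 1] = 0.8 * true_expr[:, 0] + 0.2

# Stand-in for stage 1: a random mask, NOT real scRecover output.
dropout_mask = rng.random(true_expr.shape) < 0.2
observed = np.where(dropout_mask, 0.0, true_expr)

imputed = impute_flagged_dropouts(observed, dropout_mask)
```

The point of the design is that only entries flagged by the detection stage are overwritten. Any zero that is not flagged is treated as a genuine absence of expression and passes through unchanged.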
Problem

Research questions and friction points this paper is trying to address.

Addresses pervasive dropout events in scRNA-seq data
Couples dropout detection with non-parametric imputation so that only likely dropouts are imputed
Balances computational efficiency with biological fidelity preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid framework combines scRecover with random-forest imputation (missForest)
Two-stage workflow first detects dropouts, then imputes only the flagged values
Balances accuracy with computational efficiency for mid-scale datasets
Authors
Ali Anaissi
University of Technology Sydney, Australia
Deshao Liu
Asia Pacific International College (APIC), Parramatta, NSW, Australia
Yuanzhe Jia
University of Sydney, Australia
Weidong Huang
Beijing Institute for General Artificial Intelligence
Humanoid, World Models, Reinforcement Learning
Widad Alyassine
University of Sydney, Australia
Junaid Akram
University of Sydney, Australia