🤖 AI Summary
Dropout events in single-cell RNA sequencing (scRNA-seq) data distort genuine biological signals. To address this, we propose SCR-MF, a two-stage hybrid computational framework. First, scRecover identifies dropout events using mechanistic modeling. Second, missForest, a nonparametric, iterative imputation method built on random forests that can capture nonlinear relationships, imputes the values identified as dropouts. Combining mechanism-driven dropout detection with data-driven, nonparametric imputation preserves biological fidelity while keeping the workflow interpretable and transparent. Evaluation across multiple public and simulated scRNA-seq datasets shows imputation accuracy comparable or superior to that of existing imputation methods in most cases. Computational cost is moderate, which makes the method well suited to medium-scale scRNA-seq datasets. Overall, the framework balances accuracy against computational efficiency.
📝 Abstract
Single-cell RNA sequencing (scRNA-seq) enables transcriptomic profiling at cellular resolution but suffers from pervasive dropout events that obscure biological signals. We present SCR-MF, a modular two-stage workflow that combines principled dropout detection using scRecover with robust non-parametric imputation via missForest. Across public and simulated datasets, SCR-MF delivers robust, interpretable performance that is comparable to or better than that of existing imputation methods in most cases, while preserving biological fidelity and transparency. Runtime analysis shows that SCR-MF offers a competitive balance between accuracy and computational efficiency, making it suitable for mid-scale single-cell datasets.
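The two-stage structure can be illustrated with a minimal pure-Python sketch. It is not the SCR-MF implementation: scRecover and missForest are R packages, and both stages here are simplified stand-ins chosen so the example runs without dependencies. The hypothetical `flag_dropouts` rule (flag a zero when the gene is detected in at least `min_detect_frac` of cells) stands in for scRecover's model-based detection, and `impute_knn` (average over the k most similar cells) stands in for missForest's random-forest imputation. The toy matrix, function names, and thresholds are all assumptions made for illustration.

```python
import math

def flag_dropouts(counts, min_detect_frac=0.5):
    """Stage 1 stand-in (hypothetical rule, NOT scRecover's model).

    A zero is flagged as a dropout (set to None) when its gene is detected
    in at least `min_detect_frac` of cells; other zeros are kept as true zeros.
    `counts` is a list of cells, each a list of per-gene counts.
    """
    n_cells, n_genes = len(counts), len(counts[0])
    masked = [row[:] for row in counts]
    for g in range(n_genes):
        detected = sum(1 for c in range(n_cells) if counts[c][g] > 0)
        if detected / n_cells >= min_detect_frac:
            for c in range(n_cells):
                if counts[c][g] == 0:
                    masked[c][g] = None
    return masked

def _distance(a, b):
    """Root-mean-square difference over genes observed in both cells."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def impute_knn(masked, k=2):
    """Stage 2 stand-in (k-nearest-cell mean, NOT missForest).

    Only entries flagged as None are filled; observed values, including
    zeros kept by stage 1, are left untouched.
    """
    filled = [row[:] for row in masked]
    for i, row in enumerate(masked):
        for g, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other cells with an observed value for gene g.
            donors = sorted(
                (_distance(row, other), other[g])
                for j, other in enumerate(masked)
                if j != i and other[g] is not None
            )[:k]
            filled[i][g] = sum(v for _, v in donors) / len(donors) if donors else 0.0
    return filled

# Toy matrix: rows are cells, columns are genes (made-up values).
counts = [
    [5, 0, 3],
    [4, 0, 0],
    [6, 1, 2],
    [0, 0, 4],
]
masked = flag_dropouts(counts)   # stage 1: decide which zeros are dropouts
filled = impute_knn(masked)      # stage 2: impute only the flagged entries
for original, imputed in zip(counts, filled):
    print(original, "->", imputed)
```

The point of the split is that zeros not flagged in stage 1 (the second gene in this toy matrix) pass through unchanged, so the imputer cannot overwrite zeros that are likely to be real.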