A Quasi-Regression Method for the Mediation Analysis of Zero-Inflated Single-Cell Data

📅 2026-04-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing causal mediation analysis methods struggle with the zero-inflated nature of single-cell data and often rely on strong distributional assumptions. This work proposes QuasiMed, a mediation framework that proceeds in three steps: first, candidate mediators are screened by combining penalized regression with marginal models; second, indirect effects are estimated through both the average expression level and the proportion of expressing cells; and third, hypothesis tests are carried out with multiplicity control to limit false positives. Because QuasiMed specifies only the mean functions of the mediation models through a quasi-regression framework, it avoids strict distributional assumptions, which makes it well suited to zero-inflated single-cell data. In real-data-inspired simulations, QuasiMed showed high statistical power, false discovery rate control, and computational efficiency. The method was also applied to ROSMAP single-cell data to illustrate its potential to identify mediating causal pathways.
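To make the two mediator summaries mentioned above concrete, here is a minimal Python sketch. It assumes each subject's cells for a single gene arrive as a list of counts, and that "average expression" is taken over all of a subject's cells, zeros included; the summary does not state either detail. The function name `summarize_gene` and the toy data are hypothetical and are not taken from the QuasiMed R package.

```python
from statistics import mean


def summarize_gene(counts_by_subject):
    """Return {subject: (average expression, proportion of expressing cells)}.

    counts_by_subject maps a subject ID to that subject's per-cell counts
    for one gene (a hypothetical input layout chosen for illustration).
    """
    summaries = {}
    for subject, counts in counts_by_subject.items():
        if not counts:
            raise ValueError(f"subject {subject!r} has no cells")
        # Assumption: the average is over all cells, including zero counts.
        avg_expr = mean(counts)
        # A cell "expresses" the gene if its count is strictly positive.
        prop_expr = sum(1 for c in counts if c > 0) / len(counts)
        summaries[subject] = (avg_expr, prop_expr)
    return summaries


# Toy data: two subjects, four cells each, mostly zeros (zero inflation).
toy = {"s1": [0, 0, 3, 1], "s2": [0, 0, 0, 2]}
result = summarize_gene(toy)
```

The two numbers carry different information: subjects `s1` and `s2` could share a similar average while differing in how many cells express the gene at all, which is why both summaries are tracked.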
📝 Abstract
Recent advances in single-cell technologies have improved our understanding of gene regulation and cellular heterogeneity at single-cell resolution. Single-cell data contain both gene expression levels and the proportion of expressing cells, which makes them structurally different from bulk data. Currently, methodological work on causal mediation analysis for single-cell data remains limited and often requires specific distributional assumptions. To address this challenge, we present QuasiMed, a mediation framework specialized for single-cell data. Our proposed method comprises three steps: (i) screening mediator candidates through penalized regression and marginal models (similar to sure independence screening), (ii) estimating indirect effects through the average expression and the proportion of expressing cells, and (iii) hypothesis testing with multiplicity control. The key benefit of QuasiMed is that it specifies only the mean functions of the mediation models through a quasi-regression framework, thereby relaxing strict distributional assumptions. The method's performance was evaluated through real-data-inspired simulations, which demonstrated high power, false discovery rate control, and computational efficiency. Lastly, we applied QuasiMed to ROSMAP single-cell data to illustrate its potential to identify mediating causal pathways. The R package is freely available on GitHub at https://github.com/sjahnn/QuasiMed.
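Step (iii) refers to multiplicity control without naming a procedure here. As an illustration only, the sketch below implements the standard Benjamini-Hochberg step-up procedure, a common way to control the false discovery rate; whether QuasiMed uses this exact procedure is an assumption. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Standard Benjamini-Hochberg step-up rule: sort the m p-values, find the
    largest rank k with p_(k) <= (k / m) * alpha, and reject the k smallest.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank  # keep the largest rank that passes its threshold
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for four candidate mediators.
p = [0.001, 0.04, 0.03, 0.20]
decisions = benjamini_hochberg(p, alpha=0.05)
```

With these illustrative inputs only the first candidate is rejected, because 0.03 and 0.04 exceed their rank-specific thresholds of 0.025 and 0.0375.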
Problem

Research questions and friction points this paper is trying to address.

mediation analysis
single-cell data
zero-inflated
causal inference
distributional assumptions
Innovation

Methods, ideas, or system contributions that make the work stand out.

quasi-regression
mediation analysis
single-cell data
zero-inflated
causal inference
Seungjun Ahn
Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, NY, U.S.A.
Donald Porchia
Department of Biostatistics, University of Florida, Gainesville, FL, U.S.A.
Panos Roussos
Icahn School of Medicine at Mount Sinai
Disease Neurogenomics
Maaike van Gerwen
Department of Otolaryngology - Head and Neck Surgery, Icahn School of Medicine at Mount Sinai, New York, NY, U.S.A.
Qing Lu
Associate Professor, Division of Biostatistics, Department of Epidemiology and Biostatistics
statistical genetics, bioinformatics, genetic epidemiology
Zhigang Li
Department of Biostatistics, University of Florida, Gainesville, FL, U.S.A.