🤖 AI Summary
This paper studies hypothesis testing in single-cell CRISPR screens when perturbation effectiveness varies across cells and only a noisy proxy of each cell's latent binary perturbation status is available. We compute the Pitman relative efficiency of the proxy-weighted test relative to the unweighted test, which gives an interpretable measure of proxy quality and determines when weighting gains power. We then propose two adaptive strategies: a maximum-likelihood-based approach that adapts the proxies to the data, and an estimator of the Pitman efficiency based on a positive control outcome variable, which supports a data-driven choice of whether to use the proxies at all. Numerical simulations support the Pitman efficiency as the key quantity governing whether the weighted test gains power, and show that both adaptive tests can improve on the unweighted and weighted approaches across a range of proxy qualities.
📝 Abstract
We investigate inference in a latent binary variable model where a noisy proxy of the latent variable is available, motivated by the variable perturbation effectiveness problem in single-cell CRISPR screens. The baseline approach is to ignore the perturbation effectiveness problem, while a recent proposal employs a weighted average based on the proxies. Our main goals are to determine how accurate the proxies must be in order for a weighted test to gain power over the unweighted baseline, and to develop tests that are powerful regardless of the accuracy of the proxies. To address the first goal, we compute the Pitman relative efficiency of the weighted test relative to the unweighted test, yielding an interpretable quantification of proxy quality that drives the power of the weighted test. To address the second goal, we propose two strategies. First, we propose a maximum-likelihood-based approach that adapts the proxies to the data. Second, we propose an estimator of the Pitman efficiency if a "positive control outcome variable" is available (as is often the case in single-cell CRISPR screens), which facilitates an adaptive choice of whether to use the proxies at all. Our numerical simulations support the Pitman efficiency as the key quantity for determining whether the weighted test gains power over the baseline, and demonstrate that the two proposed adaptive tests can improve on both existing approaches across a range of proxy qualities.
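To make the weighted-versus-unweighted comparison concrete, the following is a minimal sketch under a toy Gaussian model that is an assumption of this example, not the model used in the paper. Each perturbed cell has a latent indicator `z` (perturbation effective or not), an outcome `y = beta * z + noise` with standard normal noise, and a proxy `w` in [0, 1]. In this toy setting the squared ratio of the two test statistics' drifts is `mean(w*z)^2 / (mean(w^2) * mean(z)^2)`, which plays the role of a relative efficiency. All function names, the proxy noise model, and the parameter values are hypothetical.

```python
import numpy as np


def weighted_z_stat(y, w):
    """Weighted statistic sum(w*y)/sqrt(sum(w^2)); N(0,1) under the null if y ~ N(0,1)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y) / np.sqrt(np.sum(w ** 2)))


def toy_relative_efficiency(w, z):
    """Squared drift ratio of the weighted vs. unweighted statistic (toy model only)."""
    w = np.asarray(w, dtype=float)
    z = np.asarray(z, dtype=float)
    return float(np.mean(w * z) ** 2 / (np.mean(w ** 2) * np.mean(z) ** 2))


def simulated_power(w, z, beta, rng, reps=500, crit=1.645):
    """Monte Carlo power of the one-sided level-0.05 test using weights w."""
    w = np.asarray(w, dtype=float)
    # Each row is one simulated data set of outcomes for the same cells.
    y = beta * z + rng.normal(size=(reps, z.size))
    stats = (y @ w) / np.sqrt(np.sum(w ** 2))
    return float(np.mean(stats > crit))


rng = np.random.default_rng(0)
n, pi, beta = 2000, 0.3, 0.15          # hypothetical sample size, effectiveness rate, effect
z = rng.binomial(1, pi, size=n)        # latent perturbation effectiveness
ones = np.ones(n)                      # unweighted baseline = constant weights

for noise_sd in (0.1, 0.5, 2.0):
    # Hypothetical proxy: latent indicator plus Gaussian noise, clipped to [0, 1].
    w = np.clip(z + rng.normal(0.0, noise_sd, size=n), 0.0, 1.0)
    print(
        f"noise_sd={noise_sd}: efficiency={toy_relative_efficiency(w, z):.2f}, "
        f"power unweighted={simulated_power(ones, z, beta, rng):.2f}, "
        f"power weighted={simulated_power(w, z, beta, rng):.2f}"
    )
```

In this sketch, constant weights give an efficiency of exactly 1 and a perfect proxy (`w = z`) gives `1 / mean(z)`, while proxies that carry little information about `z` can push the ratio below 1. That is the toy analogue of the abstract's question of how accurate the proxies must be before weighting helps.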