🤖 AI Summary
This study addresses the challenge of enhancing statistical power while controlling the false discovery rate (FDR) in multiple hypothesis testing by leveraging auxiliary variables. The authors propose two approaches that integrate auxiliary information into the FDR control procedure, chosen according to the auxiliary variable's role: one stops testing a primary variable when predefined conditions are not met, and the other adjusts the evaluation criteria based on the auxiliary variable. A copula-based model captures the joint dependence structure between primary and auxiliary variables by building their joint distribution from the marginal distributions. In numerical experiments, the methods control the FDR and achieve greater power than existing approaches based solely on the primary variable. When applied to Set4Δ mutant data, they select more genes than conventional approaches that rely only on primary test statistics, highlighting their potential advantages in genomic discovery.
📝 Abstract
In this paper, we present novel methodologies that incorporate auxiliary variables into multiple hypothesis testing on the primary variable of interest while controlling the false discovery rate (FDR). When conducting multiple tests on the primary variable, researchers can use auxiliary variables to set preconditions for the significance of the primary variables, thereby enhancing test efficacy. Depending on the auxiliary variable's role, we propose two approaches: one terminates testing of the primary variable if it does not meet predefined conditions, and the other adjusts the evaluation criteria based on the auxiliary variable. Using the copula method, we characterise the dependence between the auxiliary and primary variables by deriving their joint distribution from the individual marginal distributions. Our numerical studies demonstrate that the proposed methodologies control the FDR and yield greater statistical power than existing methods based solely on the primary variable. As an illustrative example, we apply our methods to the Set4$\Delta$ mutant dataset. The results highlight the differences between our methodologies and traditional approaches, emphasising the potential advantage of introducing the auxiliary variable to select more genes.
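To make the first approach concrete, the following is a minimal sketch of the general idea only, not the authors' procedure. It assumes the auxiliary information arrives as one p-value per hypothesis, uses a hypothetical cutoff `aux_cutoff` as the precondition, and substitutes the standard Benjamini-Hochberg step-up rule for the paper's own thresholds. The function names are illustrative, and no copula modelling is involved. Filtering of this kind is generally only safe for FDR control when the filter is suitably independent of the primary statistic under the null, which is the kind of dependence the copula model in the paper is meant to handle.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Standard BH step-up procedure; returns one rejection flag per p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose sorted p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def filter_then_test(primary_p, aux_p, aux_cutoff=0.5, alpha=0.05):
    """Hypothetical illustration: drop hypotheses whose auxiliary p-value
    exceeds aux_cutoff (assumed precondition), then run BH on the rest."""
    kept = [i for i, a in enumerate(aux_p) if a <= aux_cutoff]
    kept_reject = benjamini_hochberg([primary_p[i] for i in kept], alpha)
    reject = [False] * len(primary_p)
    for i, r in zip(kept, kept_reject):
        reject[i] = r
    return reject


# Toy, made-up p-values purely for illustration.
primary_p = [0.001, 0.008, 0.039, 0.041, 0.6]
aux_p = [0.1, 0.9, 0.2, 0.3, 0.95]

baseline = benjamini_hochberg(primary_p)
filtered = filter_then_test(primary_p, aux_p)
```

On these toy inputs the filter shrinks the number of tested hypotheses from five to three, which relaxes the per-rank thresholds for the survivors; the trade-off is that any hypothesis failing the precondition is never tested, however small its primary p-value.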