🤖 AI Summary
This paper addresses the problem of testing multivariate conditional independence between a response variable Y and high-dimensional covariates X given confounders Z. We propose the Multivariate Sufficient-statistic-based Conditional Randomization Test (MS-CRT), which constructs conditional exchangeability by leveraging sufficient statistics of P(X | Z), bypassing explicit modeling of Y and accommodating arbitrary test statistics. Key contributions include: (i) the first integration of sufficient statistics into the CRT framework; (ii) overcoming the curse of dimensionality in P(X | Z) by reducing dependence on its parametric dimension; (iii) enabling group selection and false discovery rate (FDR) control; and (iv) establishing minimax-optimal detection rates under multivariate normality. Extensive simulations and real-data analyses demonstrate that MS-CRT substantially improves joint signal detection power, particularly when individual components of X exert weak effects on Y, and consistently outperforms state-of-the-art methods in graphical model learning tasks.
📝 Abstract
We consider testing multivariate conditional independence between a response Y and a covariate vector X given additional variables Z. We introduce the Multivariate Sufficient Statistic Conditional Randomization Test (MS-CRT), which generates exchangeable copies of X by conditioning on sufficient statistics of P(X|Z). MS-CRT requires no modelling assumption on Y and accommodates any test statistic, including those derived from complex predictive models. It relaxes the assumptions of standard conditional randomization tests by allowing the number of unknown parameters in P(X|Z) to exceed the sample size. MS-CRT avoids multiplicity corrections and effectively detects joint signals, even when individual components of X have only weak effects on Y. Our method extends to group selection with false discovery rate control. We develop efficient implementations for two important cases, in which P(X,Z) is either multivariate normal or belongs to a graphical model. For normal models, we establish minimax rate optimality. For graphical models, we demonstrate the superior performance of our method over existing methods through comprehensive simulations and real-data examples.
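To make the idea of generating copies of X by conditioning on sufficient statistics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a working model that the abstract does not spell out: the rows of X are independent with X_i | Z_i ~ N(BᵀZ_i, Σ), with B and Σ unknown, so that (ZᵀX, XᵀX) is sufficient. Under that assumption, a copy is obtained by keeping the projection of X onto the column space of Z and redrawing the residual uniformly among matrices with the same Gram matrix. The function names `sample_copy`, `crt_pvalue`, and `joint_stat` are hypothetical, and the sketch requires n minus the number of columns of Z to be at least the number of columns of X.

```python
import numpy as np


def sample_copy(X, Z, rng):
    """Draw one copy of X that has the same Z^T X and X^T X as X.

    Assumed working model (not taken from the source): rows of X given Z
    are Gaussian with mean linear in Z and a common unknown covariance.
    """
    n, p = X.shape
    Q, _ = np.linalg.qr(Z)                 # orthonormal basis of col(Z)
    fitted = Q @ (Q.T @ X)                 # projection of X onto col(Z)
    resid = X - fitted
    # Upper-triangular C with C.T @ C equal to the residual Gram matrix.
    C = np.linalg.cholesky(resid.T @ resid).T
    E = rng.standard_normal((n, p))
    E -= Q @ (Q.T @ E)                     # remove the col(Z) component
    W, R = np.linalg.qr(E)                 # orthonormal columns, orthogonal to Z
    W = W * np.sign(np.diag(R))            # sign fix so W is uniformly distributed
    return fitted + W @ C


def joint_stat(X, Y, Z):
    """Example statistic: squared covariances of Y with the columns of X, summed."""
    return float(np.sum((X.T @ (Y - Y.mean())) ** 2))


def crt_pvalue(X, Y, Z, stat, n_copies=199, seed=0):
    """Randomization p-value comparing the observed statistic with copies."""
    rng = np.random.default_rng(seed)
    t_obs = stat(X, Y, Z)
    t_null = [stat(sample_copy(X, Z, rng), Y, Z) for _ in range(n_copies)]
    exceed = sum(t >= t_obs for t in t_null)
    return (1 + exceed) / (n_copies + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 100
    Z = np.column_stack([np.ones(n), rng.standard_normal(n)])
    X = Z @ rng.standard_normal((2, 4)) + rng.standard_normal((n, 4))
    # Several weak effects that act jointly on Y.
    Y = X @ np.full(4, 0.2) + Z[:, 1] + rng.standard_normal(n)
    print(crt_pvalue(X, Y, Z, joint_stat))
```

Because the statistic is passed in as a function, any measure of association (including one built from a fitted predictive model) can be substituted for `joint_stat`. A single p-value is returned for the whole vector X, which is how a joint test of this kind avoids a per-component multiplicity correction.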