🤖 AI Summary
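Sample size calculation for randomized controlled trials (RCTs) with skewed continuous outcomes remains challenging under both individual and cluster randomization: t-test-based calculations may require a data transformation that is hard to choose before data collection, and existing Wilcoxon rank-sum calculations assume no ties and perform poorly with heterogeneous cluster sizes.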
Method: We propose a unified approach that extends Whitehead's sample size calculation for independent ordinal outcomes to continuous outcomes and adjusts for clustering through a design effect that incorporates the rank intraclass correlation coefficient, so the same calculation covers both individual and cluster randomization. The method builds on ordinal cumulative probability models, makes minimal assumptions about the outcome distribution, and requires no data transformation.
Contribution/Results: Simulations evaluate the approach's performance, and its application is illustrated in three RCTs: an individual RCT with skewed continuous outcomes, a cluster RCT with skewed continuous outcomes, and a non-inferiority cluster RCT with an irregularly distributed count outcome. The result is a simple, unifying strategy for sample size determination that makes minimal assumptions about the distribution of the still-to-be-collected outcome.
📝 Abstract
Sample size calculations can be challenging with skewed continuous outcomes in randomized controlled trials (RCTs). Standard t-test-based calculations may require data transformation, which may be difficult before data collection. Calculations based on individual and clustered Wilcoxon rank-sum tests have been proposed as alternatives, but these calculations assume no ties in continuous outcomes, and clustered Wilcoxon rank-sum tests perform poorly with heterogeneous cluster sizes. Recent work has shown that continuous outcomes can be analyzed in a robust manner using ordinal cumulative probability models. Analogously, sample size calculations for ordinal outcomes can be applied as a robust design strategy for continuous outcomes. We show that Whitehead's sample size calculations for independent ordinal outcomes can be easily extended to continuous outcomes. We extend these calculations to cluster RCTs using a design effect incorporating the rank intraclass correlation coefficient. Therefore, we provide a unifying and simple approach for designing individual and cluster RCTs that makes minimal assumptions on the distribution of the still-to-be-collected outcome. We conduct simulations to evaluate our approach's performance and illustrate its application in multiple RCTs: an individual RCT with skewed continuous outcomes, a cluster RCT with skewed continuous outcomes, and a non-inferiority cluster RCT with an irregularly distributed count outcome.
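The calculation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. It uses the commonly cited form of Whitehead's formula for the total sample size under a proportional odds model, treats a continuous outcome as the limiting case in which every value is its own category (so the tie term vanishes), and inflates the result by the standard equal-cluster-size design effect 1 + (m - 1)ρ with ρ taken to be the rank intraclass correlation. The function names, the example odds ratio of 1.5, the cluster size of 20, and the rank ICC of 0.05 are all hypothetical.

```python
import math
from statistics import NormalDist


def whitehead_total_n(log_odds_ratio, alpha=0.05, power=0.80,
                      allocation_ratio=1.0, mean_category_probs=None):
    """Total sample size (both arms) from Whitehead's ordinal formula.

    mean_category_probs: expected category proportions averaged over the
    two arms. Leaving it as None drops the tie correction, which is the
    assumption made here for a continuous outcome without ties.
    """
    z = NormalDist().inv_cdf
    z_sum = z(1 - alpha / 2) + z(power)  # two-sided test
    tie_term = 0.0
    if mean_category_probs is not None:
        tie_term = sum(p ** 3 for p in mean_category_probs)
    a = allocation_ratio
    return (3 * (a + 1) ** 2 * z_sum ** 2
            / (a * log_odds_ratio ** 2 * (1 - tie_term)))


def design_effect(cluster_size, rank_icc):
    """Standard design effect, assuming equal cluster sizes."""
    return 1 + (cluster_size - 1) * rank_icc


# Hypothetical inputs: odds ratio 1.5, clusters of 20, rank ICC 0.05.
n_individual = whitehead_total_n(math.log(1.5))
n_cluster = n_individual * design_effect(cluster_size=20, rank_icc=0.05)

print("individual RCT, total N:", math.ceil(n_individual))
print("cluster RCT, total N:", math.ceil(n_cluster))
```

Passing `mean_category_probs` restores the tie correction for a genuinely ordinal or heavily tied outcome, which increases the required sample size. Unequal cluster sizes would call for a different design effect than the one assumed in this sketch.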