🤖 AI Summary
Data-driven subgroup identification in clinical trials is a post-selection inference problem: it can inflate Type I error rates and bias effect estimates, which hinders the implementation of precision medicine. To identify both *safe subgroups* (with low adverse event risk) and *efficacious subgroups* (with high treatment effect), this paper proposes two controlled subgroup selection methods for regression problems: one based on generalized linear models and another based on isotonic regression. Both methods are designed to control the Type I error rate of the data-driven selection. A simulation study evaluates the strengths and weaknesses of each method and quantifies how sensitive the Type I error rate control is to modelling assumptions. Together, the methods offer a statistically principled and practically applicable toolkit for clinical subgroup analysis.
📝 Abstract
Subgroup selection in clinical trials is essential for identifying patient groups that react differently to a treatment, thereby enabling personalised medicine. In particular, subgroup selection can identify patient groups that respond particularly well to a treatment or that encounter adverse events more often. However, this is a post-selection inference problem: because the subgroup is identified from the data, traditional techniques for subgroup analysis can suffer from increased Type I error rates and biased estimates. In this paper, we present two methods for subgroup selection in regression problems: one based on generalised linear modelling and another on isotonic regression. We demonstrate how these methods can be used for data-driven subgroup identification in the analysis of clinical trials, focusing on two distinct tasks: identifying patient groups that are safe from manifesting adverse events and identifying patient groups with high treatment effect, while controlling the Type I error rate in both cases. A thorough simulation study is conducted to evaluate the strengths and weaknesses of each method, providing detailed insight into the sensitivity of the Type I error rate control to modelling assumptions.
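To make the post-selection issue concrete, the following is a minimal sketch of a naive plug-in subgroup selection using isotonic regression. It is not the method proposed in the paper. It assumes a single covariate (for example a biomarker) with distinct values, a response that is non-decreasing in that covariate, and a hypothetical threshold `tau`; the function names are illustrative only. The fit uses the standard pool-adjacent-violators algorithm, and the selected subgroup is simply every patient at or above the first covariate value whose fitted response reaches `tau`.

```python
from typing import List, Optional, Sequence


def pava_increasing(y: Sequence[float]) -> List[float]:
    """Least-squares non-decreasing fit to y (pool adjacent violators)."""
    # Each block stores [sum of values, number of points].
    blocks: List[List[float]] = []
    for value in y:
        blocks.append([float(value), 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def naive_subgroup_cutoff(x: Sequence[float], y: Sequence[float],
                          tau: float) -> Optional[float]:
    """Smallest observed x whose fitted response is >= tau, or None.

    ASSUMPTION (illustrative only): this plug-in rule ignores estimation
    uncertainty, so it offers no Type I error control.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    fit = pava_increasing([y[i] for i in order])
    for xi, fi in zip(xs, fit):
        if fi >= tau:
            return xi
    return None


# Toy data (made up for illustration): response rate by biomarker level.
biomarker = [1, 2, 3, 4, 5]
response = [0.0, 0.5, 0.25, 1.0, 0.75]
fit = pava_increasing(response)
cutoff = naive_subgroup_cutoff(biomarker, response, tau=0.5)
```

Because the cutoff is chosen from the same noisy data used to fit the curve, the selected subgroup can include patients whose true response lies below `tau`. Controlling the probability of that kind of error is the task the abstract describes.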