🤖 AI Summary
This study addresses a limitation in clinical prediction: the importance of a high-dimensional feature such as genomic data is usually assessed by the change in predictive performance when it is added to traditional clinical variables, which ignores collinearity and the known direction of dependencies among variables. The authors propose asymmetric Shapley values as an alternative for mixed-dimensional prediction models, focusing on the setting where disease state mediates genomic effects and the direction of effects for additional confounders may be unknown. They derive efficient algorithms for both local and global asymmetric Shapley values: the local values support inference, while the global values decompose any predictive performance metric into per-feature contributions. The framework is illustrated on the prediction of progression-free survival for colorectal cancer patients.
📝 Abstract
In clinical prediction settings, the importance of a high-dimensional feature such as genomics is often assessed by evaluating the change in predictive performance when adding it to a set of traditional clinical variables. This approach is questionable, because it accounts for neither collinearity nor the known directionality of dependencies between variables. We suggest using asymmetric Shapley values as a more suitable alternative to quantify feature importance in the context of a mixed-dimensional prediction model. We focus on a setting that is particularly relevant in clinical prediction: disease state as a mediating variable for genomic effects, with additional confounders for which the direction of effects may be unknown. We derive efficient algorithms to compute local and global asymmetric Shapley values for this setting. The former are shown to be very useful for inference, whereas the latter provide interpretation by decomposing any predictive performance metric into contributions of the features. Throughout, we illustrate our framework by a leading example: the prediction of progression-free survival for colorectal cancer patients.
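
To make the idea concrete, the following is a minimal sketch of asymmetric Shapley values by brute-force enumeration. It is not the efficient algorithm derived in the paper. The feature names, the `precedes` constraint list, and the value function `toy_value` are all hypothetical and chosen only for illustration: `toy_value` treats genomics and disease stage as fully redundant carriers of one signal, with age adding an independent contribution. Asymmetric Shapley values average marginal contributions only over feature orderings that respect the assumed causal order (here, genomics before stage, since disease state mediates genomic effects), while age is left unconstrained because its direction is taken to be unknown.

```python
from itertools import permutations
from typing import Callable, Dict, FrozenSet, Sequence, Tuple


def asymmetric_shapley(
    features: Sequence[str],
    value: Callable[[FrozenSet[str]], float],
    precedes: Sequence[Tuple[str, str]],
) -> Dict[str, float]:
    """Average marginal contributions over orderings consistent with `precedes`.

    Each pair (a, b) in `precedes` requires a to enter the coalition before b.
    An empty `precedes` recovers the ordinary (symmetric) Shapley value.
    Brute force: only practical for a handful of features.
    """
    allowed = [
        order
        for order in permutations(features)
        if all(order.index(a) < order.index(b) for a, b in precedes)
    ]
    if not allowed:
        raise ValueError("no ordering satisfies the precedence constraints")

    totals = {f: 0.0 for f in features}
    for order in allowed:
        coalition: FrozenSet[str] = frozenset()
        for f in order:
            extended = coalition | {f}
            totals[f] += value(extended) - value(coalition)
            coalition = extended
    return {f: total / len(allowed) for f, total in totals.items()}


# Hypothetical toy game (an assumption, not data from the paper):
# genomics and stage carry the same signal, age adds 0.5 independently.
FEATURES = ("genomics", "stage", "age")


def toy_value(coalition: FrozenSet[str]) -> float:
    shared_signal = 1.0 if coalition & {"genomics", "stage"} else 0.0
    age_signal = 0.5 if "age" in coalition else 0.0
    return shared_signal + age_signal


symmetric = asymmetric_shapley(FEATURES, toy_value, precedes=[])
asymmetric = asymmetric_shapley(
    FEATURES, toy_value, precedes=[("genomics", "stage")]
)
print("symmetric: ", symmetric)
print("asymmetric:", asymmetric)
```

In this toy game the symmetric values split the shared signal equally between genomics and stage, whereas the asymmetric values credit it entirely to genomics, the upstream variable. In both cases the contributions sum to the value of the full feature set, which is the property that allows a performance metric to be decomposed into per-feature parts.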