🤖 AI Summary
This work addresses the lack of theoretical foundations for feature attribution in multi-output predictive models, in particular the open question of whether Shapley values should be computed independently for each output. By extending the classical cooperative game axioms (efficiency, symmetry, dummy player, and additivity) to the vector-valued setting, the paper establishes a rigidity theorem: any attribution rule satisfying these axioms must decompose component-wise across the individual outputs. The common practice of computing SHAP explanations output by output is therefore not merely a convention but a structural necessity, and any joint-output attribution rule must give up at least one of the classical axioms. Experiments on a biomedical benchmark illustrate that multi-output models can yield computational savings in training and deployment while producing SHAP explanations that remain fully consistent with this component-wise structure.
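To fix ideas, the setting can be sketched as follows (a minimal sketch in our own notation; the paper's exact definitions and symbols may differ): a vector-valued game assigns a payoff vector to every coalition of features, and the rigidity theorem says that the only axiom-compatible attribution is the classical Shapley value applied to each output coordinate.

```latex
% Minimal sketch (our notation, not necessarily the paper's).
% Vector-valued game on players N = \{1,\dots,n\} with m outputs:
%   v : 2^N \to \mathbb{R}^m, \qquad v(\emptyset) = 0.
\[
  \phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
    \frac{|S|!\,(n-|S|-1)!}{n!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
    \;\in\; \mathbb{R}^m .
\]
% Rigidity: any rule satisfying efficiency, symmetry, dummy player, and
% additivity in the vector-valued sense must coincide with the component-wise
% Shapley value of the scalar games v_1, ..., v_m:
\[
  \phi_i(v) \;=\; \bigl(\phi^{\mathrm{Sh}}_i(v_1), \dots, \phi^{\mathrm{Sh}}_i(v_m)\bigr),
  \qquad v = (v_1, \dots, v_m).
\]
```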
📝 Abstract
In this article, we provide an axiomatic characterization of feature attribution for multi-output predictors within the Shapley framework. While SHAP explanations are routinely computed independently for each output coordinate, the theoretical necessity of this practice has remained unclear. By extending the classical Shapley axioms to vector-valued cooperative games, we establish a rigidity theorem showing that any attribution rule satisfying efficiency, symmetry, dummy player, and additivity must necessarily decompose component-wise across outputs. Consequently, any joint-output attribution rule must relax at least one of the classical Shapley axioms. This result identifies a previously unformalized structural constraint in Shapley-based interpretability, clarifying the precise scope of fairness-consistent explanations in multi-output learning. Numerical experiments on a biomedical benchmark illustrate that multi-output models can yield computational savings in training and deployment, while producing SHAP explanations that remain fully consistent with the component-wise structure imposed by the Shapley axioms.
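As an illustration of what the component-wise structure means in practice, the following self-contained Python sketch (not the paper's code; the toy game, names, and sizes are ours) computes exact Shapley values for a random vector-valued game and checks that they coincide with the per-output Shapley values and satisfy efficiency in every output coordinate.

```python
from itertools import combinations
from math import factorial

import numpy as np

N, M = 3, 2                      # toy sizes: N players (features), M outputs
rng = np.random.default_rng(0)

# Toy vector-valued game: a value in R^M for every coalition, with v(empty) = 0.
coalitions = [frozenset(c) for r in range(N + 1) for c in combinations(range(N), r)]
v = {S: rng.normal(size=M) for S in coalitions}
v[frozenset()] = np.zeros(M)

def shapley(game):
    """Exact Shapley values of a coalition -> vector (or scalar) game."""
    out_dim = np.atleast_1d(game[frozenset(range(N))]).shape[0]
    phi = np.zeros((N, out_dim))
    for i in range(N):
        others = [j for j in range(N) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                S = frozenset(S)
                # Classical Shapley weight |S|! (N-|S|-1)! / N!
                w = factorial(len(S)) * factorial(N - len(S) - 1) / factorial(N)
                phi[i] += w * (np.atleast_1d(game[S | {i}]) - np.atleast_1d(game[S]))
    return phi

phi = shapley(v)                 # attribution of the vector game, shape (N, M)

# Attribute each scalar output game separately and stack the results.
per_output = np.column_stack(
    [shapley({S: v[S][k] for S in coalitions})[:, 0] for k in range(M)]
)

# Component-wise structure: the vector-valued attribution is the stack of
# per-output Shapley values, and efficiency holds in every output coordinate.
assert np.allclose(phi, per_output)
assert np.allclose(phi.sum(axis=0), v[frozenset(range(N))])
print(phi)
```

In this sketch the per-output computation and the joint computation are trivially the same arithmetic; the paper's contribution is the converse direction, showing that the classical axioms leave no room for any other joint-output rule.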