An Empirical Evaluation of Factors Affecting SHAP Explanation of Time Series Classification

📅 2025-09-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing time-series segmentation strategies for SHAP-based explanations lack theoretical grounding, leading to suboptimal trade-offs between interpretability quality and computational efficiency. Method: We systematically evaluate eight segmentation algorithms and identify segment count, rather than algorithm choice, as the dominant factor affecting explanation fidelity and speed; equal-length segmentation achieves high explanation quality with significantly reduced computational overhead. We further propose a length-weighted normalization technique to mitigate attribution bias arising from heterogeneous segment lengths. Contribution/Results: Evaluated across diverse models (e.g., ResNet, InceptionTime) and datasets using two XAI metrics, InterpretTime and AUC Difference, our approach consistently improves SHAP explanation accuracy on both univariate and multivariate time series. It establishes a reproducible, efficient, and robust practical paradigm for time-series explainability, addressing critical gaps in segmentation design and attribution calibration.
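The length-weighted normalization is described only at a high level above. A minimal sketch of one plausible reading, in which each segment's attribution is divided by its length so that long segments do not accumulate disproportionately large attribution mass (function names are hypothetical and the paper's exact formula may differ):

```python
import numpy as np

def length_weighted_normalise(attributions, segment_lengths):
    """Divide each segment's SHAP attribution by its length.

    Hypothetical reading of the paper's length-weighted normalisation:
    without it, longer segments tend to receive larger raw attributions
    simply because they cover more time points.
    """
    a = np.asarray(attributions, dtype=float)
    lengths = np.asarray(segment_lengths, dtype=float)
    return a / lengths

def per_timepoint_attribution(attributions, segment_lengths):
    """Expand normalised segment scores into a per-time-point saliency map."""
    norm = length_weighted_normalise(attributions, segment_lengths)
    return np.repeat(norm, segment_lengths)
```

For example, a 4-point segment with raw attribution 2.0 and a 1-point segment with attribution 1.0 both receive a per-point score that is comparable after normalization, rather than the longer segment dominating.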

📝 Abstract
Explainable AI (XAI) has become an increasingly important topic for understanding and attributing the predictions made by complex Time Series Classification (TSC) models. Among attribution methods, SHapley Additive exPlanations (SHAP) is widely regarded as one of the most reliable, but its computational complexity, which scales exponentially with the number of features, limits its practicality for long time series. To address this, recent studies have shown that aggregating features via segmentation, so that a single attribution value is computed for a group of consecutive time points, drastically reduces SHAP running time. However, the choice of the optimal segmentation strategy remains an open question. In this work, we investigate eight different time series segmentation algorithms to understand how segment composition affects explanation quality. We evaluate these approaches using two established XAI evaluation methodologies: InterpretTime and AUC Difference. Through experiments on both Multivariate (MTS) and Univariate Time Series (UTS), we find that the number of segments has a greater impact on explanation quality than the specific segmentation method. Notably, equal-length segmentation consistently outperforms most custom time series segmentation algorithms. Furthermore, we introduce a novel attribution normalisation technique that weights segments by their length, and we show that it consistently improves attribution quality.
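The segmentation idea in the abstract, grouping consecutive time points so SHAP treats each group as a single feature, can be sketched as follows. This is an illustrative sketch, not the paper's implementation; the function names and the zero-background masking choice are assumptions:

```python
import numpy as np

def equal_length_segments(n_timepoints, n_segments):
    """Partition [0, n_timepoints) into near-equal consecutive segments,
    returned as (start, end) index pairs."""
    bounds = np.linspace(0, n_timepoints, n_segments + 1, dtype=int)
    return list(zip(bounds[:-1], bounds[1:]))

def mask_segments(x, segments, keep, background=0.0):
    """Build a perturbed series in which segments outside `keep` are
    replaced by a background value -- the coalition masking that
    segment-level SHAP evaluates, one call per coalition."""
    x_masked = np.full_like(x, background, dtype=float)
    for idx in keep:
        s, e = segments[idx]
        x_masked[s:e] = x[s:e]
    return x_masked
```

With k segments, SHAP only has to enumerate coalitions over k features instead of one feature per time point, which is the source of the speed-up the abstract describes.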
Problem

Research questions and friction points this paper is trying to address.

Evaluating segmentation strategies for SHAP in time series classification
Assessing impact of segment number on explanation quality
Improving attribution quality with length-weighted normalization technique
Innovation

Methods, ideas, or system contributions that make the work stand out.

Equal-length segmentation outperforms custom algorithms
Novel length-weighted attribution normalization improves quality
Segment count impacts explanation quality more than method
Nikos Papadeas
School of Engineering Mathematics and Technology, University of Bristol, UK
Davide Italo Serramazza
School of Computer Science, University College Dublin, Ireland
Zahraa Abdallah
School of Engineering Mathematics and Technology, University of Bristol, UK
Georgiana Ifrim
Associate Professor, University College Dublin, Insight Centre for Data Analytics & ML-Labs
Machine Learning · Sequence Learning · Time Series · Explainable AI