🤖 AI Summary
To address model drift caused by dynamic evolution of real-world data distributions, this paper proposes an interpretable drift detection method. Unlike existing black-box or weakly interpretable approaches, our method introduces the first risk-aware hypothesis testing framework that explicitly accounts for feature interactions. It unifies drift detection and root-cause interpretation from two perspectives: statistical power and task generality—supporting both classification and regression. By modeling high-order feature interactions, incorporating risk-sensitive test statistics, quantifying interpretability, and localizing drift-sensitive features, the method achieves superior performance over state-of-the-art interpretable methods on multiple benchmark drift datasets and real-world scenarios, while matching the accuracy of leading black-box approaches. Case studies further validate its capability to precisely identify underlying drift mechanisms and critical driving features.
📝 Abstract
Data in the real world often has an evolving distribution. Thus, machine learning models trained on such data get outdated over time. This phenomenon is called model drift. Knowledge of this drift serves two purposes: (i) Retain an accurate model and (ii) Discovery of knowledge or insights about change in the relationship between input features and output variable w.r.t. the model. Most existing works focus only on detecting model drift but offer no interpretability. In this work, we take a principled approach to study the problem of interpretable model drift detection from a risk perspective using a feature-interaction aware hypothesis testing framework, which enjoys guarantees on test power. The proposed framework is generic, i.e., it can be adapted to both classification and regression tasks. Experiments on several standard drift detection datasets show that our method is superior to existing interpretable methods (especially on real-world datasets) and on par with state-of-the-art black-box drift detection methods. We also quantitatively and qualitatively study the interpretability aspect including a case study on USENET2 dataset. We find our method focuses on model and drift sensitive features compared to baseline interpretable drift detectors.