AI Summary
This study addresses the lack of a unified framework for evaluating concept drift detection, where classification quality metrics such as accuracy are often used as a proxy even though they are affected by multiple factors and may not reliably reflect detection quality. It relates eight drift detection quality metrics to classifier performance in experiments on nonstationary streams produced by seven synthetic data stream generators, with drift dynamics treated as an additional factor. The goal is to identify the most informative set of drift detection quality metrics and to clarify how detection methods should be evaluated, as a step toward a standardized and reliable evaluation methodology for concept drift detection.
Abstract
Data streams are now among the most frequently analyzed data structures, and concept drift is a major challenge for the systems that process them. Although numerous solutions have been proposed to counteract the accuracy degradation caused by concept drift, the scientific community has not yet established a unified framework for evaluating the concept drift detection task. Existing research often relies on classification quality metrics, but these can be affected by multiple factors and may not reliably reflect drift detection quality. In this work, we present an in-depth overview of the relationship between drift detection quality metrics and classification performance in synthetic nonstationary data streams. The study examines eight drift detection quality metrics in relation to classifier performance across seven synthetic data stream generation tools, additionally considering drift dynamics as a factor. It aims to identify the most informative set of drift detection quality metrics and to provide a deeper understanding of how drift detection methods are evaluated.