🤖 AI Summary
This study investigates the effectiveness of jointly leveraging active learning and transfer learning for cross-domain time-series anomaly detection, addressing high annotation costs and substantial distribution shift between source and target domains. The proposed approach combines active sample selection—prioritizing high-information unlabeled instances—with knowledge transfer that reuses pre-trained source-domain models. Experiments across multiple cross-domain benchmarks indicate that: (i) clustering interacts with active learning, and the best performance is obtained with a single cluster, i.e. without pre-clustering; (ii) active learning improves model performance, but more slowly than results in the literature suggest—a gap attributed to an experimental design that keeps the sampling and testing pools distinct; and (iii) performance improves roughly linearly with the number of labelled target points before eventually tailing off, a pattern consistent with the active learner selecting the most useful points first and deferring the less informative ones to the end of the selection sequence.
📝 Abstract
This paper examines the effectiveness of combining active learning and transfer learning for anomaly detection in cross-domain time-series data. Our results indicate an interaction between clustering and active learning: in general, the best performance is achieved using a single cluster (in other words, when clustering is not applied). We also find that adding new samples to the training set via active learning does improve model performance, but that the rate of improvement is generally slower than results reported in the literature suggest. We attribute this difference to an improved experimental design in which distinct data samples are used for the sampling and testing pools. Finally, we assess the ceiling performance of transfer learning combined with active learning across several datasets and find that performance initially improves but eventually begins to tail off as more target points are selected for inclusion in training. This tail-off may indicate that the active learning process sequences data points well, pushing the less useful points towards the end of the selection process, with the tail-off occurring once these less useful points are eventually added. Taken together, our results indicate that active learning is effective, but that the improvement in model performance follows a flat, linear function of the number of points selected and labelled.
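The selection step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `source_score` function (a stand-in for a pre-trained source-domain model), the decision threshold, and the synthetic target-domain data are all assumptions made for illustration. The sketch shows uncertainty-based ordering, in which unlabeled target samples whose anomaly scores lie closest to the decision boundary are queried first—the ordering behaviour the results suggest drives the eventual tail-off.

```python
import random

random.seed(0)

def source_score(x, source_mean=0.0):
    """Anomaly score from a hypothetical pre-trained source-domain model:
    here simply the distance of a point from the source mean."""
    return abs(x - source_mean)

def rank_by_uncertainty(pool, threshold):
    """Order unlabeled target samples so the most ambiguous ones
    (score closest to the decision threshold) come first -- the
    'high-information' samples an active learner queries first."""
    return sorted(pool, key=lambda x: abs(source_score(x) - threshold))

# Synthetic target domain: 80 normal points near 0, 20 anomalies near 5.
pool = ([random.gauss(0, 1) for _ in range(80)]
        + [random.gauss(5, 1) for _ in range(20)])
threshold = 2.0  # assumed decision boundary of the transferred scorer

ranked = rank_by_uncertainty(pool, threshold)
batch, rest = ranked[:10], ranked[10:]  # first batch sent for labelling
```

In a full loop, each labelled batch would be used to fine-tune the transferred model and the ranking would be recomputed; because the most informative points are consumed first, later batches contribute less, consistent with the tail-off reported in the paper.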