🤖 AI Summary
Active learning (AL) has limited practical utility in low-data regimes due to its high computational overhead and marginal standalone gains (only 1–4% over random sampling). Method: This work repositions AL not as a primary solution to data scarcity, but as a final-stage optimization module, applied *after* data augmentation and semi-supervised learning, to extract residual performance gains. Through systematic ablation studies comparing AL, data augmentation, and semi-supervised learning both independently and in combination, we evaluate their synergistic effects. Contribution/Results: We provide empirical evidence that AL's marginal utility is strongly contingent on prior data-expansion strategies. Our key conceptual contribution is overturning the prevailing "AL-as-first-resort-for-low-data" paradigm, proposing instead a staged data-utilization framework. Experiments confirm that integrating AL *after* augmentation and semi-supervised learning still yields additional improvements, demonstrating that its value lies in strategic, sequential composition rather than isolated deployment.
📝 Abstract
Even though Active Learning (AL) is widely studied, it is rarely applied outside its own scientific literature. We posit that the reason for this is AL's high computational cost coupled with the comparatively small lifts it typically generates in scenarios with few labeled points. In this work we study the impact of different methods for combating this low-data scenario, namely data augmentation (DA), semi-supervised learning (SSL), and AL. We find that AL is by far the least efficient method for solving the low-data problem, generating a lift of only 1-4% over random sampling, while DA and SSL methods can generate up to 60% lift in combination with random sampling. However, when AL is combined with strong DA and SSL techniques, it is, surprisingly, still able to provide improvements. Based on these results, we frame AL not as a method to combat missing labels, but as the final building block to squeeze the last bits of performance out of the data after appropriate DA and SSL methods have been applied.
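The staged pipeline the abstract proposes (DA first, then SSL, with AL only as the final step) can be illustrated with a toy sketch. This is not the paper's actual method: the nearest-centroid classifier, Gaussian jitter augmentation, median confidence threshold, and query budget of 5 are all illustrative assumptions chosen to keep the example self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two Gaussian blobs, 4 labeled points, 40 unlabeled pool points.
X_lab = np.array([[-2.0, 0.0], [-2.2, 0.3], [2.0, 0.0], [2.1, -0.2]])
y_lab = np.array([0, 0, 1, 1])
X_pool = np.concatenate([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])

def centroids(X, y):
    """Per-class mean vectors (a stand-in for any cheap classifier)."""
    return np.stack([X[y == c].mean(axis=0) for c in np.unique(y)])

def confidence(X, cents):
    """Margin between the distances to the two nearest centroids."""
    d = np.linalg.norm(X[:, None, :] - cents[None, :, :], axis=2)
    d.sort(axis=1)
    return d[:, 1] - d[:, 0]

# Stage 1 -- data augmentation: add jittered copies of the labeled set.
X_aug = np.concatenate([X_lab, X_lab + rng.normal(0, 0.1, X_lab.shape)])
y_aug = np.concatenate([y_lab, y_lab])

# Stage 2 -- semi-supervised learning: pseudo-label the high-confidence
# half of the pool (confidence above the median) and adopt those points.
cents = centroids(X_aug, y_aug)
conf = confidence(X_pool, cents)
keep = conf > np.median(conf)
pseudo_y = np.linalg.norm(X_pool[:, None] - cents[None], axis=2).argmin(axis=1)
X_all = np.concatenate([X_aug, X_pool[keep]])
y_all = np.concatenate([y_aug, pseudo_y[keep]])

# Stage 3 -- active learning, last: only the leftover low-confidence points
# are candidates for human labeling; query the 5 most uncertain of them.
cents = centroids(X_all, y_all)
residual_idx = np.flatnonzero(~keep)
query = residual_idx[confidence(X_pool[~keep], cents).argsort()[:5]]
print(sorted(query))
```

The ordering is the point: by the time AL runs, augmentation and pseudo-labeling have already absorbed the easy gains, so the (expensive) query step is spent only on the genuinely ambiguous residue.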