Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes

📅 2024-10-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low sample efficiency and lack of theoretical guarantees in deep active learning. We propose the first gradient-free cutting-plane training framework for deep ReLU networks of arbitrary depth. Our method replaces gradient-based optimization with explicit modeling of the piecewise-linear structure of ReLU networks, extending the classical cutting-plane method to nonconvex, nonlinear settings, and pairs it with an active querying strategy based on geometric contraction of the feasible set. Theoretically, we establish the first convergence guarantee for deep active learning, characterizing the contraction rate of the feasible set and overcoming the long-standing limitation that cutting-plane methods apply only to convex models. Empirically, our approach significantly outperforms mainstream deep active learning baselines on both synthetic benchmarks and real-world sentiment classification tasks, validating both the theoretical convergence and the practical effectiveness of the method.
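To illustrate the core idea behind cutting-plane active learning, here is a minimal sketch in the classical *linear* setting (not the paper's deep ReLU extension): each queried label yields a halfspace cut on the version space of consistent weight vectors, and querying the point on which surviving candidates disagree most shrinks the feasible set geometrically. All names and the sampling-based version-space approximation below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data; the true weights are hidden from the learner.
w_true = np.array([1.0, -0.5])
X = rng.normal(size=(200, 2))
y = np.sign(X @ w_true)

# Approximate the version space by candidate weight vectors on the unit sphere.
W = rng.normal(size=(20000, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)

labeled = []
for _ in range(10):
    # Query the unlabeled point on which the surviving candidates disagree
    # most; its cut removes roughly half of the feasible set, which is the
    # geometric contraction driving the convergence guarantee.
    margins = X @ W.T                        # (n_points, n_candidates)
    disagreement = np.abs(np.mean(np.sign(margins), axis=1))
    disagreement[labeled] = np.inf           # never re-query a labeled point
    i = int(np.argmin(disagreement))
    labeled.append(i)
    # Cutting plane: keep only candidates consistent with the new label.
    W = W[y[i] * (W @ X[i]) > 0]

# Any point of the (approximate) feasible set is a consistent classifier;
# here we take the mean of the surviving candidates.
w_hat = W.mean(axis=0)
acc = np.mean(np.sign(X @ w_hat) == y)
```

With only 10 queried labels out of 200 points, the surviving candidate set concentrates around `w_true` and the resulting classifier fits the full dataset well, illustrating the sample-efficiency argument. The paper's contribution is extending this feasible-set machinery past the linear case to deep, nonconvex ReLU networks.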

📝 Abstract
Active learning methods aim to improve sample complexity in machine learning. In this work, we investigate an active learning scheme via a novel gradient-free cutting-plane training method for ReLU networks of arbitrary depth and develop a convergence theory. We demonstrate, for the first time, that cutting-plane algorithms, traditionally used in linear models, can be extended to deep neural networks despite their nonconvexity and nonlinear decision boundaries. Moreover, this training method induces the first deep active learning scheme known to achieve convergence guarantees, revealing a geometric contraction rate of the feasible set. We exemplify the effectiveness of our proposed active learning method against popular deep active learning baselines via both synthetic data experiments and a sentiment classification task on real datasets.
Problem

Research questions and friction points this paper is trying to address.

Active learning for deep neural networks
Gradient-free cutting-plane method
Convergence guarantees in nonconvex models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient-free cutting-plane training
Deep active learning convergence
Nonconvex neural network application