Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes

📅 2024-10-03
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the low sample efficiency and lack of theoretical guarantees in deep active learning. We propose the first gradient-free cutting-plane training framework for deep ReLU networks of arbitrary depth. Our method replaces gradient-based optimization with explicit modeling of the piecewise-linear structure of ReLU networks, extending the classical cutting-plane method to nonconvex, nonlinear settings, and pairs it with an active querying strategy based on geometric contraction of the feasible set. Theoretically, we establish the first convergence guarantee for deep active learning, characterizing the contraction rate of the feasible set and overcoming the long-standing limitation that cutting-plane methods apply only to convex models. Empirically, our approach significantly outperforms mainstream deep active learning baselines on both synthetic benchmarks and real-world sentiment classification tasks, validating both the theoretical convergence and the practical effectiveness of the method.
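To illustrate the core idea behind cutting-plane active learning, here is a minimal sketch in the classical *linear* setting (not the paper's deep ReLU extension): each queried label yields a halfspace cut on the version space of consistent weight vectors, and querying the point on which surviving candidates disagree most shrinks the feasible set geometrically. All names and the sampling-based version-space approximation below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data; the true weights are hidden from the learner.
w_true = np.array([1.0, -0.5])
X = rng.normal(size=(200, 2))
y = np.sign(X @ w_true)

# Approximate the version space by candidate weight vectors on the unit sphere.
W = rng.normal(size=(20000, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)

labeled = []
for _ in range(10):
    # Query the unlabeled point on which the surviving candidates disagree
    # most; its cut removes roughly half of the feasible set, which is the
    # geometric contraction driving the convergence guarantee.
    margins = X @ W.T                        # (n_points, n_candidates)
    disagreement = np.abs(np.mean(np.sign(margins), axis=1))
    disagreement[labeled] = np.inf           # never re-query a labeled point
    i = int(np.argmin(disagreement))
    labeled.append(i)
    # Cutting plane: keep only candidates consistent with the new label.
    W = W[y[i] * (W @ X[i]) > 0]

# Any point of the (approximate) feasible set is a consistent classifier;
# here we take the mean of the surviving candidates.
w_hat = W.mean(axis=0)
acc = np.mean(np.sign(X @ w_hat) == y)
```

With only 10 queried labels out of 200 points, the surviving candidate set concentrates around `w_true` and the resulting classifier fits the full dataset well, illustrating the sample-efficiency argument. The paper's contribution is extending this feasible-set machinery past the linear case to deep, nonconvex ReLU networks.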

📝 Abstract
Active learning methods aim to improve sample complexity in machine learning. In this work, we investigate an active learning scheme via a novel gradient-free cutting-plane training method for ReLU networks of arbitrary depth and develop a convergence theory. We demonstrate, for the first time, that cutting-plane algorithms, traditionally used in linear models, can be extended to deep neural networks despite their nonconvexity and nonlinear decision boundaries. Moreover, this training method induces the first deep active learning scheme known to achieve convergence guarantees, revealing a geometric contraction rate of the feasible set. We exemplify the effectiveness of our proposed active learning method against popular deep active learning baselines via both synthetic data experiments and a sentiment classification task on real datasets.
Problem

Research questions and friction points this paper is trying to address.

Active learning for deep neural networks
Gradient-free cutting-plane method
Convergence guarantees in nonconvex models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient-free cutting-plane training
Deep active learning convergence
Nonconvex neural network application