🤖 AI Summary
This paper addresses the agnostic boosting problem: converting weak learners into strong learners without any assumptions on the distribution of the labels. The authors propose a new boosting algorithm that substantially reduces sample complexity for general hypothesis classes. The method reduces the agnostic setting to the realizable one and applies a margin-based filtering step to select and combine high-quality weak hypotheses. The authors conjecture that the resulting error rate is optimal up to logarithmic factors; the stated sample complexity guarantee improves substantially on prior agnostic boosting approaches.
📝 Abstract
Boosting is a key method in statistical learning, enabling the conversion of weak learners into strong ones. While well studied in the realizable case, the statistical properties of weak-to-strong learning remain less understood in the agnostic setting, where no assumptions are made on the distribution of the labels. In this work, we propose a new agnostic boosting algorithm with substantially improved sample complexity compared to prior works, under very general assumptions. Our approach is based on a reduction to the realizable case, followed by a margin-based filtering step to select high-quality hypotheses. We conjecture that the error rate achieved by our proposed method is optimal up to logarithmic factors.
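To make the high-level recipe concrete, here is a minimal, hypothetical sketch of an agnostic boosting loop built from a realizable weak learner plus a margin-based filter. It is not the paper's actual algorithm: the relabeling rule, the callable `weak_realizable_learner`, and the `margin_threshold` parameter are all illustrative assumptions.

```python
import numpy as np

def agnostic_boost(X, y, weak_realizable_learner, rounds=50, margin_threshold=0.1):
    """Hypothetical sketch (not the paper's procedure): repeatedly relabel the
    data so a realizable weak learner can be invoked, keep only hypotheses that
    pass a margin-based quality filter, and combine the survivors by weighted
    voting. `weak_realizable_learner(X, labels)` is an assumed black box that
    returns a hypothesis h with h(X) in {-1, +1}."""
    n = len(y)
    f = np.zeros(n)                 # running aggregated score on the sample
    ensemble = []
    for _ in range(rounds):
        # Reduction step (schematic): build pseudo-labels on which the weak
        # learner is run as if the problem were realizable.
        residual = y - np.tanh(f)   # illustrative relabeling rule
        pseudo_labels = np.sign(residual)
        pseudo_labels[pseudo_labels == 0] = 1
        h = weak_realizable_learner(X, pseudo_labels)
        preds = h(X)
        # Margin-based filtering (schematic): keep h only if its correlation
        # with the pseudo-labels clears the threshold.
        margin = np.mean(preds * pseudo_labels)
        if margin < margin_threshold:
            continue                # discard low-quality hypotheses
        ensemble.append((margin, h))
        f += margin * preds         # fold the surviving hypothesis into the score

    def strong_hypothesis(X_new):
        # Weighted majority vote over the filtered weak hypotheses.
        votes = sum(w * h(X_new) for w, h in ensemble)
        return np.sign(votes)

    return strong_hypothesis
```

The key design point illustrated here is the separation of concerns: the weak learner only ever sees a relabeled, "realizable-looking" problem, while the margin filter decides which of its outputs are trustworthy enough to enter the final vote.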