Robust Learning with Optimal Error

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses error-optimal robust learning under adversarial noise, covering the malicious, nasty, and agnostic noise models. By allowing learners to output randomized rather than deterministic hypotheses, the authors design algorithms that achieve the optimal error in each setting: under η-rate malicious noise they attain error ½·η/(1−η), halving the deterministic optimum and answering an open question of Cesa-Bianchi et al.; under nasty noise they improve the distribution-independent error from 2η to 3η/2, and further to η for fixed-distribution learners; and in the agnostic setting they attain error η instead of 2η. All of their learners have sample complexity linear in the VC dimension of the concept class and polynomial in the inverse of the excess error, and all except the fixed-distribution nasty noise learner are time efficient given access to an empirical risk minimization oracle, improving substantially on the best deterministic approaches.
📝 Abstract
We construct algorithms with optimal error for learning with adversarial noise. The overarching theme of this work is that the use of randomized hypotheses can substantially improve upon the best error rates achievable with deterministic hypotheses.

- For $η$-rate malicious noise, we show the optimal error is $\frac{1}{2} \cdot η/(1-η)$, improving on the optimal error of deterministic hypotheses by a factor of $1/2$. This answers an open question of Cesa-Bianchi et al. (JACM 1999), who showed randomness can improve error by a factor of $6/7$.
- For $η$-rate nasty noise, we show the optimal error is $\frac{3}{2} \cdot η$ for distribution-independent learners and $η$ for fixed-distribution learners, both improving upon the optimal $2η$ error of deterministic hypotheses. This closes a gap first noted by Bshouty et al. (Theoretical Computer Science 2002) when they introduced nasty noise, and reiterated in the recent works of Klivans et al. (NeurIPS 2025) and Blanc et al. (SODA 2026).
- For $η$-rate agnostic noise and the closely related nasty classification noise model, we show the optimal error is $η$, improving upon the optimal $2η$ error of deterministic hypotheses.

All of our learners have sample complexity linear in the VC-dimension of the concept class and polynomial in the inverse excess error. All except for the fixed-distribution nasty noise learner are time efficient given access to an oracle for empirical risk minimization.
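To make the comparison concrete, the LaTeX sketch below tabulates the optimal error rates stated in the abstract for randomized versus deterministic hypotheses. The deterministic column is inferred from the abstract's "factor of 1/2" and "$2η$" comparisons, and the final column is an illustrative plug-in at η = 0.1 computed here, not a figure taken from the paper.

```latex
% Sketch: optimal-error comparison from the abstract, with an
% illustrative evaluation at eta = 0.1 (our arithmetic, not the paper's).
\documentclass{article}
\usepackage{amsmath,booktabs}
\begin{document}
\begin{tabular}{lccc}
  \toprule
  Noise model & Randomized & Deterministic & At $\eta = 0.1$ \\
  \midrule
  Malicious ($\eta$-rate)          & $\tfrac{1}{2}\cdot\tfrac{\eta}{1-\eta}$ & $\tfrac{\eta}{1-\eta}$ & $\approx 0.056$ vs.\ $\approx 0.111$ \\
  Nasty (distribution-independent) & $\tfrac{3}{2}\eta$ & $2\eta$ & $0.15$ vs.\ $0.20$ \\
  Nasty (fixed distribution)       & $\eta$             & $2\eta$ & $0.10$ vs.\ $0.20$ \\
  Agnostic / nasty classification  & $\eta$             & $2\eta$ & $0.10$ vs.\ $0.20$ \\
  \bottomrule
\end{tabular}
\end{document}
```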
Problem

Research questions and friction points this paper is trying to address.

adversarial noise
optimal error
randomized hypotheses
malicious noise
agnostic learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

randomized hypotheses
adversarial noise
optimal error
malicious noise
nasty noise