Efficient and Noise-Tolerant PAC Learning of Multiclass Linear Classifiers

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses computationally efficient PAC learning of multiclass linear classifiers when the data are maliciously corrupted. It proposes a provably noise-tolerant, computationally efficient algorithm that combines a cluster-based pruning scheme with multiclass hinge loss minimization. The guarantee holds under two distributional assumptions: the marginal distribution is a mixture of bounded-variance distributions, and the data satisfy a margin condition. Under a constant rate of nasty noise, the algorithm PAC learns using at most $O(k^2(d \log d + \log k))$ samples, where $k$ is the number of classes and $d$ is the ambient dimension. Whether an efficient algorithm of this kind exists was previously unknown for $k \ge 3$, and the result is strictly stronger than all prior work even in the binary case $k = 2$.

📝 Abstract

Noise-tolerant PAC learning of linear models has been of central interest in the machine learning community since the last century. In recent years, many computationally efficient algorithms have been proposed for learning linear threshold functions under various noise models. Yet in the multiclass setting, i.e. when the number of classes $k$ is at least $3$, it is unknown whether computationally efficient PAC learning algorithms exist when the data sets are maliciously corrupted. In this paper, we consider the setting where the marginal distribution is a mixture of bounded-variance distributions and the data sets satisfy a margin condition. We show that there exists a computationally efficient algorithm that PAC learns multiclass linear classifiers $\{h_w:x\mapsto \arg\max_{y\in[k]}w_y\cdot x, x\in \mathbb{R}^d, w\in\mathbb{R}^{kd}\}$ using at most $O(k^2\cdot (d\log d+\log k))$ samples, even under a constant rate of nasty noise. Our algorithm consists of two main ingredients: a cluster-based pruning scheme and a standard multiclass hinge loss minimization program. Even in the special case of the binary setting, i.e. $k=2$, our result is strictly stronger than all prior work.
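
To make the hypothesis class in the abstract concrete, the following is a minimal sketch of the classifier $h_w$ and of one common form of multiclass hinge loss. The abstract does not spell out the loss, so the Crammer-Singer form used here is an assumption, as are the function names `predict` and `multiclass_hinge`, the 0-based class labels, and the rule of breaking ties toward the smallest label. The cluster-based pruning step is not shown.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def predict(W: List[Vector], x: Vector) -> int:
    """h_w(x) = argmax_y w_y . x, with W[y] playing the role of w_y.

    Labels are 0-based here (assumption); ties go to the smallest label.
    """
    scores = [dot(w_y, x) for w_y in W]
    return max(range(len(W)), key=lambda y: scores[y])


def multiclass_hinge(W: List[Vector], x: Vector, y: int) -> float:
    """Crammer-Singer style hinge loss (assumed form, not taken from the paper):
    max(0, 1 + max_{y' != y} w_{y'} . x - w_y . x).
    """
    scores = [dot(w, x) for w in W]
    best_other = max(s for j, s in enumerate(scores) if j != y)
    return max(0.0, 1.0 + best_other - scores[y])


# Toy instance with k = 3 classes in d = 2 dimensions.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
x = [2.0, 0.5]

label = predict(W, x)                    # class with the largest score
loss_correct = multiclass_hinge(W, x, 0)  # true label matches with margin >= 1
loss_wrong = multiclass_hinge(W, x, 1)    # true label is outscored
```

On this toy point the scores are 2.0, 0.5, and -2.5, so the loss is zero when the true label is class 0 and positive when the true label is class 1. Because the loss is convex in the stacked weight vector $w\in\mathbb{R}^{kd}$, minimizing its average over a sample is a tractable program.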

Problem

Research questions and friction points this paper is trying to address.

multiclass linear classifiers

noise-tolerant PAC learning

malicious noise

computational efficiency

PAC learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

noise-tolerant PAC learning

multiclass linear classifiers

nasty noise