🤖 AI Summary
This paper studies differentially private (DP) empirical risk minimization (DP-ERM) for binary linear classification under the assumption that the data are linearly separable after removing a small number of outliers. We propose the first adaptive DP algorithm that requires prior knowledge of neither the margin γ nor the outlier set S_out, yet automatically identifies and exploits large-margin substructures. Our method integrates robust ERM with a tailored DP mechanism, enabling theoretically grounded hyperparameter tuning under privacy constraints. We establish a zero-one loss upper bound of Õ(1/(γ²εn) + |S_out|/(γn)), which significantly improves upon existing DP-ERM results when |S_out| is small. The key contribution is the first adaptive DP learning framework that jointly adapts to unknown large-margin structure and sparse outliers, achieving privacy preservation without sacrificing statistical efficiency in benign regimes.
📝 Abstract
This paper studies the problem of differentially private empirical risk minimization (DP-ERM) for binary linear classification. We obtain an efficient $(\varepsilon,\delta)$-DP algorithm with an empirical zero-one risk bound of $\tilde{O}\left(\frac{1}{\gamma^2\varepsilon n} + \frac{|S_{\mathrm{out}}|}{\gamma n}\right)$ where $n$ is the number of data points, $S_{\mathrm{out}}$ is an arbitrary subset of data one can remove, and $\gamma$ is the margin of linear separation of the remaining data points (after $S_{\mathrm{out}}$ is removed). Here, $\tilde{O}(\cdot)$ hides only logarithmic terms. In the agnostic case, we improve the existing results when the number of outliers is small. Our algorithm is highly adaptive because it does not require knowing the margin parameter $\gamma$ or the outlier subset $S_{\mathrm{out}}$. We also derive a utility bound for the advanced private hyperparameter tuning algorithm.
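To see how the two terms of the risk bound trade off, the following minimal sketch evaluates the shape of $\tilde{O}\!\left(\frac{1}{\gamma^2\varepsilon n} + \frac{|S_{\mathrm{out}}|}{\gamma n}\right)$ for sample parameter values. The function and the chosen numbers are illustrative only; the constants and logarithmic factors hidden by $\tilde{O}(\cdot)$ are omitted.

```python
def risk_bound_shape(n: int, gamma: float, eps: float, n_out: int) -> float:
    """Illustrative shape of the bound 1/(gamma^2 * eps * n) + |S_out|/(gamma * n).

    Constants and log factors hidden by the tilde-O notation are ignored.
    """
    privacy_term = 1.0 / (gamma**2 * eps * n)   # cost of (eps, delta)-DP
    outlier_term = n_out / (gamma * n)          # cost of the removed outliers
    return privacy_term + outlier_term

# With few outliers the privacy term dominates and the bound stays small;
# with many outliers the outlier term takes over (hypothetical values).
few_outliers = risk_bound_shape(n=100_000, gamma=0.1, eps=1.0, n_out=10)
many_outliers = risk_bound_shape(n=100_000, gamma=0.1, eps=1.0, n_out=5_000)
```

This matches the abstract's claim that the result improves on prior agnostic bounds precisely when $|S_{\mathrm{out}}|$ is small: the $|S_{\mathrm{out}}|/(\gamma n)$ term then contributes little on top of the unavoidable privacy cost.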