🤖 AI Summary
This work studies empirical and population risk minimization for convex Lipschitz loss functions under differential privacy (DP) constraints, assuming access to a private gradient oracle. It establishes lower bounds on the gradient query complexity, i.e., the number of oracle calls required to achieve α excess risk, for both nonsmooth and smooth Lipschitz losses. The analysis yields the first dimension-dependent lower bounds that are nearly tight and extend to information-constrained oracles, such as quantized gradients. Methodologically, the paper integrates tools from differential privacy theory, stochastic optimization, and information theory; matching upper bounds are obtained via variants of DP-SGD, showing that existing algorithms are nearly optimal in high dimensions. Crucially, privacy enforcement incurs an inherent computational cost, roughly Ω(√d/α²) queries for nonsmooth losses and Ω̃(√d/α) for smooth ones in the high-dimensional regime, revealing a fundamental trade-off between privacy guarantees and optimization efficiency.
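Since the upper bounds in this regime come from variants of DP-SGD, here is a minimal, self-contained sketch of minibatch DP-SGD with the Gaussian mechanism. This is a generic illustration, not the paper's exact algorithm: the clipping threshold, noise multiplier, and the toy mean-estimation loss are all illustrative choices, and no privacy accounting is performed.

```python
import numpy as np

def dp_sgd(grad_fn, x0, n_steps, batch_size, data, lr=0.1,
           clip=1.0, noise_mult=1.0, rng=None):
    """Illustrative minibatch DP-SGD: per-example gradients are clipped
    to L2 norm `clip`, averaged, and perturbed with Gaussian noise
    (Gaussian mechanism) before each update."""
    rng = np.random.default_rng(rng)
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        # Sample a minibatch and compute per-example gradients.
        batch = data[rng.choice(len(data), size=batch_size, replace=False)]
        grads = np.stack([grad_fn(x, z) for z in batch])
        # Clip each gradient to bound per-example sensitivity.
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        # Add Gaussian noise scaled to the clipped sensitivity.
        noise = rng.normal(0.0, noise_mult * clip, size=x.shape) / batch_size
        x -= lr * (grads.mean(axis=0) + noise)
    return x

# Toy Lipschitz-style problem: mean estimation with loss 0.5*||x - z||^2,
# so grad_fn(x, z) = x - z. All constants here are arbitrary.
rng = np.random.default_rng(0)
data = rng.normal(0.3, 0.1, size=(200, 5))
x_hat = dp_sgd(lambda x, z: x - z, np.zeros(5), n_steps=300,
               batch_size=32, data=data, lr=0.2, rng=1)
```

The only privacy-relevant steps are the per-example clipping (which bounds each record's influence) and the noise injection; everything else is ordinary SGD.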
📝 Abstract
We study the running time, in terms of first-order oracle queries, of differentially private empirical/population risk minimization of Lipschitz convex losses. We first consider the setting where the loss is non-smooth and the optimizer interacts with a private proxy oracle, which sends only private messages about a minibatch of gradients. In this setting, we show that expected running time $Ω\big(\min\{\frac{\sqrt{d}}{α^2}, \frac{d}{\log(1/α)}\}\big)$ is necessary to achieve $α$ excess risk on problems of dimension $d$ when $d \geq 1/α^2$. Upper bounds via DP-SGD show these results are tight when $d > \tilde{Ω}(1/α^4)$. We further show our lower bound can be strengthened to $Ω\big(\min\{\frac{d}{\bar{m}α^2}, \frac{d}{\log(1/α)}\}\big)$ for algorithms which use minibatches of size at most $\bar{m} < \sqrt{d}$. We next consider smooth losses, where we relax the private oracle assumption and give lower bounds under only the condition that the optimizer is private. Here, we lower bound the expected number of first-order oracle calls by $\tilde{Ω}\big(\frac{\sqrt{d}}{α} + \min\{\frac{1}{α^2}, n\}\big)$, where $n$ is the size of the dataset. Modifications to existing algorithms show this bound is nearly tight. Compared to non-private lower bounds, our results show that differentially private optimizers pay a dimension-dependent runtime penalty. Finally, as a natural extension of our proof technique, we show lower bounds in the non-smooth setting for optimizers interacting with information-limited oracles. Specifically, if the proxy oracle transmits at most $Γ$ bits of information about the gradients in the minibatch, then $Ω\big(\min\{\frac{d}{α^2 Γ}, \frac{d}{\log(1/α)}\}\big)$ oracle calls are needed. This result shows fundamental limitations of gradient quantization techniques in optimization.
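To make the information-limited oracle concrete, here is a hedged sketch of a standard unbiased stochastic quantizer: each coordinate of the gradient is rounded to one of $2^b$ levels, so a $d$-dimensional gradient is transmitted with $Γ = d \cdot b$ bits per query. This is a generic construction used to illustrate the $Γ$-bit constraint, not the specific oracle analyzed in the paper; the range parameter `scale` is an assumption.

```python
import numpy as np

def quantize_gradient(g, bits_per_coord=1, scale=1.0, rng=None):
    """Unbiased stochastic quantizer: each coordinate of g (assumed to
    lie in [-scale, scale]) is randomly rounded to one of
    2**bits_per_coord levels, so transmitting the quantized gradient
    costs d * bits_per_coord bits."""
    rng = np.random.default_rng(rng)
    levels = 2 ** bits_per_coord - 1
    # Map coordinates from [-scale, scale] to [0, 1].
    t = (np.clip(g, -scale, scale) + scale) / (2 * scale)
    low = np.floor(t * levels)
    p = t * levels - low              # probability of rounding up
    q = (low + (rng.random(g.shape) < p)) / levels
    # Map back to the original range; E[output] = g on [-scale, scale].
    return q * 2 * scale - scale

# With 1 bit per coordinate, each output coordinate is exactly ±scale.
g = np.array([0.2, -0.5])
q = quantize_gradient(g, bits_per_coord=1, rng=0)
```

Rounding up with probability $p = t \cdot \text{levels} - \lfloor t \cdot \text{levels} \rfloor$ makes the quantizer unbiased, which is why such schemes preserve SGD convergence in expectation; the lower bound above says this cannot fully escape the variance-induced query blow-up when $Γ$ is small.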