🤖 AI Summary
This paper challenges the prevailing overestimation of the practical threat posed by privacy attacks against machine learning models, namely membership inference, property inference, and model inversion/reconstruction attacks. Through conceptual analysis, it systematically examines the feasibility, data dependencies, and technical limitations of these attacks under realistic deployment conditions. The study finds that most attacks rest on stringent assumptions (e.g., strong white-box access, specific data distributions, or auxiliary side information), so their success rates diminish substantially under real-world constraints. Crucially, releasing a model does not inherently imply leakage of its training data; the “model-as-leakage” assumption lacks broad empirical or theoretical justification. The primary contribution is a systematic delineation of the boundary conditions under which privacy attacks remain viable, offering a more rigorous, context-aware framework for privacy risk assessment. This work provides academically sound, operationally relevant foundations for regulatory policy formulation and industry-grade defensive strategy design.
📝 Abstract
In several jurisdictions, the regulatory framework on the release and sharing of personal data is being extended to machine learning (ML). The implicit assumption is that disclosing a trained ML model entails a privacy risk for any personal data used in training comparable to directly releasing those data. However, given a trained model, it is necessary to mount a privacy attack to make inferences on the training data. In this concept paper, we examine the main families of privacy attacks against predictive and generative ML, including membership inference attacks (MIAs), property inference attacks, and reconstruction attacks. Our discussion shows that most of these attacks seem less effective in the real world than a prima facie interpretation of the related literature might suggest.
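To make the assumptions discussed above concrete, the sketch below illustrates one of the simplest membership inference attacks, a loss-threshold rule. It is an illustration, not the paper's method: it assumes the adversary can query per-example confidence scores from the target model and can calibrate a threshold on auxiliary data (here, for brevity, the threshold is taken from the pooled losses). The function name `per_example_loss` and the synthetic setup are purely illustrative.

```python
# Illustrative only: a minimal loss-threshold membership inference attack (MIA).
# Assumes the adversary can obtain per-example confidence scores from the
# target model -- one of the strong access assumptions the paper questions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data: half is used for training ("members"), half is held out.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_mem, y_mem = X[:1000], y[:1000]          # training data (members)
X_non, y_non = X[1000:], y[1000:]          # held-out data (non-members)

target = LogisticRegression(max_iter=1000).fit(X_mem, y_mem)

def per_example_loss(model, X, y):
    """Cross-entropy loss of the target model on each example."""
    p = model.predict_proba(X)[np.arange(len(y)), y]
    return -np.log(np.clip(p, 1e-12, None))

loss_mem = per_example_loss(target, X_mem, y_mem)
loss_non = per_example_loss(target, X_non, y_non)

# Attack rule: flag an example as "member" when its loss falls below a
# threshold; here the threshold is simply the median of the pooled losses.
threshold = np.median(np.concatenate([loss_mem, loss_non]))
tpr = np.mean(loss_mem < threshold)        # members correctly flagged
fpr = np.mean(loss_non < threshold)        # non-members wrongly flagged
print(f"attack TPR={tpr:.2f}  FPR={fpr:.2f}  advantage={tpr - fpr:.2f}")
```

For a well-generalized model like this one, the loss gap between members and non-members is small and the attack advantage is close to zero, which echoes the paper's broader point that such attacks often perform near chance outside of favorable, overfitted settings.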