🤖 AI Summary
Existing membership inference attacks (MIAs) against tabular generative models exploit different privacy leakage signals, and a privacy auditor must commit to a single attack without knowing in advance which one will perform best, making audit results unreliable in practice. Method: The paper frames attack selection as a decision-theoretic problem under uncertainty and proposes a prior-free, unsupervised ensemble attack framework. It integrates heterogeneous baseline MIAs, each exploiting a different privacy leakage signal, via weighted fusion and rank aggregation, hedging against the uncertainty inherent in committing to any single attack. Contribution/Results: On the largest synthetic data privacy benchmark to date, no individual attack constitutes a strictly dominant strategy across generative model architectures and data domains, while the unsupervised ensembles offer empirically more robust, regret-minimizing strategies than any single attack under the studied threat model, demonstrating both practical utility and strong generalization.
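As an illustrative sketch only, not the paper's implementation, an unsupervised ensemble of per-record MIA scores could be fused roughly as below; the attack names, uniform weights, and score scales are assumptions introduced for the example.

```python
import numpy as np
from scipy.stats import rankdata

def ensemble_mia_scores(attack_scores, weights=None):
    """Fuse per-record membership scores from heterogeneous MIAs.

    attack_scores: dict mapping attack name -> 1-D array of scores,
    where higher means "more likely a training member". Scores are
    rank-normalized so attacks on different scales become comparable,
    then combined by an (optionally weighted) average.
    """
    names = list(attack_scores)
    n = len(next(iter(attack_scores.values())))
    # Rank-normalize each attack's scores into (0, 1].
    ranked = np.stack([rankdata(attack_scores[a]) / n for a in names])
    if weights is None:
        # Unsupervised setting: no membership labels to tune weights,
        # so fall back to a uniform weighting across attacks.
        weights = np.ones(len(names)) / len(names)
    return weights @ ranked  # one fused membership score per record

# Hypothetical usage: three baseline attacks scoring 1,000 candidate records.
rng = np.random.default_rng(0)
scores = {
    "density_attack": rng.normal(size=1000),
    "distance_attack": rng.normal(size=1000),
    "classifier_attack": rng.normal(size=1000),
}
fused = ensemble_mia_scores(scores)
```

Rank aggregation is used here instead of raw score averaging so that no single attack dominates the fusion merely because its scores live on a larger numeric scale.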
📝 Abstract
Membership Inference Attacks (MIAs) have emerged as a principled framework for auditing the privacy of synthetic data generated by tabular generative models, and many diverse attacks have been proposed, each exploiting a different privacy leakage signal. However, in realistic threat scenarios, an adversary must choose a single method without an a priori guarantee that it will be the empirically highest-performing option. We study this challenge as a decision-theoretic problem under uncertainty and conduct the largest synthetic data privacy benchmark to date. We find that no MIA constitutes a strictly dominant strategy across a wide variety of model architectures and dataset domains under our threat model. Motivated by these findings, we propose ensemble MIAs and show that unsupervised ensembles built on individual attacks offer empirically more robust, regret-minimizing strategies than any individual attack.
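To make the regret framing concrete, the snippet below computes each strategy's per-setting regret, i.e., its gap to the best attack in hindsight. The AUC table is entirely hypothetical and only illustrates why an ensemble can achieve low worst-case regret even when no single attack dominates.

```python
import numpy as np

# Hypothetical AUC table: rows are strategies (including an ensemble),
# columns are benchmark settings (generative model x dataset). Values are made up.
strategies = ["attack_A", "attack_B", "attack_C", "ensemble"]
auc = np.array([
    [0.71, 0.55, 0.63],
    [0.58, 0.69, 0.60],
    [0.62, 0.61, 0.70],
    [0.68, 0.66, 0.68],
])

# Regret of a strategy in a setting = best achievable AUC there minus its AUC.
# A strictly dominant attack would have zero regret in every setting.
regret = auc.max(axis=0) - auc
for name, r in zip(strategies, regret):
    print(f"{name}: max regret {r.max():.2f}, mean regret {r.mean():.2f}")
```

In this made-up table each individual attack wins one setting but pays a sizable regret elsewhere, while the ensemble is never best yet keeps its worst-case regret small, which is the sense in which an ensemble can be the regret-minimizing choice under uncertainty.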