🤖 AI Summary
Face recognition systems risk leaking information about their training data through membership inference attacks.
Method: This paper proposes MINT, the first large-scale privacy auditing framework for facial images. It features two discriminative architectures, an MLP and a CNN, that learn the activation-pattern disparities between member and non-member samples. MINT establishes a multi-source experimental framework spanning diverse databases and state-of-the-art (SOTA) face recognition models, enabling cross-dataset and cross-model evaluation.
Contribution/Results: Evaluated across six public databases comprising over 22 million face images, MINT achieves up to 90% membership inference accuracy, substantially outperforming existing baselines, and is the first study to empirically demonstrate the feasibility of membership inference in large-scale, realistic face recognition settings. By providing a deployable, scalable privacy assessment tool, MINT advances compliance auditing of training data for large language and vision models.
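The core idea, a discriminator trained to separate member from non-member samples based on the audited model's internal activations, can be sketched as below. This is a minimal illustration, not the paper's implementation: the activation vectors are synthetic stand-ins (a distribution shift between members and non-members is assumed for demonstration), and the one-hidden-layer MLP is a simplified version of the MLP branch described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 500  # hypothetical activation dimensionality and samples per class

# Synthetic stand-ins for activations of the Audited Model:
# member samples (seen during training) are assumed to show a shifted distribution.
members = rng.normal(0.6, 1.0, size=(n, d))
nonmembers = rng.normal(0.0, 1.0, size=(n, d))
X = np.vstack([members, nonmembers])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Shuffle and split into train / held-out test sets.
idx = rng.permutation(2 * n)
X, y = X[idx], y[idx]
split = int(0.8 * 2 * n)
Xtr, ytr, Xte, yte = X[:split], y[:split], X[split:], y[split:]

# One-hidden-layer MLP discriminator trained with plain gradient descent.
h = 16
W1 = rng.normal(0, 0.1, size=(d, h)); b1 = np.zeros(h)
W2 = rng.normal(0, 0.1, size=h);      b2 = 0.0
lr = 0.1
for _ in range(300):
    Z = np.maximum(Xtr @ W1 + b1, 0)           # ReLU hidden layer
    p = 1 / (1 + np.exp(-(Z @ W2 + b2)))       # sigmoid membership score
    g = (p - ytr) / len(ytr)                   # BCE gradient w.r.t. logits
    gW2 = Z.T @ g; gb2 = g.sum()
    gZ = np.outer(g, W2) * (Z > 0)             # backprop through ReLU
    gW1 = Xtr.T @ gZ; gb1 = gZ.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

# Membership prediction on held-out activations.
Zt = np.maximum(Xte @ W1 + b1, 0)
pred = (1 / (1 + np.exp(-(Zt @ W2 + b2))) > 0.5).astype(float)
acc = (pred == yte).mean()
print(f"membership inference accuracy: {acc:.2f}")
```

In the actual framework, the inputs would be activations extracted from a trained face recognition model rather than Gaussian samples, and the CNN branch would operate on spatially structured activation maps instead of flattened vectors.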
📝 Abstract
This article introduces the Membership Inference Test (MINT), a novel approach that aims to empirically assess whether given data was used during the training of AI/ML models. Specifically, we propose two MINT architectures designed to learn the distinct activation patterns that emerge when an Audited Model is exposed to data used during its training process. These architectures are based on Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs). The experimental framework focuses on the challenging task of Face Recognition, considering three state-of-the-art Face Recognition systems. Experiments are carried out using six publicly available databases, comprising over 22 million face images in total. Different experimental scenarios are considered depending on the context of the AI model under test. Our proposed MINT approach achieves promising results, with up to 90% accuracy, indicating its potential to recognize whether an AI model has been trained with specific data. The proposed MINT approach can serve to enforce privacy and fairness in several AI applications, e.g., revealing if sensitive or private data was used for training or tuning Large Language Models (LLMs).