A hierarchical approach for assessing the vulnerability of tree-based classification models to membership inference attack

📅 2025-02-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenge of evaluating the vulnerability of tree-based classification models to membership inference attacks (MIAs) without access to sensitive training data attributes. The proposed lightweight, hierarchical evaluation framework operates in two phases: (1) an ante-hoc, hyperparameter-based risk ranking that leverages cross-dataset-consistent hyperparameter sensitivity rules for zero-cost, pre-training risk screening; and (2) a post-hoc structural filtering stage—using metrics such as tree depth, number of leaf nodes, and impurity distribution—to drastically reduce the number of models requiring shadow-model-based evaluation. Its key contributions include the first interpretable, hyperparameter-level predictive model for MIA risk and empirical validation that model accuracy is not significantly correlated with privacy risk—enabling joint performance–privacy optimization. Experiments demonstrate that the method maintains high prediction accuracy while substantially reducing evaluation overhead.
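The post-hoc structural metrics the summary mentions (tree depth, number of leaf nodes, impurity distribution) are cheap to read off a trained model. A minimal sketch of that idea, assuming scikit-learn decision trees; the specific statistics and thresholds here are illustrative, not the paper's exact filtering criteria:

```python
# Illustrative sketch: cheap structural statistics from a trained
# scikit-learn decision tree that could serve as post-hoc MIA-risk
# indicators (deep trees with many pure leaves tend to memorise data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X, y)

tree = model.tree_
is_leaf = tree.children_left == -1  # leaf nodes have no children

metrics = {
    "max_depth": model.get_depth(),
    "n_leaves": model.get_n_leaves(),
    "mean_leaf_impurity": float(np.mean(tree.impurity[is_leaf])),
    "frac_pure_leaves": float(np.mean(tree.impurity[is_leaf] == 0.0)),
}
print(metrics)
```

Because these numbers come from the already-trained model, they give a second filtering stage without running any shadow-model attacks.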

📝 Abstract
Machine learning models can inadvertently expose confidential properties of their training data, making them vulnerable to membership inference attacks (MIA). While numerous evaluation methods exist, many require computationally expensive processes, such as training multiple shadow models. This article presents two new complementary approaches for efficiently identifying vulnerable tree-based models: an ante-hoc analysis of hyperparameter choices and a post-hoc examination of trained model structure. While these new methods cannot certify whether a model is safe from MIA, they provide practitioners with a means to significantly reduce the number of models that need to undergo expensive MIA assessment through a hierarchical filtering approach. More specifically, it is shown that the rank order of disclosure risk for different hyperparameter combinations remains consistent across datasets, enabling the development of simple, human-interpretable rules for identifying relatively high-risk models before training. While this ante-hoc analysis cannot determine absolute safety, since safety also depends on the specific dataset, it allows the elimination of unnecessarily risky configurations during hyperparameter tuning. Additionally, computationally inexpensive structural metrics serve as indicators of MIA vulnerability, providing a second filtering stage to identify risky models after training but before conducting expensive attacks. Empirical results show that hyperparameter-based risk prediction rules can achieve high accuracy in predicting the most at-risk combinations of hyperparameters across different tree-based model types, while requiring no model training. Moreover, target model accuracy is not seen to correlate with privacy risk, suggesting opportunities to optimise model configurations for both performance and privacy.
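The ante-hoc stage described above screens hyperparameter combinations before any model is trained. A hypothetical example of the kind of simple, human-interpretable rule the abstract describes; the paper's actual learned rules are not reproduced here, and the `looks_high_risk` helper and its thresholds are assumptions for illustration only:

```python
# Hypothetical pre-training screening rule (not the paper's actual rules).
# Unbounded depth combined with tiny leaves lets a tree memorise individual
# training records, so such configurations are flagged as relatively risky.
def looks_high_risk(params: dict) -> bool:
    """Flag hyperparameter combinations worth excluding during tuning."""
    unbounded_depth = params.get("max_depth") is None
    tiny_leaves = params.get("min_samples_leaf", 1) <= 1
    return unbounded_depth and tiny_leaves

candidate_grid = [
    {"max_depth": None, "min_samples_leaf": 1},  # unconstrained: flagged
    {"max_depth": 5, "min_samples_leaf": 10},    # constrained: passes
]
flags = [looks_high_risk(p) for p in candidate_grid]
print(flags)  # → [True, False]
```

Such rules cost nothing to apply, which is what allows the hierarchical scheme to reserve shadow-model attacks for the few configurations that survive both filters.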
Problem

Research questions and friction points this paper is trying to address.

assessing vulnerability
tree-based models
membership inference attack
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical filtering approach
Ante-hoc hyperparameter analysis
Post-hoc structural metrics
Richard J. Preen
Department of Computer Science and Creative Technologies, University of the West of England, Bristol, UK.
Jim Smith
Professor in Interactive Artificial Intelligence, University of the West of England