DP-MGTD: Privacy-Preserving Machine-Generated Text Detection via Adaptive Differentially Private Entity Sanitization

📅 2026-01-08
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the trade-off between privacy preservation and detection performance in machine-generated text detection. The authors propose DP-MGTD, a framework that enhances the distinguishability between human- and machine-generated texts by injecting differentially private noise. It introduces an adaptive entity sanitization algorithm built on a two-stage mechanism that first performs noisy frequency estimation and then dynamically allocates privacy budgets, applying the Laplace mechanism to numerical entities and the Exponential mechanism to textual entities to balance utility and privacy. Experimental results on MGTBench-2.0 show that the method achieves near-perfect detection accuracy while providing rigorous differential privacy guarantees, significantly outperforming non-private baselines.

📝 Abstract
The deployment of Machine-Generated Text (MGT) detection systems necessitates processing sensitive user data, creating a fundamental conflict between authorship verification and privacy preservation. Standard anonymization techniques often disrupt linguistic fluency, while rigorous Differential Privacy (DP) mechanisms typically degrade the statistical signals required for accurate detection. To resolve this dilemma, we propose DP-MGTD, a framework incorporating an Adaptive Differentially Private Entity Sanitization algorithm. Our approach utilizes a two-stage mechanism that performs noisy frequency estimation and dynamically calibrates privacy budgets, applying Laplace and Exponential mechanisms to numerical and textual entities respectively. Crucially, we identify a counter-intuitive phenomenon where the application of DP noise amplifies the distinguishability between human and machine text by exposing distinct sensitivity patterns to perturbation. Extensive experiments on the MGTBench-2.0 dataset show that our method achieves near-perfect detection accuracy, significantly outperforming non-private baselines while satisfying strict privacy guarantees.
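The abstract's core machinery, Laplace noise for numerical entities and the Exponential mechanism for textual entities under a shared privacy budget, can be illustrated with a minimal sketch. This is not the authors' code: the candidate pool, utility function, and uniform per-entity budget split are placeholder assumptions (the paper allocates budgets adaptively from noisy frequency estimates).

```python
# Illustrative sketch of per-entity DP sanitization (assumptions, not the
# paper's implementation): Laplace mechanism for numeric entities,
# Exponential mechanism to pick a surrogate for textual entities.
import math
import random


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Add Laplace(sensitivity / epsilon) noise via inverse-CDF sampling."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform on (-0.5, 0.5)
    return value - scale * math.copysign(math.log(1 - 2 * abs(u)), u)


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng):
    """Sample a candidate with probability proportional to
    exp(epsilon * utility / (2 * sensitivity))."""
    weights = [math.exp(epsilon * utility(c) / (2 * sensitivity))
               for c in candidates]
    r = rng.random() * sum(weights)
    for cand, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return cand
    return candidates[-1]


def sanitize(entities, total_epsilon, rng=None):
    """Perturb each (kind, value) entity under a uniform budget split.
    A uniform split stands in for the paper's adaptive allocation."""
    rng = rng or random.Random(0)
    eps = total_epsilon / max(len(entities), 1)
    out = []
    for kind, value in entities:
        if kind == "num":
            out.append(("num", laplace_mechanism(float(value), 1.0, eps, rng)))
        else:
            # Hypothetical surrogate pool and utility for textual entities.
            pool = ["PERSON_A", "PERSON_B", "ORG_X"]
            score = lambda c: 1.0 if c.startswith("PERSON") else 0.5
            out.append(("txt", exponential_mechanism(pool, score, 1.0, eps, rng)))
    return out
```

Under this sketch, a document's extracted entities (e.g. `[("num", 42), ("txt", "Alice")]`) are sanitized jointly, so the total privacy cost composes to `total_epsilon` across entities.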
Problem

Research questions and friction points this paper is trying to address.

Machine-Generated Text Detection
Privacy Preservation
Differential Privacy
Entity Sanitization
Authorship Verification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Differential Privacy
Machine-Generated Text Detection
Adaptive Privacy Budget
Entity Sanitization
Privacy-Preserving NLP
👥 Authors
Lionel Z. Wang
Nanyang Technological University
Yusheng Zhao
University of Science and Technology of China
Jiabin Luo
Peking University
Xinfeng Li
Nanyang Technological University
Lixu Wang
Northwestern University (Machine Learning, Data Privacy)
Yinan Peng
Hengxin Tech.
Haoyang Li
The Hong Kong Polytechnic University
Xiaofeng Wang
Nanyang Technological University
Wei Dong
Nanyang Technological University