PRIVMARK: Private Large Language Models Watermarking with MPC

📅 2025-09-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing watermarking techniques for large language model (LLM) content provenance pose privacy risks, as they typically require access to model parameters or training data. Method: We propose PRIVMARK, the first privacy-preserving watermarking framework for LLMs based on secure multi-party computation (MPC). PRIVMARK enables multiple parties to collaboratively generate watermarks without any single party learning the model weights, leveraging SecretFlow-SPU and the ABY3 backend to implement efficient cryptographic protocols tailored to the PostMark watermarking method. Contribution/Results: Experiments show that PRIVMARK matches the plaintext baseline in semantic coherence and in robustness against paraphrasing and watermark-removal attacks, while providing strong privacy guarantees. This work is the first application of MPC to LLM watermarking, establishing a foundation for privacy-aware content attribution.

📝 Abstract
The rapid growth of Large Language Models (LLMs) has highlighted the pressing need for reliable mechanisms to verify content ownership and ensure traceability. Watermarking offers a promising path forward, but it remains limited by privacy concerns in sensitive scenarios, as traditional approaches often require direct access to a model's parameters or its training data. In this work, we propose a secure multi-party computation (MPC)-based private LLM watermarking framework, PRIVMARK, to address these concerns. Concretely, we investigate PostMark (EMNLP'2024), one of the state-of-the-art LLM watermarking methods, and formulate its basic operations. Then, we construct efficient protocols for these operations using MPC primitives in a black-box manner. In this way, PRIVMARK enables multiple parties to collaboratively watermark an LLM's output without exposing the model's weights to any single computing party. We implement PRIVMARK using SecretFlow-SPU (USENIX ATC'2023) and evaluate its performance with the ABY3 (CCS'2018) backend. The experimental results show that PRIVMARK achieves semantically identical results compared to the plaintext baseline without MPC and resists paraphrasing and removal attacks with reasonable efficiency.
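For intuition about the underlying watermarking scheme, PostMark-style detection can be approximated as a presence check: given the secret list of watermark words selected for an output, count what fraction of those words appear in a candidate text and compare against a threshold. A minimal sketch follows; the word list, threshold, and whitespace tokenization are illustrative assumptions, not PostMark's actual implementation (which selects words with a neural embedder and inserts them via an LLM):

```python
# Toy PostMark-style presence detection. Assumption: real PostMark selects
# watermark words semantically and inserts them with an LLM; here we only
# sketch the detection side with naive whitespace tokenization.

def detect_watermark(text, watermark_words, threshold=0.5):
    """Return True if enough of the watermark words appear in the text."""
    tokens = set(text.lower().split())
    present = sum(1 for w in watermark_words if w in tokens)
    return present / len(watermark_words) >= threshold

words = ["lantern", "granite", "meadow", "orchard"]  # hypothetical secret list
watermarked = "the lantern by the granite wall lit the meadow"
plain = "the wall lit the field at dusk"
```

Here `detect_watermark(watermarked, words)` succeeds (3 of 4 words present) while `detect_watermark(plain, words)` fails, mirroring the presence-ratio test described for PostMark.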
Problem

Research questions and friction points this paper is trying to address.

Enables private watermarking of LLM outputs without exposing model parameters
Addresses privacy concerns in sensitive watermarking scenarios using MPC
Allows collaborative watermarking while protecting model weights from disclosure
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses secure multi-party computation for watermarking
Enables collaborative watermarking without exposing model weights
Achieves semantically identical results to the plaintext baseline under MPC
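The MPC principle behind these points can be illustrated with additive secret sharing over a ring, the basic building block of replicated-sharing protocols such as ABY3. The sketch below is illustrative only: the 64-bit ring and three-party split are assumptions for demonstration, and SecretFlow-SPU's actual protocols (share conversion, multiplication, fixed-point arithmetic) are far more involved.

```python
# Minimal sketch of 3-party additive secret sharing over Z_{2^64}.
# Assumption: this demonstrates the privacy principle only, not the
# ABY3 replicated-sharing protocol used by SecretFlow-SPU.
import secrets

MOD = 2**64  # ring size, chosen for illustration

def share(x, n=3):
    """Split x into n random shares that sum to x mod MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    """Recombine shares; requires all parties' contributions."""
    return sum(shares) % MOD

# Each party holds one share; no single share reveals the secret,
# e.g. a model weight or embedding entry encoded as an integer.
w = 1234567
assert reconstruct(share(w)) == w

# Addition of two secrets is purely local: parties add their own shares.
a, b = share(100), share(23)
c = [(ai + bi) % MOD for ai, bi in zip(a, b)]
assert reconstruct(c) == 123
```

Because linear operations stay local like this, the parties can jointly evaluate the watermarking computation while the model weights remain split across them at all times.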
Thomas Fargues
Télécom SudParis, France
Ye Dong
Research Fellow, National University of Singapore (NUS)
Secure Multi-Party Computation · Privacy-Preserving Machine Learning · Federated Learning
Tianwei Zhang
Nanyang Technological University, Singapore
Jin-Song Dong
National University of Singapore