🤖 AI Summary
Problem: Existing training-free watermarking methods for diffusion-based image generation are not robust to geometric attacks such as rotation, scaling, and translation (RST), and they offer limited embedding capacity. The result is frequent ID collisions and inadequate copyright protection.
Method: We propose MaXsive, a training-free, high-capacity, RST-robust watermarking framework that embeds watermarks in the initial noise space of diffusion models. To enable reliable geometric alignment and watermark recovery under RST transformations, we introduce an X-shaped template in place of the conventional ring-shaped pattern and integrate noise-space modulation to jointly optimize capacity and detection accuracy.
Contribution/Results: Our method achieves state-of-the-art performance on two major watermarking benchmarks. It is the first training-free approach to simultaneously achieve >128-bit payload capacity and strong RST robustness. Moreover, it effectively mitigates ID collisions and supports verifiable copyright attribution and provenance tracing of generated images.
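To make the link between payload capacity and ID collisions concrete, the following is a minimal sketch based on the standard birthday-bound approximation. It assumes user IDs are drawn uniformly at random from a space of 2^k values, which is an illustrative assumption and not something the paper states; the function name `collision_probability` is hypothetical.

```python
import math

def collision_probability(num_ids: int, capacity_bits: int) -> float:
    """Approximate probability that at least two of `num_ids` IDs coincide.

    Assumption (not from the paper): IDs are sampled uniformly at random
    from 2**capacity_bits possible values. Uses the birthday-bound
    approximation 1 - exp(-n(n-1) / (2N)).
    """
    space = 2.0 ** capacity_bits
    exponent = num_ids * (num_ids - 1) / (2.0 * space)
    # expm1 keeps precision when the exponent is extremely small.
    return -math.expm1(-exponent)

if __name__ == "__main__":
    users = 1_000_000
    for bits in (32, 64, 128):
        print(f"{bits:>3} bits, {users} users: {collision_probability(users, bits):.3e}")
```

Under this assumption, a small payload makes collisions among a large user base close to certain, while a payload on the order of 128 bits makes them negligible.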
📝 Abstract
The success of diffusion models in image synthesis has led to the release of very large commercial models, raising concerns about copyright protection and inappropriate content generation. Training-free diffusion watermarking provides a low-cost solution to these issues. However, prior work remains vulnerable to rotation, scaling, and translation (RST) attacks. Although some methods employ meticulously designed patterns to mitigate this issue, they often reduce watermark capacity, which can result in identity (ID) collision. To address these problems, we propose MaXsive, a training-free generative watermarking technique for diffusion models that offers both high capacity and robustness. MaXsive embeds the watermark in the initial noise of the diffusion model. Moreover, instead of using a meticulously designed repetitive ring pattern, we inject an X-shaped template that is used to recover from RST distortions. This design significantly increases robustness without sacrificing capacity, making ID collision less likely. The effectiveness of MaXsive has been verified on two well-known watermarking benchmarks under both verification and identification scenarios.
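As a rough illustration of what watermarking the initial noise can look like, the sketch below carries a bit string in the signs of a Gaussian-like latent and recovers it by majority vote. This is a generic sign-modulation toy and not the MaXsive algorithm: the function names, the latent shape, and the additive-noise model are all assumptions, and the diffusion sampling, inversion, and X-shaped template steps are omitted entirely.

```python
import numpy as np

def embed_bits(bits, shape, seed=0):
    """Build a noise tensor whose element signs carry `bits`, tiled to fill `shape`.

    Hypothetical helper: magnitudes are half-normal, so the result is
    Gaussian-distributed when the payload bits are uniformly random.
    """
    rng = np.random.default_rng(seed)
    n = int(np.prod(shape))
    magnitudes = np.abs(rng.standard_normal(n))
    tiled = np.resize(np.asarray(bits, dtype=np.int8), n)  # cyclic repetition
    signs = 2 * tiled - 1                                   # {0,1} -> {-1,+1}
    return (magnitudes * signs).reshape(shape)

def extract_bits(noise, num_bits):
    """Recover the payload by majority vote over all copies of each bit."""
    flat = noise.ravel()
    votes = np.zeros(num_bits)
    idx = np.arange(flat.size) % num_bits  # matches the cyclic tiling above
    np.add.at(votes, idx, np.sign(flat))
    return (votes > 0).astype(np.int8)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    payload = rng.integers(0, 2, size=128)
    latent = embed_bits(payload, (4, 64, 64))
    # Stand-in for generation plus imperfect inversion: additive Gaussian error.
    recovered_latent = latent + 0.5 * rng.standard_normal(latent.shape)
    print("bit accuracy:", np.mean(extract_bits(recovered_latent, 128) == payload))
```

A scheme like this has no protection against rotation, scaling, or translation, because the bit positions are tied to fixed latent coordinates. That is the gap a synchronization template such as the X shape described in the abstract is meant to close.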