🤖 AI Summary
Existing image watermarking methods struggle to embed and precisely localize watermarks in small or multiple localized regions, which limits their applicability in practical scenarios such as image splicing and editing. This paper introduces the Watermark Anything Model (WAM), a deep-learning framework for localized watermarking: an embedder imperceptibly modifies the input image, and an extractor segments the received image into watermarked and non-watermarked areas and recovers one or several hidden 32-bit messages from the regions found to be watermarked. The embedder and extractor are jointly trained at low resolution without perceptual constraints, then post-trained for imperceptibility and multiple watermarks. Experiments show imperceptibility and robustness competitive with the state of the art, notably against inpainting and splicing, even on high-resolution images; accurate localization of watermarked areas in spliced images; and extraction of distinct 32-bit messages with fewer than 1 bit error from multiple small regions covering no more than 10% of the image surface, even for 256×256 images.
📝 Abstract
Image watermarking methods are not tailored to handle small watermarked areas. This restricts applications in real-world scenarios where parts of the image may come from different sources or have been edited. We introduce a deep-learning model for localized image watermarking, dubbed the Watermark Anything Model (WAM). The WAM embedder imperceptibly modifies the input image, while the extractor segments the received image into watermarked and non-watermarked areas and recovers one or several hidden messages from the areas found to be watermarked. The models are jointly trained at low resolution and without perceptual constraints, then post-trained for imperceptibility and multiple watermarks. Experiments show that WAM is competitive with state-of-the-art methods in terms of imperceptibility and robustness, especially against inpainting and splicing, even on high-resolution images. Moreover, it offers new capabilities: WAM can locate watermarked areas in spliced images and extract distinct 32-bit messages with less than 1 bit error from multiple small regions, no larger than 10% of the image surface, even for small $256 \times 256$ images.
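To make the embed/extract pipeline concrete, here is a minimal toy sketch of the interface the abstract describes: embed a short bit string as an imperceptible perturbation restricted to a small region, then recover it from that region. This is a hypothetical stand-in, not WAM itself: WAM's embedder and extractor are jointly trained neural networks and the extractor is blind (it predicts the watermarked mask itself), whereas this sketch uses fixed random carrier patterns, is given the mask, and needs the original image as a reference.

```python
import numpy as np

def embed(img, msg_bits, mask, alpha=0.02):
    """Toy embedder: add a low-amplitude, message-dependent
    perturbation only inside the target mask (hypothetical stand-in
    for WAM's trained embedder network)."""
    rng = np.random.default_rng(0)
    # One fixed carrier pattern per bit, shared with the extractor.
    carriers = rng.standard_normal((len(msg_bits), *img.shape))
    signs = np.where(np.asarray(msg_bits) == 1, 1.0, -1.0)
    delta = np.tensordot(signs, carriers, axes=1) * alpha
    return img + delta * mask  # perturb the watermarked region only

def extract(img_wm, img_ref, n_bits, mask):
    """Toy extractor: correlate the masked residual with each carrier
    and read the bit off the sign. Unlike WAM, this needs the original
    image and the mask; WAM's extractor predicts the mask itself."""
    rng = np.random.default_rng(0)
    carriers = rng.standard_normal((n_bits, *img_wm.shape))
    residual = (img_wm - img_ref) * mask
    corr = carriers.reshape(n_bits, -1) @ residual.ravel()
    return (corr > 0).astype(int).tolist()

# A region covering ~10% of a 64x64 image, as in the paper's setting.
img = np.zeros((64, 64))
mask = np.zeros((64, 64))
mask[:20, :20] = 1.0
bits = [1, 0, 1, 1, 0, 0, 1, 0]
wm = embed(img, bits, mask)
decoded = extract(wm, img, len(bits), mask)
```

The correlation decoder works because each carrier is nearly orthogonal to the others over the masked pixels, so the sign of the matching carrier's correlation dominates; the amplitude `alpha` trades off imperceptibility against robustness, which is the same tension WAM's post-training for imperceptibility addresses with learned perturbations.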