🤖 AI Summary
Efficient watermark detection in generated images faces scalability bottlenecks under large-scale deployment. To address this, we propose QRMark, an end-to-end, adaptive detection system optimized for throughput and robustness. Our method partitions input images into tiles for parallel processing, employs a Reed–Solomon error-correcting code embedded in a QR-code-like structure to mitigate the accuracy degradation introduced by tiling, and introduces a resource-aware streaming scheduler that orchestrates GPU computation, data loading, and inter-stage kernel execution via fine-grained pipeline overlap. These techniques jointly enable high-throughput, low-latency inference without compromising watermark detection robustness. Experimental evaluation demonstrates an average 2.43× end-to-end inference speedup over sequential baselines, while maintaining detection accuracy across diverse generative models and perturbations. The system thus enhances practical deployability and horizontal scalability for real-world watermark monitoring at scale.
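The fine-grained pipeline overlap described above can be illustrated with a minimal sketch: while the detector processes tile *i*, a background worker prefetches tile *i+1*, hiding data-loading latency behind computation. The names `load_tile` and `detect_tile` are illustrative placeholders, not QRMark's actual API, and the sketch uses host threads as a stand-in for GPU streams.

```python
# Hypothetical sketch of tile-based workload interleaving: overlap the
# loading of the next tile with detection on the current one. In the real
# system the "compute" stage would be a GPU kernel launched on its own
# stream; here both stages are simulated with sleeps.
from concurrent.futures import ThreadPoolExecutor
import time

def load_tile(i):
    time.sleep(0.01)          # stand-in for disk / host-to-device transfer
    return f"tile-{i}"

def detect_tile(tile):
    time.sleep(0.01)          # stand-in for the watermark-detection kernel
    return f"bits({tile})"

def pipelined_detect(n_tiles):
    results = []
    with ThreadPoolExecutor(max_workers=1) as loader:
        nxt = loader.submit(load_tile, 0)       # prefetch the first tile
        for i in range(n_tiles):
            tile = nxt.result()                 # wait for the prefetched tile
            if i + 1 < n_tiles:
                nxt = loader.submit(load_tile, i + 1)  # prefetch during compute
            results.append(detect_tile(tile))
    return results
```

With loading and detection each taking roughly the same time per tile, this overlap approaches a 2× reduction in wall-clock time versus running the two stages strictly in sequence.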
📝 Abstract
Efficient and reliable detection of generated images is critical for the responsible deployment of generative models. Existing approaches primarily focus on improving detection accuracy and robustness under various image transformations and adversarial manipulations, yet they largely overlook the efficiency challenges of watermark detection across large-scale image collections. To address this gap, we propose QRMark, an efficient and adaptive end-to-end method for detecting embedded image watermarks. The core idea of QRMark is to combine QR-Code-inspired error correction with tailored tiling techniques to improve detection efficiency while preserving accuracy and robustness. At the algorithmic level, QRMark employs a Reed–Solomon error correction mechanism to mitigate the accuracy degradation introduced by tiling. At the system level, QRMark implements a resource-aware stream allocation policy that adaptively assigns more streams to GPU-intensive stages of the detection pipeline. It further employs a tile-based workload interleaving strategy to overlap data-loading overhead with computation and schedules kernels across stages to maximize efficiency. End-to-end evaluations show that QRMark achieves an average 2.43× inference speedup over the sequential baseline.
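The role of error correction in the tiling scheme can be sketched with a simplified stand-in: the watermark payload is spread across tiles together with one XOR parity tile, so the payload survives even if one tile's bits are unrecoverable (e.g., due to cropping or local distortion). This single-erasure parity code is a deliberately minimal substitute for the Reed–Solomon code the paper uses, which tolerates multiple corrupted tiles; the function names are hypothetical.

```python
# Hypothetical sketch: split a watermark payload across n tiles plus one
# XOR parity tile, then recover the payload after losing any one tile.
# A simplified stand-in for Reed-Solomon coding across tiles.
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode_tiles(payload: bytes, n_tiles: int) -> list[bytes]:
    """Split payload into n_tiles equal chunks and append an XOR parity tile."""
    chunk = -(-len(payload) // n_tiles)               # ceil division
    padded = payload.ljust(chunk * n_tiles, b"\x00")  # zero-pad to a multiple
    tiles = [padded[i * chunk:(i + 1) * chunk] for i in range(n_tiles)]
    tiles.append(reduce(xor_bytes, tiles))            # parity tile
    return tiles

def decode_tiles(tiles: list[bytes], lost_index: int) -> bytes:
    """Reassemble the payload, reconstructing the tile at lost_index
    by XORing all remaining tiles (data + parity)."""
    recovered = reduce(
        xor_bytes, (t for i, t in enumerate(tiles) if i != lost_index)
    )
    data = list(tiles[:-1])                           # drop the parity tile
    if lost_index < len(data):
        data[lost_index] = recovered
    return b"".join(data)
```

In QRMark the analogous mechanism lets per-tile detection errors be corrected at the payload level, which is what allows aggressive tiling without the accuracy loss that tiling alone would introduce.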