🤖 AI Summary
Diffusion models can inadvertently memorize training samples, posing privacy and copyright risks, yet existing detection approaches offer limited insight into where memorization appears within a generated image. This work characterizes local memorization geometrically as a coordinate-wise variance collapse and proposes a localization method based on coordinate-wise curvature differences: subtracting the curvature of an underfitted baseline (either the unconditional model or a less-trained version of the same model) from that of the target model separates memorization caused by overfitting from collapse caused by intrinsic data constraints. The authors further derive a score-difference proxy for the curvature difference, which also gives a geometric explanation for the widely used score-difference-based detection metric. Experiments on Stable Diffusion, evaluated against ground-truth memorization masks, show that the method outperforms the prior attention-based localization method.
📝 Abstract
Diffusion models can unintentionally memorize training samples, raising concerns about privacy and copyright. While recent methods can detect memorization, they often rely on global or model-specific signals and provide limited insight into where memorization appears within a generated image. We provide a geometric characterization of local memorization as a coordinate-wise variance collapse. However, such collapse can also arise from intrinsic data constraints rather than overfitting. To isolate overfitting-driven memorization, we propose curvature-difference methods that subtract the curvature of an underfitted baseline, either the unconditional model or a less-trained version of the target model. We further derive a score-difference proxy that provides a geometric explanation for the widely used score-difference-based detection metric. Experiments on Stable Diffusion, evaluated against ground-truth memorization masks, show that our method outperforms the prior attention-based localization method. Code is available at https://github.com/Gwangho99/mem-curv-diff.
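The curvature-difference idea can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use Stable Diffusion: it assumes two hypothetical diagonal Gaussians standing in for the target and baseline models, where the curvature of the log-density along coordinate i is -1/variance_i, so a collapsed variance shows up as a large curvature magnitude. All names, variances, and the threshold are illustrative assumptions.

```python
import numpy as np


def gaussian_score(x, mean, var):
    """Score (gradient of the log-density) of a diagonal Gaussian."""
    return -(x - mean) / var


def diag_curvature(score_fn, x, eps=1e-3):
    """Central-difference estimate of d score_i / d x_i for each coordinate i."""
    diag = np.empty_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        diag[i] = (score_fn(x + step)[i] - score_fn(x - step)[i]) / (2 * eps)
    return diag


# Hypothetical 8-coordinate toy "image"; every value here is an assumption.
mean = np.zeros(8)
# Baseline: low variance only where the data itself is constrained (coords 6, 7).
var_baseline = np.array([1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.01, 0.01])
# Target: the same intrinsic constraint plus extra collapse on coords 0, 1,
# used here as a stand-in for overfitting-driven memorization.
var_target = np.array([0.01, 0.01, 1.0, 1.0, 1.0, 1.0, 0.01, 0.01])


def score_target(x):
    return gaussian_score(x, mean, var_target)


def score_baseline(x):
    return gaussian_score(x, mean, var_baseline)


offset = 0.1          # evaluate slightly off the mean so the scores are nonzero
x = mean + offset
threshold = 10.0      # illustrative cutoff on curvature magnitude

curv_target = np.abs(diag_curvature(score_target, x))
curv_baseline = np.abs(diag_curvature(score_baseline, x))
curv_diff = curv_target - curv_baseline

raw_mask = curv_target > threshold   # also flags the intrinsic collapse
diff_mask = curv_diff > threshold    # keeps only collapse absent from the baseline

# In this toy, score = curvature * (x - mean), so the score difference is the
# curvature difference scaled by the offset and can be thresholded accordingly.
score_diff = np.abs(score_target(x) - score_baseline(x))
proxy_mask = score_diff > threshold * offset

print("raw curvature mask:       ", np.flatnonzero(raw_mask))
print("curvature-difference mask:", np.flatnonzero(diff_mask))
print("score-difference mask:    ", np.flatnonzero(proxy_mask))
```

In this toy setting the raw curvature of the target flags coordinates 0, 1, 6, and 7, while subtracting the baseline keeps only 0 and 1, because the collapse on 6 and 7 is shared by both models. The score difference recovers the same mask only because the Gaussians share a mean and the scores are linear; how the proxy is derived for real diffusion models is not stated in the abstract.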