RELICT: A Replica Detection Framework for Medical Image Generation

📅 2025-02-24
đŸ€– AI Summary
Medical image generation models can memorize training data, which poses patient privacy risks. To address this, we propose RELICT, a three-tiered (voxel-level, feature-level, and segmentation-level) replica detection framework designed for medical imaging. Our method combines voxel-wise similarity metrics, feature comparisons using a pretrained medical foundation model, and segmentation-level analysis to identify replicas, i.e. nearly identical copies of training images, within synthetic outputs. The framework was evaluated on non-contrast CT (NCCT) and time-of-flight magnetic resonance angiography (TOF-MRA) use cases, with expert visual scoring as the reference standard. On NCCT, image-level and feature-level measures reach a balanced accuracy of 1 at the optimal threshold. On TOF-MRA, no threshold yields perfect classification, and the segmentation-level analysis reaches a balanced accuracy of 0.79. This work addresses a neglected validation step for generative models in medical imaging by providing a standardized tool for replica detection that supports responsible and ethical medical image synthesis.

📝 Abstract
Despite the potential of synthetic medical data for augmenting and improving the generalizability of deep learning models, memorization in generative models can lead to unintended leakage of sensitive patient information and limit model utility. Thus, the use of memorizing generative models in the medical domain can jeopardize patient privacy. We propose a framework for identifying replicas, i.e. nearly identical copies of the training data, in synthetic medical image datasets. Our REpLIca deteCTion (RELICT) framework for medical image generative models evaluates image similarity using three complementary approaches: (1) voxel-level analysis, (2) feature-level analysis by a pretrained medical foundation model, and (3) segmentation-level analysis. Two clinically relevant 3D generative modelling use cases were investigated: non-contrast head CT with intracerebral hemorrhage (N=774) and time-of-flight MR angiography of the Circle of Willis (N=1,782). Expert visual scoring was used as the reference standard to assess the presence of replicas. We report the balanced accuracy at the optimal threshold to assess replica classification performance. The reference visual rating identified 45 of 50 and 5 of 50 generated images as replicas for the NCCT and TOF-MRA use cases, respectively. Image-level and feature-level measures perfectly classified replicas with a balanced accuracy of 1 when an optimal threshold was selected for the NCCT use case. A perfect classification of replicas for the TOF-MRA case was not possible at any threshold, with the segmentation-level analysis achieving a balanced accuracy of 0.79. Replica detection is a crucial but neglected validation step for the development of generative models in medical imaging. The proposed RELICT framework provides a standardized, easy-to-use tool for replica detection and aims to facilitate responsible and ethical medical image synthesis.
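The evaluation metric named in the abstract, balanced accuracy at the optimal threshold, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the rule "score >= threshold means replica", and the toy scores are hypothetical and are not taken from the RELICT code or data. Balanced accuracy is the mean of sensitivity and specificity, which makes it suitable when the classes are imbalanced, as in the 5 of 50 replicas reported for TOF-MRA.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    # Guard against division by zero if one class is absent.
    sensitivity = tp / max(np.sum(y_true), 1)
    specificity = tn / max(np.sum(~y_true), 1)
    return 0.5 * (sensitivity + specificity)


def best_threshold(scores, y_true):
    """Sweep every observed score as a candidate threshold.

    Assumption: a higher similarity score means "more likely a replica",
    so an image is called a replica when score >= threshold.
    Returns the first threshold with the highest balanced accuracy.
    """
    scores = np.asarray(scores, dtype=float)
    best_t, best_ba = None, -1.0
    for t in np.unique(scores):  # np.unique returns sorted values
        ba = balanced_accuracy(y_true, scores >= t)
        if ba > best_ba:
            best_t, best_ba = float(t), float(ba)
    return best_t, best_ba


# Hypothetical toy data: similarity of each synthetic image to its
# closest training image, and the expert rating (1 = replica).
scores = [0.95, 0.91, 0.88, 0.40, 0.35]
labels = [1, 1, 1, 0, 0]
threshold, ba = best_threshold(scores, labels)
print(threshold, ba)
```

Note that a threshold chosen on the same data it is evaluated on gives an optimistic estimate; the sketch only shows how the reported quantity is computed.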
Problem

Research questions and friction points this paper is trying to address.

Detects replicas in synthetic medical images.
Ensures patient privacy in generative models.
Improves validation of medical imaging models.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Voxel-level analysis for image similarity
Feature-level analysis via pretrained model
Segmentation-level analysis for replica detection
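The three levels listed above can be sketched with one similarity function each. This is a hedged illustration, not the RELICT implementation: the source does not state which metrics are used, so Pearson correlation (voxel level), cosine similarity (feature level), and the Dice coefficient (segmentation level) are stand-ins chosen here. The feature vectors would come from a pretrained medical foundation model, which is not run in this sketch, and `closest_training_match` is a hypothetical helper name.

```python
import numpy as np


def voxel_correlation(a, b):
    """Voxel level: Pearson correlation between two same-shaped volumes."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def cosine_similarity(u, v):
    """Feature level: cosine similarity between two embedding vectors."""
    u = np.asarray(u, dtype=float).ravel()
    v = np.asarray(v, dtype=float).ravel()
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def dice(mask_a, mask_b):
    """Segmentation level: Dice overlap between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    # Two empty masks are treated as identical by convention here.
    return float(2.0 * np.sum(a & b) / total) if total > 0 else 1.0


def closest_training_match(synthetic, training, similarity):
    """Return index and score of the most similar training item."""
    scores = [similarity(synthetic, t) for t in training]
    idx = int(np.argmax(scores))
    return idx, scores[idx]


# Hypothetical toy volumes: the synthetic image is a lightly
# perturbed copy of training volume 2, i.e. a replica.
rng = np.random.default_rng(0)
training = [rng.normal(size=(8, 8, 8)) for _ in range(4)]
synthetic = training[2] + 0.01 * rng.normal(size=(8, 8, 8))
idx, score = closest_training_match(synthetic, training, voxel_correlation)
print(idx, round(score, 4))
```

Each synthetic image gets one score per level (its best match in the training set), and those scores are what a threshold would then be applied to.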
Orhun Utku Aydin
CLAIM - CharitĂ© Lab for AI in Medicine, CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, CharitĂ©platz 1, 10117, Berlin, Germany
Alexander Koch
CLAIM - CharitĂ© Lab for AI in Medicine, CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, CharitĂ©platz 1, 10117, Berlin, Germany
Adam Hilbert
Charité UniversitÀtsmedizin, Berlin
Deep Learning, Medical Imaging, Data-efficient CNNs
Jana Rieger
CLAIM - CharitĂ© Lab for AI in Medicine, CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, CharitĂ©platz 1, 10117, Berlin, Germany
Felix Lohrke
CLAIM - CharitĂ© Lab for AI in Medicine, CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, CharitĂ©platz 1, 10117, Berlin, Germany
Fujimaro Ishida
Department of Neurosurgery, Mie Chuo Medical Center, 2158-5 Myojin-cho, 514-1101, Hisai, Tsu, Japan
Satoru Tanioka
Department of Neurosurgery, Mie University Graduate School of Medicine, 2-174 Edobashi, 514-8507, Tsu, Japan; CLAIM - CharitĂ© Lab for AI in Medicine, CharitĂ© – UniversitĂ€tsmedizin Berlin, corporate member of Freie UniversitĂ€t Berlin and Humboldt-UniversitĂ€t zu Berlin, CharitĂ©platz 1, 10117, Berlin, Germany
Dietmar Frey
Director, Charité Lab for AI in Medicine
Machine/Deep Learning, GAN, Stroke, Cerebrovascular Disease, Chronic Diseases