Assessing metadata privacy in neuroimaging

📅 2025-09-18
🤖 AI Summary
This study examines metadata privacy risks in neuroimaging data sharing by reviewing the tabular metadata of six heterogeneous, openly available OpenNeuro datasets for reidentification vulnerabilities. The authors use metaprivBIDS, a novel tool for the systematic assessment of privacy in tabular data (e.g., demographics, clinical scores). Results indicate minimal reidentification risk from clinical scores, whereas demographic variables (age, sex, race, income, and geolocation) are the principal privacy vulnerabilities. Serious vulnerabilities are rare, but minor issues appear in nearly all datasets and warrant mitigation. The authors outline practical measures to address these risks and enable safer data sharing.

📝 Abstract
The ethical and legal imperative to share research data without causing harm requires careful attention to privacy risks. While mounting evidence demonstrates that data sharing benefits science, legitimate concerns persist regarding the potential leakage of personal information that could lead to reidentification and subsequent harm. We reviewed metadata accompanying neuroimaging datasets from six heterogeneous studies openly available on OpenNeuro, involving participants across the lifespan, from children to older adults, with and without clinical diagnoses, and including associated clinical score data. Using metaprivBIDS (https://github.com/CPernet/metaprivBIDS), a novel tool for the systematic assessment of privacy in tabular data, we found that privacy is generally well maintained, with serious vulnerabilities being rare. Nonetheless, minor issues were identified in nearly all datasets and warrant mitigation. Notably, clinical score data (e.g., neuropsychological results) posed minimal reidentification risk, whereas demographic variables (age, sex, race, income, and geolocation) represented the principal privacy vulnerabilities. We outline practical measures to address these risks, enabling safer data sharing practices.
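The kind of check described above can be illustrated with k-anonymity, a common way to quantify reidentification risk in tabular data. The sketch below is not the metaprivBIDS API and does not reproduce the paper's analysis; the table, column names, and helper functions are hypothetical and chosen only to show why demographic columns act as quasi-identifiers.

```python
from collections import Counter

# Hypothetical participants table; columns and values are made up for illustration.
participants = [
    {"participant_id": "sub-01", "age": 24, "sex": "F", "score": 12},
    {"participant_id": "sub-02", "age": 24, "sex": "F", "score": 17},
    {"participant_id": "sub-03", "age": 31, "sex": "M", "score": 9},
    {"participant_id": "sub-04", "age": 31, "sex": "M", "score": 14},
    {"participant_id": "sub-05", "age": 67, "sex": "F", "score": 11},
]


def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def k_anonymity(rows, quasi_identifiers):
    """Return k, the size of the smallest group of indistinguishable rows."""
    return min(equivalence_class_sizes(rows, quasi_identifiers).values())


def unique_rows(rows, quasi_identifiers):
    """Return the IDs of rows whose quasi-identifier combination occurs only once."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    return [
        row["participant_id"]
        for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] == 1
    ]


qi = ["age", "sex"]  # assumed quasi-identifiers for this toy table
print(k_anonymity(participants, qi))  # 1: at least one participant is unique
print(unique_rows(participants, qi))  # ['sub-05']
```

A value of k = 1 means at least one participant can be singled out from the chosen columns alone, which is the situation a privacy review aims to flag.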
Problem

Research questions and friction points this paper is trying to address.

Assessing metadata privacy risks in neuroimaging data sharing
Evaluating reidentification vulnerabilities from demographic and clinical data
Developing practical measures for safer neuroimaging data sharing
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed metaprivBIDS tool for privacy assessment
Identified demographic variables (age, sex, race, income, and geolocation) as the principal privacy vulnerabilities
Proposed mitigation measures for safer data sharing
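The summary does not spell out the proposed measures. As one generic illustration, and an assumption rather than the authors' recommendation, exact ages can be generalized into ranges so that fewer participants are unique on that column. The function names and the 10-year bin width below are hypothetical.

```python
from collections import Counter


def bin_age(age, width=10):
    """Replace an exact age with a range label such as '20-29'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def smallest_group(values):
    """Size of the smallest group of identical values."""
    return min(Counter(values).values())


ages = [21, 24, 27, 33, 35, 38]  # made-up example values
binned = [bin_age(a) for a in ages]

print(smallest_group(ages))    # 1: every exact age is unique
print(smallest_group(binned))  # 3: each range now covers three participants
```

Coarser bins lower the reidentification risk but also reduce the analytic value of the variable, so the bin width is a trade-off to settle per dataset.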
Emilie Kibsgaard
Neurobiology Research Unit, Copenhagen University Hospital, Copenhagen, Denmark
Anita Sue Jwa
Department of Psychology, Stanford University, Stanford, CA, United States
Christopher J Markiewicz
Stanford University
Neuroimaging, Open Science
David Rodriguez Gonzalez
Advanced Computing and e-Science Group, Instituto de Física de Cantabria, Spain
Judith Sainz Pardo
Advanced Computing and e-Science Group, Instituto de Física de Cantabria, Spain
Russell A. Poldrack
Department of Psychology, Stanford University, Stanford, CA, United States
Cyril R. Pernet
Neurobiology Research Unit, Copenhagen University Hospital, Copenhagen, Denmark