Implementing a Scalable, Redeployable and Multitiered Repository for FAIR and Secure Scientific Data Sharing: The BIG-MAP Archive

📅 2025-12-17
🤖 AI Summary
To address scalability, secure access control, and FAIR compliance challenges in cross-organizational data sharing within large scientific consortia, this work designs and implements a cloud-native, multi-tier scientific data archiving system built on InvenioRDM. Methodologically, it introduces a novel, consortium-oriented hybrid RBAC+ABAC permission model with fine-grained authorization and a formalized upload workflow, enabling community-level isolation, sensitive-data protection, and automated publication to open repositories. The system integrates standardized metadata management, automatic format validation, and a publication-ready curation pipeline, ensuring both high customizability and cross-consortium reusability. Deployed for BIG-MAP and redeployable for other European consortia, including MaterialsCommons4.eu and RAISE, the system is designed to improve data findability, interoperability, and reuse while maintaining strict security and compliance requirements.

📝 Abstract
Data sharing in large consortia, such as research collaborations or industry partnerships, requires addressing both organizational and technical challenges. A common platform is essential to promote collaboration, facilitate exchange of findings, and ensure secure access to sensitive data. Key technical challenges include creating a scalable architecture, a user-friendly interface, and robust security and access control. The BIG-MAP Archive is a cloud-based, disciplinary, private repository designed to address these challenges. Built on InvenioRDM, it leverages platform functionalities to meet consortium-specific needs, providing a tailored solution compared to general repositories. Access can be restricted to members of specific communities or opened to an entire consortium, such as BATTERY 2030+, a consortium accelerating advanced battery technologies. Uploaded data and metadata are controlled via fine-grained permissions, allowing access to individual project members or the full initiative. The formalized upload process ensures data are formatted and ready for publication in open repositories when needed. This paper reviews the repository's key features, showing how the BIG-MAP Archive enables secure, controlled data sharing within large consortia. It ensures data confidentiality while supporting flexible, permissions-based access and can be easily redeployed for other consortia, including MaterialsCommons4.eu and RAISE (Resource for AI Science in Europe).
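The access rule described in the abstract (records restricted to one community, or open to the whole consortium) can be illustrated with a minimal Python sketch. This is not the BIG-MAP Archive or InvenioRDM implementation: the names `User`, `Record`, `Visibility`, and `can_read` are hypothetical and exist only to make the rule concrete.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    """Hypothetical visibility levels mirroring the two cases in the abstract."""
    COMMUNITY = "community"    # only members of the record's community
    CONSORTIUM = "consortium"  # every member of the consortium


@dataclass(frozen=True)
class User:
    name: str
    consortium_member: bool
    communities: frozenset[str] = frozenset()


@dataclass(frozen=True)
class Record:
    title: str
    community: str
    visibility: Visibility


def can_read(user: User, record: Record) -> bool:
    """Return True if the user may read the record under the sketched rule."""
    if not user.consortium_member:
        # Assumption: the repository is private, so outsiders never get access.
        return False
    if record.visibility is Visibility.CONSORTIUM:
        return True
    # Community-restricted record: the user must belong to that community.
    return record.community in user.communities


# Illustrative data; community and record names are made up.
alice = User("alice", True, frozenset({"project-a"}))
bob = User("bob", True, frozenset({"project-b"}))
restricted = Record("Raw cycling data", "project-a", Visibility.COMMUNITY)
shared = Record("Curated dataset", "project-a", Visibility.CONSORTIUM)
```

Under these assumptions, `alice` can read both records, while `bob` can read only the consortium-wide one. A real deployment would also need to cover write and curation permissions, which this sketch leaves out.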
Problem

Research questions and friction points this paper is trying to address.

Designing a scalable, secure repository for scientific data sharing
Enabling controlled access within large research consortia
Supporting flexible redeployment for diverse collaborative initiatives
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cloud-based disciplinary private repository for secure data sharing
Built on InvenioRDM with fine-grained access control permissions
Redeployable scalable architecture tailored for large consortia
Valeria Granata
Theory and Simulation of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
Francois Liot
Theory and Simulation of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
Xing Wang
PSI Center for Scientific Computing, Theory and Data, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
Steen Lysgaard
Department of Energy Conversion and Storage, Technical University of Denmark, DK 2800 Kgs. Lyngby, Denmark
Ivano E. Castelli
Department of Energy Conversion and Storage, Technical University of Denmark, DK 2800 Kgs. Lyngby, Denmark
Tejs Vegge
Professor, Technical University of Denmark
Director of CAPeX - Pioneer Center for Accelerating P2X Materials Discovery
Nicola Marzari
Theory and Simulation of Materials (THEOS) and National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; PSI Center for Scientific Computing, Theory and Data, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
Giovanni Pizzi
Laboratory for Materials Simulations, Paul Scherrer Institute (PSI), Villigen PSI, Switzerland
Solid-state Physics, Materials Science, Materials simulations