🤖 AI Summary
To address the challenge of trustworthy cross-organizational metadata discovery and verification in trusted data spaces, this paper proposes a semantic metadata catalog architecture integrating W3C Verifiable Credentials (VCs), semantic compliance validation (RDF/SHACL), and graph-database indexing (Neo4j). The system enables end-to-end management of DCAT-AP metadata and introduces the first implementation supporting publication, discovery, and cryptographically verifiable validation of Verifiable Presentations (VPs). It natively complies with Gaia-X standards and supports dynamic extension of semantic schemas. Deployed end-to-end in the German Culture Dataspace using the Eclipse Dataspace Components Connector, the system demonstrates efficient validation and semantic retrieval of over ten thousand VPs. Experimental results confirm significant improvements in metadata authenticity, security, and interoperability across heterogeneous organizations.
📝 Abstract
In dataspaces, federation services provide key functions such as enabling participating organizations to establish mutual trust and helping them discover data and services available for consumption. Discovery is enabled by a catalogue, where participants publish metadata describing themselves and their data and service offerings as Verifiable Presentations (VPs), so that other participants can query them. This paper presents the Eclipse Cross Federation Services Components (XFSC) Catalogue, which originated as a catalogue reference implementation for the Gaia-X federated cloud service architecture but is applicable to any metadata that must be trustworthy. The implementation provides basic lifecycle management for DCAT-style metadata records and schemas. It validates submitted VPs for cryptographic integrity and trustworthiness, and for conformance to an extensible collection of semantic schemas. The claims in the latest versions of valid VP submissions are extracted into a searchable graph database. The implementation scales to large numbers of records and is secure by design. Filling the catalogue with content in a maintainable way requires bindings to the sources of data and service offerings: connectors that expose resources hosted in an organization's IT infrastructure to the dataspace. We demonstrate the integration of our catalogue with the widely used Eclipse Dataspace Components Connector, enabling real-world use cases in the German Culture Dataspace. In addition, we discuss potential extensions and upcoming integrations of the catalogue.
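The submission flow outlined in the abstract (integrity check, schema conformance check, keeping the latest valid version, extracting claims for search) can be illustrated with a minimal sketch. It is not the XFSC Catalogue's code or API. Every name in it (`Catalogue`, `submit`, `find`, `sign`, `SCHEMAS`, `ISSUER_KEY`) is hypothetical, an HMAC stands in for real Verifiable Presentation proof verification, a set of required properties stands in for SHACL shapes, and an in-memory list of triples stands in for the graph database.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass, field

# Hypothetical shared secret standing in for issuer key material.
# A real catalogue would verify asymmetric VC/VP proofs instead.
ISSUER_KEY = b"demo-key"

# Hypothetical schema registry: required properties per type,
# standing in for an extensible collection of SHACL shapes.
SCHEMAS = {"ServiceOffering": {"id", "title", "providedBy"}}


def sign(claims: dict) -> str:
    """Produce a toy proof over a canonical JSON serialization of the claims."""
    payload = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


@dataclass
class Catalogue:
    # subject id -> (version, claims) of the latest valid submission
    latest: dict = field(default_factory=dict)

    def submit(self, presentation: dict) -> bool:
        """Accept a presentation only if it passes all checks and is newer."""
        claims, proof = presentation["claims"], presentation["proof"]
        # Step 1: integrity check (toy HMAC in place of real proof verification).
        if not hmac.compare_digest(sign(claims), proof):
            return False
        # Step 2: conformance check against the schema registered for the type.
        required = SCHEMAS.get(claims.get("type"))
        if required is None or not required.issubset(claims):
            return False
        # Step 3: keep only the latest version per subject.
        subject, version = claims["id"], presentation["version"]
        if subject in self.latest and self.latest[subject][0] >= version:
            return False
        self.latest[subject] = (version, claims)
        return True

    def triples(self) -> list:
        """Flatten the latest claims into (subject, predicate, object) triples."""
        return [
            (subject, key, value)
            for subject, (_, claims) in self.latest.items()
            for key, value in claims.items()
            if key != "id"
        ]

    def find(self, predicate: str, value: str) -> list:
        """Return subjects whose latest claims contain predicate == value."""
        return sorted(s for s, p, o in self.triples() if p == predicate and o == value)
```

In this sketch, a participant would build a `claims` dictionary, attach `sign(claims)` as the proof together with a version number, and call `submit`. A tampered or non-conforming submission is rejected, and a newer version of the same subject replaces the older claims in query results.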