AI Summary
To address high barriers to knowledge graph (KG) construction, opaque workflows, challenges in ontology alignment, and poor reproducibility, this paper introduces PyRML, the first native Python RML mapping execution engine. PyRML tightly integrates declarative RML semantics with software engineering best practices, enabling unified mapping definition, execution, and unit testing. Leveraging Pandas and RDFLib, it supports programmable, reusable, and verifiable transformation of structured and semi-structured data into RDF. Its modular architecture enforces ontology constraints and ensures workflow reproducibility. Experimental evaluation demonstrates that PyRML significantly reduces KG construction complexity and improves mapping development efficiency and maintainability. It also demonstrates transparent, ontology-aligned, and reusable data integration in domains including climate science and cultural heritage.
Abstract
Knowledge Graphs (KGs) are increasingly adopted as a foundational technology for integrating heterogeneous data in domains such as climate science, cultural heritage, and the life sciences. Declarative mapping languages like R2RML and RML have played a central role in enabling scalable and reusable KG construction, offering a transparent means of transforming structured and semi-structured data into RDF. In this paper, we present PyRML, a lightweight, Python-native library for building Knowledge Graphs through declarative mappings. PyRML supports core RML constructs and provides a programmable interface for authoring, executing, and testing mappings directly within Python environments. It integrates with popular data and semantic web libraries (e.g., Pandas and RDFLib), enabling transparent and modular workflows. By lowering the barrier to entry for KG creation and fostering reproducible, ontology-aligned data integration, PyRML bridges the gap between declarative semantics and practical KG engineering.
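To make the idea of a declarative mapping concrete, the following is a minimal sketch of how a mapping held as plain data could be applied to tabular rows to produce RDF triples. It is not PyRML's API, and it deliberately uses only the Python standard library rather than Pandas and RDFLib. The `MAPPING` dictionary layout, the `expand_template` and `apply_mapping` helpers, the `example.org` IRIs, and the sample data are all assumptions made for illustration, loosely modeled on the subject-template and predicate-object structure of an RML triples map.

```python
import csv
import io
import re
from urllib.parse import quote

# Hypothetical mapping structure (an assumption, not RML syntax or PyRML's API):
# one subject IRI template, one class, and a list of predicate/column pairs.
MAPPING = {
    "subject_template": "http://example.org/station/{id}",
    "class": "http://example.org/ontology/Station",
    "predicate_object_maps": [
        {"predicate": "http://www.w3.org/2000/01/rdf-schema#label", "column": "name"},
        {"predicate": "http://example.org/ontology/elevation", "column": "elevation_m"},
    ],
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def expand_template(template, row):
    """Fill {column} placeholders with percent-encoded values from the row."""
    return re.sub(r"\{(\w+)\}", lambda m: quote(row[m.group(1)], safe=""), template)


def apply_mapping(mapping, rows):
    """Yield (subject, predicate, object) terms in N-Triples syntax for each row."""
    for row in rows:
        subject = "<" + expand_template(mapping["subject_template"], row) + ">"
        yield (subject, f"<{RDF_TYPE}>", f"<{mapping['class']}>")
        for pom in mapping["predicate_object_maps"]:
            value = row.get(pom["column"])
            if value:  # an empty cell produces no triple
                # Minimal escaping only: backslashes and double quotes.
                escaped = value.replace("\\", "\\\\").replace('"', '\\"')
                yield (subject, f"<{pom['predicate']}>", f'"{escaped}"')


# Invented sample data; the second row has an empty elevation cell.
CSV_DATA = "id,name,elevation_m\ns1,Monte Cimone,2165\ns2,Plateau Rosa,\n"

rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
triples = list(apply_mapping(MAPPING, rows))
for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

Because the mapping is ordinary data and the generator is a pure function of the mapping and the rows, the output can be checked with plain assertions. This is the kind of unit testing of mappings that the abstract describes, although PyRML's own testing interface may look quite different.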