Representing and querying data tensors in RDF and SPARQL

📅 2025-04-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of natively representing and efficiently querying tensor data within RDF-based systems, a need that arises when knowledge graphs (KGs) are combined with machine learning (ML). Methodologically, it (1) introduces a lightweight RDF tensor literal syntax and serialization format; (2) defines 36 tensor-specific SPARQL functions and four tensor-aware aggregates; and (3) implements an open-source SPARQL engine on Apache Jena, incorporating RDF Schema extensions, SPARQL 1.2 syntax enhancements, and optimized tensor indexing. Experimental results are reported to show improvements in both query efficiency and expressive power for joint queries over KGs and embedding spaces. The contribution includes a publicly available benchmark suite, exemplar tensor-augmented knowledge graphs, and a comprehensive validation framework. The work is presented as the first production-ready, tensor-aware knowledge graph infrastructure designed specifically to support hybrid ML-KG applications.

📝 Abstract
Embedding tensors in databases has recently gained in significance, due to the rapid proliferation of machine learning methods (including LLMs) which produce embeddings in the form of tensors. To support emerging use cases hybridizing machine learning with knowledge graphs, a robust and efficient tensor representation scheme is needed. We introduce a novel approach for representing data tensors as literals in RDF, along with an extension of SPARQL implementing specialized functionalities for handling such literals. The extension includes 36 SPARQL functions and four aggregates. To support this approach, we provide a thoroughly tested, open-source implementation based on Apache Jena, along with an exemplary knowledge graph and query set.
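To make the idea of a tensor stored as an RDF literal concrete, here is a minimal sketch in Python. It is not the paper's format: the JSON-style lexical form, the datatype IRI `http://example.org/datatypes#Tensor`, and the helper names `encode_tensor`, `decode_tensor`, and `to_turtle_literal` are all assumptions made for illustration.

```python
import json
from math import prod

# Hypothetical datatype IRI, used only for illustration; not taken from the paper.
TENSOR_DATATYPE = "http://example.org/datatypes#Tensor"


def encode_tensor(shape, data):
    """Serialize a dense tensor (flat, row-major data) into an assumed lexical form."""
    if prod(shape) != len(data):
        raise ValueError("shape does not match number of elements")
    return json.dumps({"shape": list(shape), "data": list(data)}, separators=(",", ":"))


def decode_tensor(lexical):
    """Parse the lexical form back into (shape, data) and check consistency."""
    obj = json.loads(lexical)
    shape, data = obj["shape"], obj["data"]
    if prod(shape) != len(data):
        raise ValueError("malformed tensor literal")
    return tuple(shape), data


def to_turtle_literal(lexical):
    """Render a typed literal in Turtle syntax, escaping backslashes and quotes."""
    escaped = lexical.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"^^<{TENSOR_DATATYPE}>'


lexical = encode_tensor((2, 2), [0.1, 0.2, 0.3, 0.4])
print(to_turtle_literal(lexical))
```

Keeping the shape next to the flat data lets a consumer validate a literal before operating on it, which is one plausible reason to prefer a dedicated datatype over a plain string.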
Problem

Research questions and friction points this paper is trying to address.

Representing data tensors as RDF literals
Extending SPARQL for tensor functionalities
Supporting machine learning with knowledge graphs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Represent tensors as RDF literals
Extend SPARQL with tensor functions
Open-source Jena-based implementation
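
The difference between a tensor function and a tensor aggregate can be sketched in plain Python. This is an illustrative approximation only: the JSON-style lexical form and the names `parse`, `cosine_similarity`, and `average_tensors` are assumptions, not the 36 functions or four aggregates defined in the paper, and no Jena API is used.

```python
import json
from math import sqrt


def parse(lexical):
    """Decode the assumed JSON lexical form into (shape, flat data)."""
    obj = json.loads(lexical)
    return tuple(obj["shape"]), [float(x) for x in obj["data"]]


def cosine_similarity(a_lex, b_lex):
    """Function-style operation: two tensor literals in, one scalar out."""
    (a_shape, a), (b_shape, b) = parse(a_lex), parse(b_lex)
    if a_shape != b_shape:
        raise ValueError("shape mismatch")
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero tensor")
    return dot / norm


def average_tensors(lexicals):
    """Aggregate-style operation: a group of tensor literals in, one tensor literal out."""
    parsed = [parse(lex) for lex in lexicals]
    if not parsed:
        raise ValueError("cannot aggregate an empty group")
    shape = parsed[0][0]
    if any(s != shape for s, _ in parsed):
        raise ValueError("shape mismatch within group")
    n = len(parsed)
    # Element-wise mean across all tensors in the group.
    mean = [sum(col) / n for col in zip(*(d for _, d in parsed))]
    return json.dumps({"shape": list(shape), "data": mean})


e1 = '{"shape": [3], "data": [1, 0, 0]}'
e2 = '{"shape": [3], "data": [0, 1, 0]}'
print(cosine_similarity(e1, e2))
print(average_tensors([e1, e2]))
```

In a SPARQL setting, the first kind of operation would apply per solution (for example in a `FILTER` or `BIND`), while the second would apply per group under `GROUP BY`.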
Authors
Piotr Marciniak
Warsaw University of Technology, pl. Politechniki 1, 00-661 Warsaw, Poland
Piotr Sowinski
Warsaw University of Technology, pl. Politechniki 1, 00-661 Warsaw, Poland; NeverBlink, ul. Wspólna 56, 00-684 Warsaw, Poland
Maria Ganzha
Associate Professor, Warsaw University of Technology
Agent-based computing, Multiagent system, Distributed system, Ontology, Semantic Data Processing