🤖 AI Summary
To address high query latency, poor scalability, and weak multi-source coordination in interactive exploration of large-scale heterogeneous knowledge graphs, this paper proposes a unified federated query system supporting document, relational, and graph data models. Our approach introduces three key contributions: (1) a semi-join decomposition technique that effectively curbs exponential blowup of intermediate results in path queries; (2) the first integration of query plan folding, semantic caching, and adaptive query planning—collectively enhancing execution efficiency and resource utilization; and (3) a novel distributed join algorithm enabling cross-modal collaborative analysis. Experimental evaluation demonstrates near-linear scalability across three dimensions—data volume, concurrent user count, and number of compute nodes—while sustaining sub-100ms end-to-end query latency.
📝 Abstract
Investigative workflows require interactive exploratory analysis on large heterogeneous knowledge graphs. Current databases are limited in their support for such tasks. This paper discusses the architecture of Siren Federate, a system that efficiently supports exploratory graph analysis by bridging document-oriented, relational, and graph models. Technical contributions include distributed join algorithms, adaptive query planning, query plan folding, semantic caching, and semi-join decomposition for path queries. Semi-join decomposition addresses the exponential growth of intermediate results in path-based queries. Experiments show that Siren Federate exhibits low latency and scales well with the amount of data, the number of users, and the number of computing nodes.
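To make the point about intermediate results concrete, the following is a minimal sketch of the general semi-join idea applied to a path query. It is not Siren Federate's implementation: the function names, the edge-list representation, and the toy layered graph are all assumptions made for illustration. The naive version materializes every partial path at each hop, while the semi-join version carries only the set of distinct node identifiers reachable so far.

```python
from collections import defaultdict


def naive_path_endpoints(start_nodes, hops):
    """Enumerate every full path hop by hop, then keep the endpoints.

    Returns the endpoint set and the number of paths materialized.
    (Hypothetical helper, for illustration only.)
    """
    paths = [(node,) for node in start_nodes]
    for edges in hops:
        adjacency = defaultdict(list)
        for src, dst in edges:
            adjacency[src].append(dst)
        # Every partial path is extended by every matching edge.
        paths = [p + (dst,) for p in paths for dst in adjacency[p[-1]]]
    return {p[-1] for p in paths}, len(paths)


def semi_join_path_endpoints(start_nodes, hops):
    """Chain semi-joins: keep only the distinct nodes reached at each hop.

    Returns the endpoint set and the largest frontier size seen.
    (Hypothetical helper, for illustration only.)
    """
    frontier = set(start_nodes)
    peak = len(frontier)
    for edges in hops:
        # Semi-join: filter edges by membership in the frontier and
        # project onto the destination column, discarding the path itself.
        frontier = {dst for src, dst in edges if src in frontier}
        peak = max(peak, len(frontier))
    return frontier, peak


# Toy data (assumed): four layers of two nodes, fully connected layer to layer.
layers = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]]
hops = [
    [(src, dst) for src in layers[i] for dst in layers[i + 1]]
    for i in range(len(layers) - 1)
]

naive_ends, path_count = naive_path_endpoints(layers[0], hops)
semi_ends, peak_frontier = semi_join_path_endpoints(layers[0], hops)

print(sorted(naive_ends), path_count)    # ['d0', 'd1'] 16
print(sorted(semi_ends), peak_frontier)  # ['d0', 'd1'] 2
```

Both functions return the same endpoints, but on this toy graph the naive version holds 16 paths (doubling with each hop) while the semi-join version never holds more than 2 node identifiers. The trade-off is that the semi-join version answers "which nodes are reachable along this path pattern" and does not return the paths themselves.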