Towards Operational Data Analytics Chatbots -- Virtual Knowledge Graph is All You Need

📅 2025-06-27
🤖 AI Summary
To address the challenges of querying NoSQL telemetry data in data center Operational Data Analytics (ODA), namely query complexity, implicit relational semantics, and poor real-time performance, this paper proposes an end-to-end chatbot system that combines a lightweight Virtual Knowledge Graph (VKG) with a Large Language Model (LLM). The system avoids costly physical knowledge graph construction by dynamically mapping NoSQL data into query-specific VKGs at runtime, and uses the LLM to translate natural language questions into SPARQL queries that retrieve data through the VKG. Its main contribution is this coupling of VKG and LLM, which reduces graph maintenance overhead and query latency. Experiments show a query accuracy of 92.5%, compared with 25% for direct NoSQL queries, and an 85% reduction in average query latency relative to previous work (from 20.36 s to 3.03 s). VKG sizes stay under 179 MiB, which makes the system suitable for deployment and real-time interaction over multivariate time-series data.
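The idea of a virtual graph can be illustrated with a minimal Python sketch. It is not the paper's implementation: the documents, the predicate names (`locatedIn`, `measuredOn`, `hasMetric`, `hasValue`), and the helper functions are all hypothetical, and a simple triple-pattern matcher stands in for a real SPARQL engine. The point it shows is that triples are derived from the stored records on demand instead of being materialized in a separate graph store.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical telemetry documents, standing in for records in a NoSQL store.
DOCUMENTS: List[Dict[str, object]] = [
    {"node": "node01", "rack": "rack1", "metric": "power_w", "value": 412.0},
    {"node": "node02", "rack": "rack1", "metric": "power_w", "value": 398.5},
    {"node": "node03", "rack": "rack2", "metric": "power_w", "value": 455.2},
]


def virtual_triples(docs: List[Dict[str, object]]) -> Iterator[Triple]:
    """Yield triples lazily from the documents; no physical graph is stored."""
    for i, doc in enumerate(docs):
        reading = f"reading{i}"
        yield (str(doc["node"]), "locatedIn", str(doc["rack"]))
        yield (reading, "measuredOn", str(doc["node"]))
        yield (reading, "hasMetric", str(doc["metric"]))
        yield (reading, "hasValue", str(doc["value"]))


def match(
    docs: List[Dict[str, object]],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in virtual_triples(docs)
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def nodes_in_rack(docs: List[Dict[str, object]], rack: str) -> List[str]:
    """Answer 'which nodes are in this rack?' with a single triple pattern."""
    return sorted({s for s, _, _ in match(docs, p="locatedIn", o=rack)})
```

Calling `nodes_in_rack(DOCUMENTS, "rack1")` walks the documents once and builds only the triples needed to answer that question, which is the property that keeps a virtual graph small compared with a fully materialized one.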

📝 Abstract
With generative artificial intelligence challenging computational scientific computing, data centers are experiencing unprecedented growth in both scale and volume. As a result, computing efficiency has become more critical than ever. Operational Data Analytics (ODA) relies on the collection of data center telemetry to improve efficiency, but has so far focused on real-time telemetry data visualization and post-mortem analysis. However, with NoSQL databases now serving as the default storage backend to support scalability, querying this data is challenging due to its schema-less nature, which requires domain knowledge to traverse relationships between data sources. Ontologies and Knowledge Graphs (KGs) can capture these relationships, but traditional KGs are costly to scale and have not been widely applied to multivariate time series. Virtual Knowledge Graphs (VKGs) offer a lightweight alternative by generating query-specific graphs at runtime. In this work, we present a full end-to-end ODA chatbot system that uses a Large Language Model (LLM) to generate SPARQL queries and a VKG for data retrieval. This approach achieves 92.5% accuracy, compared to 25% with direct NoSQL queries. The proposed methodology optimizes VKG construction and LLM inference, cutting the average query latency of previous work by 85% (from 20.36 s to 3.03 s) and keeping VKG sizes under 179 MiB. This performance makes the tool suitable for deployment and real-time interaction with ODA end-users.
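As a quick check of the reported figures, the short Python sketch below recomputes the relative latency reduction from the two latencies given in the abstract. The function and variable names are illustrative only; the speedup factor and the gain in percentage points are derived here from the reported numbers and are not stated in the paper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is smaller than `before`."""
    return (before - after) / before


# Latencies reported in the abstract, in seconds.
latency_cut = relative_reduction(20.36, 3.03)

# Derived quantities, computed here for illustration.
speedup = 20.36 / 3.03
accuracy_gain_points = 92.5 - 25.0

print(f"latency reduction: {latency_cut:.1%}")
print(f"speedup factor: {speedup:.2f}x")
print(f"accuracy gain: {accuracy_gain_points} percentage points")
```

The computed reduction is roughly 85%, consistent with the abstract, and corresponds to queries returning between six and seven times faster than in the previous work.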
Problem

Research questions and friction points this paper is trying to address.

Challenges in efficiently querying schema-less NoSQL telemetry data from data centers
Need for scalable knowledge graphs for multivariate timeseries data
Improving real-time interaction in Operational Data Analytics chatbots
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Virtual Knowledge Graphs for lightweight data retrieval
Leverages LLM for SPARQL query generation
Optimizes VKG construction and LLM inference
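The question-to-SPARQL step listed above can be sketched in Python under stated assumptions. The prompt wording, the `oda:` prefix, the ontology terms, and the functions `build_prompt`, `question_to_sparql`, and `fake_llm` are all hypothetical; the paper's actual prompts and model are not described on this page. The `llm` argument is any callable that maps a prompt string to generated text, and a stub is used here so the example runs without a model.

```python
from typing import Callable, Iterable

# Hypothetical prompt; the system's real prompt is not given in the source.
PROMPT_TEMPLATE = """You translate questions about data center telemetry into SPARQL.
Use only these ontology terms:
{terms}

Question: {question}
SPARQL:"""


def build_prompt(question: str, ontology_terms: Iterable[str]) -> str:
    """Fill the template with the allowed ontology terms and the question."""
    terms = "\n".join(f"- {t}" for t in ontology_terms)
    return PROMPT_TEMPLATE.format(terms=terms, question=question.strip())


def question_to_sparql(
    question: str,
    ontology_terms: Iterable[str],
    llm: Callable[[str], str],
) -> str:
    """Ask the model for a query and apply a very light sanity check."""
    query = llm(build_prompt(question, ontology_terms)).strip()
    if not query.upper().startswith(("SELECT", "PREFIX", "ASK")):
        raise ValueError("model output does not look like a SPARQL query")
    return query


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call; returns a fixed query."""
    return "  SELECT ?node WHERE { ?node oda:locatedIn oda:rack1 }\n"


if __name__ == "__main__":
    terms = ["oda:Node", "oda:Rack", "oda:locatedIn"]
    print(question_to_sparql("Which nodes are in rack 1?", terms, fake_llm))
```

Restricting the prompt to a known list of ontology terms is one plausible way to keep generated queries aligned with what the VKG can answer; whether the paper's system does this is an assumption of the sketch.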
Junaid Ahmed Khan
DEI Department, University of Bologna, Bologna, Italy
Hiari Pizzini Cavagna
DEI Department, University of Bologna, Bologna, Italy
Andrea Proia
DEI Department, University of Bologna, Bologna, Italy
Andrea Bartolini
Associate Professor, University of Bologna
Energy management, Thermal management, Near-Threshold Computing, High Performance Computing