SiriusBI: Building End-to-End Business Intelligence Enhanced by Large Language Models

📅 2024-11-09
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
Contemporary BI systems face three critical challenges in the LLM era: legacy functionalities inadequately support dynamic analytical requirements; single-turn SQL generation impedes iterative, clarifying multi-turn interactions; and cross-domain adaptation incurs prohibitively high customization costs. To address these, we propose SiriusBI—the first end-to-end, LLM-augmented BI system designed for industrial deployment. It introduces an application-oriented multi-turn dialogue query module, a dual-path SQL generation mechanism balancing accuracy and deployment efficiency, and a full-stack BI orchestration engine spanning data preparation, analysis, and visualization. Built on a cloud-native microservice architecture, SiriusBI has been deployed across Tencent’s financial, advertising, and cloud business units, achieving SQL accuracy rates of 97%, 89%, and 91%, respectively. As a standalone cloud service, it supports dozens of external enterprises, demonstrating industrial-grade practicality, robustness, and scalability.

📝 Abstract
The rapid advancement of AI technologies, particularly Large Language Models (LLMs), is establishing a new paradigm for Business Intelligence (BI). Despite pioneering work on enhancing BI systems with LLMs, we have identified three issues that arise when such systems are deployed in real industrial scenarios: interaction limitations, performance bottlenecks, and functionality deficiencies. In this paper, we present SiriusBI, an end-to-end business intelligence system designed to address all three issues simultaneously. First, we propose an intelligent, application-oriented module called multi-round dialogue with querying, which aims to overcome the interaction limitations prevalent in current BI solutions. Next, to mitigate the performance bottlenecks caused by scenario migration, we introduce two SQL generation methods that strike a balance between accuracy and deployment costs. Finally, to tackle the practical challenges posed by functionality deficiencies, we develop an end-to-end workflow that covers the entire BI process, ensuring that SiriusBI delivers a robust and complete set of functionalities. As an independent cloud service in Tencent's data platform, SiriusBI has been applied across Tencent's finance, advertising, and cloud sectors, providing services to dozens of enterprise clients. Experiments on real-world datasets and practical applications in industrial BI scenarios demonstrate the practicality and effectiveness of SiriusBI. Notably, SiriusBI achieves SQL generation accuracy rates of 97% for Tencent Finance, 89% for Tencent Advertisement, and 91% for Tencent Cloud.
Problem

Research questions and friction points this paper is trying to address.

Legacy BI systems lack functionality for LLM-era demands
Single-round SQL generation fails in multi-round clarification
High costs hinder cross-domain adaptation in BI solutions
Innovation

Methods, ideas, or system contributions that make the work stand out.

End-to-end architecture with multi-module coordination
Multi-round dialogue for SQL generation clarification
Data-conditioned SQL generation method selection strategy
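
The multi-round clarification idea listed above can be illustrated with a toy sketch. This is not SiriusBI's implementation, which this summary does not describe. It assumes a query needs three slots before SQL can be produced, and the names `QueryState`, `next_turn`, the slot names, and the `dt` date column are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueryState:
    """Slots that must be filled before SQL is generated (hypothetical schema)."""
    metric: Optional[str] = None
    table: Optional[str] = None
    start_date: Optional[str] = None

    def missing(self) -> List[str]:
        # Slot names still unset, in field-definition order.
        return [name for name, value in vars(self).items() if value is None]


def next_turn(state: QueryState) -> str:
    """Return a clarifying question while slots are missing, otherwise SQL."""
    gaps = state.missing()
    if gaps:
        return f"Could you specify the {gaps[0].replace('_', ' ')}?"
    # Illustration only: real code should validate identifiers and bind
    # values as parameters instead of interpolating them into the string.
    return (
        f"SELECT SUM({state.metric}) FROM {state.table} "
        f"WHERE dt >= '{state.start_date}'"
    )
```

Each call to `next_turn` either asks for the first missing slot or, once the state is complete, returns a query string. A single-round generator would instead have to guess the missing values.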
Jie Jiang
Department of Data Platform, TEG, Tencent Inc., China
Haining Xie
Department of Data Platform, TEG, Tencent Inc., China
Yu Shen
Department of Data Platform, TEG, Tencent Inc., China
Zihan Zhang
Department of Data Platform, TEG, Tencent Inc., China
Meng Lei
Department of Data Platform, TEG, Tencent Inc., China
Yifeng Zheng
The Hong Kong Polytechnic University
Privacy-aware Computing, Secure Networked Systems, AI Security and Privacy
Yide Fang
Department of Data Platform, TEG, Tencent Inc., China
Chunyou Li
Department of Data Platform, TEG, Tencent Inc., China
Danqing Huang
Microsoft
Natural Language Processing, Design Intelligence
Wentao Zhang
Institute of Physics, Chinese Academy of Sciences
photoemission, superconductivity, cuprate, htsc, time-resolved
Yang Li
Department of Data Platform, TEG, Tencent Inc., China
Xiaofeng Yang
Department of Data Platform, TEG, Tencent Inc., China
Bin Cui
School of Computer Science, Peking University, China
Peng Chen
Department of Data Platform, TEG, Tencent Inc., China