LUK: Empowering Log Understanding with Expert Knowledge from Large Language Models

πŸ“… 2024-09-03
πŸ›οΈ arXiv.org
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
To resolve the tension between scarce domain expertise in log understanding, the high inference cost of large language models (LLMs), and the limited knowledge of lightweight models, this paper proposes LUK, a framework that enhances a smaller pre-trained language model (PLM) with expert knowledge acquired from LLMs. LUK employs LLMs in multiple collaborating roles to distill operational and semantic expertise, then injects this knowledge into a lightweight BERT-style PLM via knowledge-augmented fine-tuning and two log-specific pre-training tasks: masked log reconstruction and anomaly discrimination. It introduces the first multi-expert collaborative knowledge distillation mechanism, effectively unifying LLM-level knowledge with PLM-level inference efficiency. Extensive experiments show that LUK achieves state-of-the-art performance on log-based anomaly detection and fault classification, significantly outperforming both vanilla PLM baselines and direct LLM invocation, while striking a favorable trade-off between accuracy and latency.
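The summary names the multi-role distillation but not its mechanics. Below is a minimal Python sketch of one plausible director/executor/evaluator loop; the role prompts, the `query_llm` stub, and the bounded refinement rounds are illustrative assumptions, not the paper's exact protocol.

```python
# Hypothetical sketch of a multi-expert collaboration loop for acquiring
# expert knowledge about a log. All prompts and the control flow are
# assumptions; `query_llm` stands in for any chat-completion LLM call.

def query_llm(prompt: str) -> str:
    """Stub for a chat-completion LLM call (e.g., an HTTP API client)."""
    raise NotImplementedError

def acquire_expert_knowledge(log: str, max_rounds: int = 2) -> str:
    # "Director" role: plan what domain expertise the log requires.
    plan = query_llm(
        "You are a senior operations engineer. For the log below, outline "
        f"the domain concepts and failure context needed to explain it.\n{log}"
    )
    knowledge = ""
    for _ in range(max_rounds):
        # "Executor" role: generate the expert knowledge following the plan.
        knowledge = query_llm(
            f"Following this plan:\n{plan}\n"
            f"Write an expert explanation of the log:\n{log}"
        )
        # "Evaluator" role: accept the explanation or return feedback.
        verdict = query_llm(
            f"Log:\n{log}\nExplanation:\n{knowledge}\n"
            "Is the explanation accurate and complete? Reply PASS, or give "
            "concrete feedback."
        )
        if verdict.strip().upper().startswith("PASS"):
            break
        # Fold the feedback into the plan for the next round.
        plan += f"\nEvaluator feedback: {verdict}"
    return knowledge
```

In this sketch the evaluator gates the executor's output, which is one simple way to realize collaboration among roles; the paper's actual interaction pattern may differ.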

πŸ“ Abstract
Logs play a critical role in providing essential information for system monitoring and troubleshooting. Recently, with the success of pre-trained language models (PLMs) and large language models (LLMs) in natural language processing (NLP), smaller PLMs (such as BERT) and LLMs (like GPT-4) have become the current mainstream approaches for log analysis. Despite the remarkable capabilities of LLMs, their higher cost and inefficient inference present significant challenges in leveraging the full potential of LLMs to analyze logs. In contrast, smaller PLMs can be fine-tuned for specific tasks even with limited computational resources, making them more practical. However, these smaller PLMs face challenges in understanding logs comprehensively due to their limited expert knowledge. To address the lack of expert knowledge and enhance log understanding for smaller PLMs, this paper introduces a novel and practical knowledge enhancement framework, called LUK, which acquires expert knowledge from LLMs automatically and then enhances the smaller PLM for log analysis with this expert knowledge. LUK can take full advantage of both types of models. Specifically, we design a multi-expert collaboration framework based on LLMs with different roles to acquire expert knowledge. In addition, we propose two novel pre-training tasks to enhance the log pre-training with expert knowledge. LUK achieves state-of-the-art results on different log analysis tasks, and extensive experiments demonstrate that expert knowledge from LLMs can be utilized more effectively to understand logs. Our source code and detailed experimental data are available at https://github.com/LeaperOvO/LUK.
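The abstract names two knowledge-enhanced pre-training tasks, which the summary above calls masked log reconstruction and anomaly discrimination. The PyTorch sketch below shows one plausible formulation: pairing each log with its distilled knowledge in a single BERT input, masking 15% of log tokens, and adding a binary [CLS] head are standard-practice assumptions, not the paper's verified recipe.

```python
# Hypothetical sketch of the two knowledge-augmented pre-training objectives.
import torch
import torch.nn as nn
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
disc_head = nn.Linear(model.config.hidden_size, 2)  # anomaly discrimination head

def pretraining_losses(log: str, knowledge: str, is_anomalous: bool) -> torch.Tensor:
    # Encode the log paired with its LLM-distilled expert knowledge.
    enc = tokenizer(log, knowledge, return_tensors="pt", truncation=True)
    labels = enc["input_ids"].clone()

    # Task 1: masked log reconstruction. Mask 15% of the log's non-special
    # tokens and train the PLM to recover them with the knowledge as context.
    special = torch.tensor(
        tokenizer.get_special_tokens_mask(
            labels[0].tolist(), already_has_special_tokens=True
        ),
        dtype=torch.bool,
    )
    # Restrict masking to the log segment (token_type_id == 0), so the model
    # reconstructs log tokens rather than the knowledge text.
    log_segment = enc["token_type_ids"][0] == 0
    mask = (torch.rand(labels.shape) < 0.15) & ~special & log_segment
    enc["input_ids"][mask] = tokenizer.mask_token_id
    labels[~mask] = -100  # compute MLM loss only on masked positions

    out = model(**enc, labels=labels, output_hidden_states=True)
    mlm_loss = out.loss

    # Task 2: anomaly discrimination. Classify from the [CLS] state whether
    # the knowledge-augmented log is anomalous.
    cls_state = out.hidden_states[-1][:, 0]
    target = torch.tensor([int(is_anomalous)])
    disc_loss = nn.functional.cross_entropy(disc_head(cls_state), target)

    return mlm_loss + disc_loss
```

Summing the two losses with equal weight is the simplest joint-training choice; the paper may weight or schedule the tasks differently.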
Problem

Research questions and friction points this paper is trying to address.

Large Language Models
Small Pre-trained Models
Log Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge Distillation
Multi-expert Collaboration
Log Analysis Enhancement
Authors
Lipeng Ma (Fudan University)
Weidong Yang (Professor of Computer Science; Big Data)
Sihang Jiang (Fudan University; Knowledge Graph, Large Language Models)
Ben Fei (School of Computer Science, Fudan University, Shanghai, China, 200433)
Mingjie Zhou (School of Computer Science, Fudan University, Shanghai, China, 200433)
Shuhao Li (Fudan University)
Bo Xu (School of Computer Science and Technology, Donghua University, Shanghai, China, 201620)
Yanghua Xiao (School of Computer Science, Fudan University, Shanghai, China, 200433)