Personalizing Large Language Models using Retrieval Augmented Generation and Knowledge Graph

📅 2025-05-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address hallucinations and temporal staleness that arise when large language models (LLMs) generate personalized responses from incomplete factual information, this paper proposes a framework that integrates retrieval-augmented generation (RAG) with a lightweight, dynamic personal knowledge graph (PK-G). It is the first to model real-time structured user data, such as calendar entries, as an incrementally updatable semantic knowledge graph embedded directly in the RAG pipeline. The method uses KG-driven fine-grained fact retrieval and personalized reasoning to improve the factual accuracy and temporal freshness of responses. Experiments show substantial gains in personalized question-answering accuracy over baseline LLMs that receive personal data as plain-text prompts, along with a moderate reduction in response time. The approach mitigates hallucinations and improves user-perceived relevance and trustworthiness.
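The pipeline the summary describes can be sketched in a few lines: calendar entries become triples in a small personal knowledge graph, updates replace stale facts rather than append to them, and a retriever selects only the relevant facts for the prompt. This is a minimal illustration, not the paper's implementation; all class, function, and field names here (`PersonalKG`, `upsert_event`, `retrieve`) are hypothetical.

```python
from dataclasses import dataclass

# One (subject, predicate, object) fact in the personal knowledge graph.
@dataclass(frozen=True, order=True)
class Triple:
    subject: str
    predicate: str
    obj: str

class PersonalKG:
    """Toy personal KG over calendar facts (illustrative only)."""

    def __init__(self):
        self.triples = set()

    def upsert_event(self, event_id, title, start, location):
        # Incremental update: drop stale facts about this event, add fresh ones,
        # so the graph always reflects the current calendar state.
        self.triples = {t for t in self.triples if t.subject != event_id}
        self.triples |= {
            Triple(event_id, "title", title),
            Triple(event_id, "starts_at", start),
            Triple(event_id, "located_at", location),
        }

    def retrieve(self, query):
        # Naive keyword match; the paper's retriever is more sophisticated.
        words = [w.strip("?.,!").lower() for w in query.split()]
        hits = {t.subject for t in self.triples
                if any(w and w in t.obj.lower() for w in words)}
        return sorted(t for t in self.triples if t.subject in hits)

def build_prompt(query, facts):
    # Only retrieved facts reach the LLM, instead of a full plain-text dump.
    fact_lines = "\n".join(f"- {t.subject} {t.predicate}: {t.obj}" for t in facts)
    return f"Known personal facts:\n{fact_lines}\n\nQuestion: {query}"

kg = PersonalKG()
kg.upsert_event("evt1", "Dentist appointment", "2025-05-16T09:00", "Main St Clinic")
# The appointment is rescheduled; the upsert replaces the stale start time.
kg.upsert_event("evt1", "Dentist appointment", "2025-05-16T10:00", "Main St Clinic")
facts = kg.retrieve("When is my dentist appointment?")
prompt = build_prompt("When is my dentist appointment?", facts)
```

The upsert-then-retrieve flow is what gives the structured approach its temporal freshness: a rescheduled event overwrites its old facts, so the prompt can never contain the stale start time.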

📝 Abstract
The advent of large language models (LLMs) has enabled numerous applications, including the generation of queried responses in chatbots and other conversational assistants. Trained on vast amounts of data, LLMs are prone to over-fitting and can generate extraneous or incorrect content, causing hallucinations in their output. One root cause of these problems is the lack of timely, factual, and personalized information supplied to the LLM. In this paper, we propose an approach that addresses these problems by introducing retrieval augmented generation (RAG) over knowledge graphs (KGs) to help the LLM generate personalized responses tailored to each user. KGs have the advantage of storing continuously updated factual information in a structured way. While our KGs can represent a variety of frequently updated personal data, such as calendar, contact, and location data, we focus on calendar data in this paper. Our experimental results show that our approach understands personal information and generates accurate responses significantly better than baseline LLMs that receive personal data as plain text, with a moderate reduction in response time.
Problem

Research questions and friction points this paper is trying to address.

Reducing hallucinations in LLM outputs by integrating timely, factual data
Personalizing LLM responses using retrieval augmented generation and knowledge graphs
Improving accuracy in handling frequently updated personal data like calendars
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Retrieval Augmented Generation for personalization
Integrates Knowledge Graphs for structured data
Focuses on calendar data for accurate responses
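The contrast the bullets draw, plain-text prompting versus structured KG facts, can be made concrete with a small sketch. This is an assumed illustration, not code from the paper; the function names and event fields are hypothetical.

```python
# Two sample calendar entries (hypothetical data for illustration).
events = [
    {"title": "Team standup", "start": "2025-05-16T09:30", "location": "Room 4"},
    {"title": "Flight to Berlin", "start": "2025-05-17T06:45", "location": "Gate B12"},
]

def as_plain_text(events):
    # Baseline from the paper's experiments: the calendar is dumped into the
    # prompt as free-form prose, which the LLM must parse implicitly.
    return " ".join(
        f"{e['title']} at {e['start']} in {e['location']}." for e in events
    )

def as_kg_facts(events):
    # KG-style serialization: one explicit (subject, predicate, object) fact
    # per line, which a retriever can filter before prompting the LLM.
    lines = []
    for i, e in enumerate(events):
        for pred in ("title", "start", "location"):
            lines.append(f"(event_{i}, {pred}, {e[pred]})")
    return "\n".join(lines)

plain = as_plain_text(events)
structured = as_kg_facts(events)
```

The structured form is what makes fine-grained retrieval possible: individual facts can be selected or discarded per query, whereas the plain-text dump goes into the prompt wholesale.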