🤖 AI Summary
Large language models (LLMs) deployed in Open Radio Access Networks (O-RAN) face challenges in factual consistency and multi-hop reasoning during autonomous optimization. Method: This work presents the first systematic comparative evaluation, within the O-RAN context, of three retrieval-augmented generation (RAG) paradigms: Vector RAG, GraphRAG, and Hybrid GraphRAG. A structured retrieval dataset is constructed from official O-RAN specifications, and performance is quantitatively assessed using metrics including faithfulness, answer relevance, context relevance, and factual correctness. A knowledge-graph-driven multi-hop retrieval mechanism is proposed to strengthen reasoning on complex tasks and improve knowledge-alignment accuracy. Results: GraphRAG improves context relevance by 7%, and Hybrid GraphRAG improves factual correctness by 8%; both outperform conventional Vector RAG. The study establishes a reproducible evaluation framework and an actionable technical pathway for RAG deployment in telecommunications.
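To make the evaluation setup concrete, here is a minimal sketch of how two of the metrics above can be scored for a single question-context-answer triple. The study itself relies on established, typically LLM-judged generation metrics; the token-overlap proxies below (and the example O-RAN question) are illustrative assumptions, not the paper's actual scoring method or data.

```python
# Toy lexical proxies for two RAG evaluation metrics. Real pipelines
# (e.g., RAGAS-style) use LLM judges; these overlap scores only show
# the shape of a metric-driven comparison across retrieval paradigms.

def _tokens(text: str) -> set[str]:
    return set(text.lower().split())

def context_relevance(question: str, context: str) -> float:
    """Fraction of question tokens covered by the retrieved context."""
    q = _tokens(question)
    return len(q & _tokens(context)) / len(q) if q else 0.0

def factual_correctness(answer: str, reference: str) -> float:
    """Token-level F1 between the generated answer and a gold reference."""
    a, r = _tokens(answer), _tokens(reference)
    if not a or not r:
        return 0.0
    overlap = len(a & r)
    precision, recall = overlap / len(a), overlap / len(r)
    return 2 * precision * recall / (precision + recall) if overlap else 0.0

# Hypothetical example; not drawn from the paper's dataset.
question = "which interface connects the near-rt ric to e2 nodes"
context = "the e2 interface connects the near-rt ric to e2 nodes such as the o-du"
answer = "the e2 interface connects the near-rt ric to e2 nodes"
reference = "the e2 interface links the near-rt ric with e2 nodes"

print(round(context_relevance(question, context), 2))   # → 0.89
print(round(factual_correctness(answer, reference), 2)) # → 0.75
```

Aggregating such per-question scores over a specification-derived question set is what allows the paradigms (Vector RAG, GraphRAG, Hybrid GraphRAG) to be compared quantitatively.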
📝 Abstract
Generative AI (GenAI) is expected to play a pivotal role in enabling autonomous optimization in future wireless networks. Within the O-RAN architecture, Large Language Models (LLMs) can be specialized to generate xApps and rApps by leveraging specifications and API definitions from the RAN Intelligent Controller (RIC) platform. However, fine-tuning base LLMs for telecom-specific tasks remains expensive and resource-intensive. Retrieval-Augmented Generation (RAG) offers a practical alternative through in-context learning, enabling domain adaptation without full retraining. While traditional RAG systems rely on vector-based retrieval, emerging variants such as GraphRAG and Hybrid GraphRAG incorporate knowledge graphs or dual retrieval strategies to support multi-hop reasoning and improve factual grounding. Despite their promise, these methods lack systematic, metric-driven evaluations, particularly in high-stakes domains such as O-RAN. In this study, we conduct a comparative evaluation of Vector RAG, GraphRAG, and Hybrid GraphRAG using O-RAN specifications. We assess performance across varying question complexities using established generation metrics: faithfulness, answer relevance, context relevance, and factual correctness. Results show that both GraphRAG and Hybrid GraphRAG outperform traditional RAG. Hybrid GraphRAG improves factual correctness by 8%, while GraphRAG improves context relevance by 7%.
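The multi-hop retrieval that distinguishes GraphRAG from flat vector search can be sketched as a bounded traversal over a knowledge graph of specification entities. The mini-graph below is a hypothetical illustration (the entity names reflect standard O-RAN components, but the edges are not taken from the paper's dataset), and the traversal is a generic BFS, not the authors' implementation.

```python
from collections import deque

# Hypothetical mini knowledge graph of O-RAN components; edges are
# illustrative, not extracted from the paper's specification dataset.
KG = {
    "Non-RT RIC": [("hosts", "rApp"), ("connects via A1 to", "Near-RT RIC")],
    "Near-RT RIC": [("hosts", "xApp"), ("connects via E2 to", "E2 Node")],
    "E2 Node": [("includes", "O-DU"), ("includes", "O-CU")],
    "rApp": [], "xApp": [], "O-DU": [], "O-CU": [],
}

def multi_hop_retrieve(start: str, max_hops: int = 2) -> list[str]:
    """BFS that collects (subject, relation, object) facts reachable
    within max_hops: the chained context a graph-based retriever can
    surface, which a single flat vector lookup typically misses."""
    facts, seen = [], {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # hop budget exhausted; do not expand further
        for relation, neighbor in KG.get(node, []):
            facts.append(f"{node} {relation} {neighbor}")
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return facts

# Two hops from the Non-RT RIC already surface the E2 relationship,
# connecting rApp-level policy to the nodes it ultimately steers.
for fact in multi_hop_retrieve("Non-RT RIC"):
    print(fact)
```

A hybrid retriever would merge such graph-derived facts with vector-similarity passages before generation, which is consistent with the dual-retrieval framing of Hybrid GraphRAG above.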