UnWeaving the knots of GraphRAG -- turns out VectorRAG is almost enough

📅 2026-02-06
🤖 AI Summary
Traditional RAG systems treat text chunks as atomic units, which limits their capacity for multi-hop question answering. GraphRAG incorporates knowledge graphs to model relational structure, but building graph-based indices is orders of magnitude more complex, and retrieval relies on heuristic strategies. This work proposes UnWeaver, a framework that avoids explicit graph construction: a large language model decomposes documents into entities, each of which may span multiple text chunks. An entity-to-chunk mapping then lets retrieval recover the original chunks through the entities they mention. Because entities act only as intermediaries, the generator still sees the source text itself, which preserves fidelity while supporting multi-hop reasoning. The method reduces system complexity and noise, and is reported to achieve performance comparable to GraphRAG with a substantially simpler architecture.
📝 Abstract
One of the key problems in retrieval-augmented generation (RAG) systems is that chunk-based retrieval pipelines represent source chunks as atomic objects, mixing all the information contained in a chunk into a single vector. These vector representations are then treated as isolated, independent, and self-sufficient, with no attempt to represent possible relations between them. Such an approach has no dedicated mechanism for handling multi-hop questions. Graph-based RAG systems aim to ameliorate this problem by modeling information as knowledge graphs, in which entities are represented by nodes, connected by robust relations, and grouped into hierarchical communities. This approach, however, has issues of its own, among them an orders-of-magnitude increase in the componential complexity of creating graph-based indices and a reliance on heuristics for retrieval. We propose UnWeaver, a novel RAG framework that simplifies the idea of GraphRAG. Using an LLM, UnWeaver disentangles the contents of documents into entities, which can occur across multiple chunks. During retrieval, entities serve as an intermediate step for recovering the original text chunks, thereby preserving fidelity to the source material. We argue that entity-based decomposition yields a more distilled representation of the original information and additionally reduces noise in the indexing and generation processes.
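The entity-mediated retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name here (`KNOWN_ENTITIES`, `extract_entities`, `build_entity_index`, `retrieve`) is hypothetical, the LLM extraction step is replaced by a fixed vocabulary lookup, and query-to-entity matching uses plain substring search where a real system would presumably use vector similarity. It only shows the data structure: entities point to the chunks that mention them, and retrieval returns original chunks rather than entity text.

```python
from collections import defaultdict

# Assumption: a fixed vocabulary stands in for the LLM-based entity extraction.
KNOWN_ENTITIES = ["Marie Curie", "Pierre Curie", "Sorbonne", "radium"]


def extract_entities(text: str) -> list[str]:
    """Return the known entities mentioned in the text (case-insensitive)."""
    lowered = text.lower()
    return [e for e in KNOWN_ENTITIES if e.lower() in lowered]


def build_entity_index(chunks: list[str]) -> dict[str, set[int]]:
    """Map each entity to the ids of all chunks that mention it."""
    index: dict[str, set[int]] = defaultdict(set)
    for chunk_id, chunk in enumerate(chunks):
        for entity in extract_entities(chunk):
            index[entity].add(chunk_id)
    return dict(index)


def retrieve(query: str, index: dict[str, set[int]]) -> list[int]:
    """Go from query to entities to chunk ids, ranked by matched-entity count."""
    hits: dict[int, int] = defaultdict(int)
    for entity in extract_entities(query):
        for chunk_id in index.get(entity, set()):
            hits[chunk_id] += 1
    # Most matched entities first; ties broken by chunk id for determinism.
    return sorted(hits, key=lambda cid: (-hits[cid], cid))


chunks = [
    "Marie Curie discovered radium in 1898.",
    "Pierre Curie taught at the Sorbonne.",
    "Marie Curie married Pierre Curie in 1895.",
    "The Eiffel Tower opened in 1889.",
]
index = build_entity_index(chunks)
ranked = retrieve("Did Marie Curie know Pierre Curie?", index)
for chunk_id in ranked:
    print(chunk_id, chunks[chunk_id])
```

In this toy corpus the entity "Marie Curie" spans chunks 0 and 2, so a query that mentions her reaches both without any explicit graph edge between them. The generator would receive the unmodified chunk text, which is the sense in which fidelity to the source is preserved.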
Problem

Research questions and friction points this paper is trying to address.

Retrieval-augmented generation
multi-hop reasoning
chunk-based retrieval
knowledge graph
entity decomposition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Entity-based decomposition
Retrieval-augmented generation
GraphRAG simplification
LLM-driven disentanglement
Multi-hop reasoning
Ryszard Tuora
Samsung AI Center Warsaw
Mateusz Galiński
Samsung AI Center Warsaw
Michał Godziszewski
Samsung AI Center Warsaw
Michał Karpowicz
Samsung AI Center Warsaw
Mateusz Czyżnikiewicz
artificial intelligence, machine learning, speech processing, natural language processing
Adam Kozakiewicz
Samsung AI Center Warsaw
Tomasz Ziętkiewicz
PhD student, Adam Mickiewicz University, Poznan, Poland
Natural language processing, speech processing, text normalization, machine learning