🤖 AI Summary
This study addresses a performance bottleneck in retrieval-augmented generation (RAG) for Ukrainian multi-domain document understanding: inaccurate document and page localization. For the UNLP 2026 Shared Task, the authors propose an agent-based RAG system that combines a two-stage retrieval pipeline (initial retrieval with BGE-M3, followed by BGE-based reranking) with a lightweight agent layer that orchestrates query rewriting and answer retries on top of Qwen2.5-3B-Instruct. The authors present the work as an initial investigation of agentic RAG for Ukrainian. Their analysis shows that the agent-driven retry strategy improves answer accuracy, but overall performance remains constrained by retrieval precision, which underscores the importance of high-quality retrieval. The findings point toward combining stronger retrieval models with more advanced agent-based reasoning in low-resource languages.
📝 Abstract
We present an initial investigation into Agentic Retrieval-Augmented Generation (RAG) for Ukrainian, conducted within the UNLP 2026 Shared Task on Multi-Domain Document Understanding. Our system combines two-stage retrieval (BGE-M3 with BGE reranking) with a lightweight agentic layer that performs query rephrasing and answer-retry loops on top of Qwen2.5-3B-Instruct. Our analysis reveals that retrieval quality is the primary bottleneck: agentic retry mechanisms improve answer accuracy, but the overall score remains constrained by document and page identification. We discuss practical limitations of offline agentic pipelines and outline directions for combining stronger retrieval with more advanced agentic reasoning for Ukrainian.
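To make the control flow described in the abstract concrete, the following is a minimal sketch of a retrieve-then-rerank step wrapped in a rephrase-and-retry loop. It is not the authors' implementation. Every name here (`two_stage_retrieve`, `agentic_answer`, the `Passage` layout, the `k_first`, `k_final`, and `max_retries` defaults, and the acceptance check) is a hypothetical chosen for illustration. The dense retriever, reranker, and language model are passed in as plain callables, so a BGE-M3 retriever, a BGE reranker, and Qwen2.5-3B-Instruct could plausibly be plugged in without changing the loop.

```python
from typing import Callable, List, Tuple

# Hypothetical passage layout: (document id, page number, page text).
Passage = Tuple[str, int, str]


def two_stage_retrieve(
    query: str,
    dense_search: Callable[[str, int], List[Passage]],
    rerank_score: Callable[[str, Passage], float],
    k_first: int = 20,
    k_final: int = 3,
) -> List[Passage]:
    """Stage 1: fetch k_first candidates. Stage 2: rerank and keep k_final."""
    candidates = dense_search(query, k_first)
    ranked = sorted(candidates, key=lambda p: rerank_score(query, p), reverse=True)
    return ranked[:k_final]


def agentic_answer(
    question: str,
    dense_search: Callable[[str, int], List[Passage]],
    rerank_score: Callable[[str, Passage], float],
    generate: Callable[[str, List[Passage]], str],
    rephrase: Callable[[str, int], str],
    is_acceptable: Callable[[str], bool],
    max_retries: int = 2,
) -> Tuple[str, List[Passage], int]:
    """Answer a question, rephrasing the query and retrying on weak answers.

    Returns the last answer, the passages it was based on, and the number
    of attempts made (assumed policy: at most max_retries + 1 attempts).
    """
    query = question
    answer, passages, attempts = "", [], 0
    for attempt in range(max_retries + 1):
        attempts = attempt + 1
        passages = two_stage_retrieve(query, dense_search, rerank_score)
        # The original question is always what gets answered; only the
        # retrieval query changes between attempts.
        answer = generate(question, passages)
        if is_acceptable(answer):
            break
        if attempt < max_retries:
            query = rephrase(question, attempt)
    return answer, passages, attempts
```

One consequence of this structure matches the bottleneck the abstract reports: the retry loop can only help when a rephrased query brings the right document and page into the candidate set, so the answer quality is bounded by what the two retrieval stages return.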