🤖 AI Summary
This work addresses key challenges in retrieval-augmented generation (RAG) based question-answering (QA) systems, namely low development efficiency, poor reproducibility, and suboptimal performance, by proposing an end-to-end, modular, high-performance RAG development framework. The framework unifies data preprocessing, retrieval, co-optimization of embedding models and large language models (LLMs), automated fine-tuning data synthesis, and automated evaluation, forming a closed-loop pipeline from data preparation to local deployment. Its core innovations are an embedding-generation joint tuning mechanism and a reproducible, standardized evaluation pipeline. On authoritative QA benchmarks, including Natural Questions (NQ), TriviaQA, and HotpotQA, the framework substantially outperforms strong baselines and achieves state-of-the-art (SOTA) performance while also improving development efficiency and system robustness.
📝 Abstract
We introduce AccurateRAG -- a novel framework for constructing high-performance question-answering applications based on retrieval-augmented generation (RAG). Our framework provides an efficient development pipeline with tools for raw dataset processing, fine-tuning data generation, text-embedding and LLM fine-tuning, output evaluation, and building RAG systems locally. Experimental results show that our framework outperforms previous strong baselines and achieves new state-of-the-art question-answering performance on benchmark datasets.
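To make the retrieve-then-generate flow described above concrete, here is a minimal, self-contained sketch of a RAG-style QA pipeline. All names (`Retriever`, `embed`, `build_prompt`) are illustrative placeholders, not AccurateRAG's actual API, and the bag-of-words "embedding" stands in for the fine-tuned neural text-embedding model a real system would use:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts. A real pipeline would
    # call a (possibly jointly fine-tuned) neural embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class Retriever:
    """Indexes a corpus and returns the k passages most similar to a query."""
    def __init__(self, corpus: list[str]):
        self.corpus = corpus
        self.index = [embed(doc) for doc in corpus]

    def top_k(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(range(len(self.corpus)),
                        key=lambda i: cosine(q, self.index[i]),
                        reverse=True)
        return [self.corpus[i] for i in ranked[:k]]

def build_prompt(query: str, retriever: Retriever) -> str:
    # Retrieve supporting context, then assemble the augmented prompt
    # that would be passed to the generator LLM.
    context = "\n".join(retriever.top_k(query, k=1))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "Paris is the capital of France.",
    "The Nile is the longest river in Africa.",
]
prompt = build_prompt("What is the capital of France?", Retriever(corpus))
print(prompt)
```

Running this prints a prompt whose context is the Paris passage, since it shares the most terms with the query; swapping `embed` for a learned model and the final f-string for an LLM call yields the standard RAG loop the framework automates.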