🤖 AI Summary
This study addresses the lack of standardized guidelines for deploying retrieval-augmented generation (RAG) systems in industrial healthcare settings. It presents the first systematic evaluation of RAG components on medical tasks, employing a modular architecture, ablation studies, and multi-task assessment to investigate the trade-offs between performance and efficiency. The work proposes a practical, evidence-based framework of best practices and identifies an optimal component configuration that significantly improves both answer accuracy and inference efficiency across three representative medical tasks. These findings provide empirical support and actionable technical guidance for the real-world deployment of medical RAG systems.
📝 Abstract
While retrieval-augmented generation (RAG) has been swiftly adopted in industrial applications built on large language models (LLMs), there is no consensus on best practices for building a RAG system: which components it should include, how to organize them, and how to implement each one for industrial applications, especially in the medical domain. In this work, we first carefully analyze each component of the RAG system and propose practical alternatives for each. We then conduct systematic evaluations on three types of tasks, revealing best practices for improving the RAG system and showing how LLM-based RAG systems trade off performance against efficiency.