🤖 AI Summary
This work addresses the performance degradation in conversational search caused by the distribution shift between standalone and dialogue-based queries. The authors propose an approach that builds query rewriting capability into a dense retrieval model implicitly, requiring neither explicit query reformulation at inference time nor costly relevance annotations. Using knowledge distillation, the method aligns the embeddings of original conversational queries with those of rewritten queries generated by a large language model, thereby enabling context-aware retrieval. Notably, this is the first approach to integrate rewriting ability directly into the embedding space, so a single model handles both standalone and conversational queries without reindexing. Experimental results demonstrate significant improvements over strong baselines on QReCC, TopiOCQA, and TREC CAsT, with up to a 20% gain in Recall@10 under distribution shift.
📝 Abstract
Conversational search has become increasingly important in retrieval-augmented generation (RAG) systems, where users interact with AI assistants through multi-turn conversations containing context-dependent queries. We propose RCEM, a conversational dense retrieval model that distills the query reformulation capability of LLMs into the embedding model, enabling context-aware retrieval without explicit query rewriting during inference. Unlike prior conversational dense retrieval approaches that learn direct conversation-to-document matching, RCEM aligns conversational-query embeddings with rewritten-query embeddings, improving robustness under distributional shift. RCEM does not require conversational query-to-document relevance mappings for training, which are often expensive and difficult to obtain with high quality. Extensive experiments on QReCC, TopiOCQA, and TREC CAsT demonstrate that RCEM consistently outperforms strong conversational retrieval baselines, achieving particularly large gains under distributional shift, including up to 20% improvement in Recall@10. RCEM further extends the base embedding model with conversational query rewriting capability while preserving its original retrieval functionality, allowing both standalone and conversational queries to be encoded by a single model and searched against existing document indexes without rebuilding the retrieval database.
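The embedding-alignment idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not state the loss function or the architecture, so the squared-error objective, the linear student encoder, the synthetic feature vectors, and every name here (`mse_alignment_loss`, `train_student`, `top_k`) are assumptions chosen for illustration. The sketch shows a student being trained to reproduce frozen teacher embeddings of rewritten queries, and then searching a document index that is never rebuilt.

```python
import numpy as np

def mse_alignment_loss(student_emb, teacher_emb):
    """Mean squared distance between paired student and teacher embeddings."""
    return float(np.mean(np.sum((student_emb - teacher_emb) ** 2, axis=-1)))

def train_student(conv_feats, teacher_emb, W, steps=1000, lr=0.05):
    """Fit a linear student by gradient descent; the teacher targets stay frozen."""
    n = conv_feats.shape[0]
    W = W.copy()
    for _ in range(steps):
        residual = conv_feats @ W - teacher_emb          # student minus teacher
        W -= lr * (2.0 / n) * (conv_feats.T @ residual)  # gradient of the MSE loss
    return W

def top_k(query_emb, doc_index, k=3):
    """Return indices of the k highest-scoring documents per query (dot product)."""
    scores = query_emb @ doc_index.T
    return np.argsort(-scores, axis=1)[:, :k]

rng = np.random.default_rng(0)

# Hypothetical stand-ins: features of raw conversational queries, and the
# frozen base model's embeddings of the LLM-rewritten versions of those queries.
conv_feats = rng.normal(size=(64, 16))
teacher_emb = conv_feats @ rng.normal(size=(16, 8))

# Existing document index, built once with the base model and left untouched.
doc_index = rng.normal(size=(100, 8))

W_init = rng.normal(scale=0.1, size=(16, 8))
loss_before = mse_alignment_loss(conv_feats @ W_init, teacher_emb)

W = train_student(conv_feats, teacher_emb, W_init)
student_emb = conv_feats @ W
loss_after = mse_alignment_loss(student_emb, teacher_emb)

# Conversational queries are searched directly, with no rewriting step.
hits = top_k(student_emb, doc_index, k=3)
print("alignment loss before:", loss_before)
print("alignment loss after:", loss_after)
print("top-3 documents for the first query:", hits[0])
```

Note that training uses only pairs of conversational queries and their rewrites; no query-to-document relevance labels appear anywhere in the loop, which mirrors the property the abstract highlights. In a real system the linear map would be a neural encoder and the teacher embeddings would come from encoding LLM rewrites with the base embedding model.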