🤖 AI Summary
This work addresses the inefficiency that redundant retrieved context causes in multilingual retrieval-augmented generation (RAG) systems, a problem made worse by the limited cross-lingual generalization of existing pruning methods. To this end, we propose XProvence, the first zero-cost multilingual context pruning approach: pruning is built directly into the reranker, so it adds no extra computational overhead. Building on the Provence framework, XProvence uses multilingual pretraining and cross-lingual transfer to support over 100 languages. Experiments on four multilingual question answering benchmarks show that XProvence compresses context substantially with negligible performance degradation and outperforms strong baselines.
📝 Abstract
This paper introduces XProvence, a multilingual zero-cost context pruning model for retrieval-augmented generation (RAG), trained on 16 languages and supporting 100+ languages through effective cross-lingual transfer. Motivated by the growing use of RAG systems across diverse languages, we explore several strategies to generalize the Provence framework (which first integrated efficient zero-cost context pruning directly into the re-ranking model) beyond English. Across four multilingual question answering benchmarks, we show that XProvence prunes RAG contexts with minimal-to-no performance degradation and outperforms strong baselines. Our model is available at https://huggingface.co/naver/xprovence-reranker-bgem3-v2.
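To make "zero-cost" pruning concrete, the toy sketch below computes both outputs in one pass over a passage: a passage-level relevance score (what a reranker produces) and a pruned passage containing only the sentences judged relevant to the question. It is not the XProvence model or its API. The function names, the word-overlap scorer, and the 0.2 threshold are all illustrative assumptions standing in for a learned multilingual cross-encoder.

```python
import re


def tokenize(text):
    # Lowercased set of word tokens; a stand-in for a real tokenizer.
    return set(re.findall(r"\w+", text.lower()))


def split_sentences(passage):
    # Naive split on ., ! or ? followed by whitespace (assumption: simple prose).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def prune_and_score(question, passage, threshold=0.2):
    """Return (pruned_passage, rerank_score) from a single scoring pass.

    Hypothetical scorer: the fraction of question tokens found in each sentence.
    A real pruning reranker would predict these relevance signals with a
    trained model rather than word overlap.
    """
    q_tokens = tokenize(question)
    sentences = split_sentences(passage)
    scores = [
        len(q_tokens & tokenize(s)) / len(q_tokens) if q_tokens else 0.0
        for s in sentences
    ]
    # Pruning: keep only sentences whose score reaches the threshold.
    kept = [s for s, sc in zip(sentences, scores) if sc >= threshold]
    # Reranking: reuse the same scores for a passage-level score.
    rerank_score = max(scores, default=0.0)
    return " ".join(kept), rerank_score


question = "Where is the Eiffel Tower located?"
passage = (
    "The Eiffel Tower is located in Paris. "
    "Bananas are rich in potassium. "
    "It was completed in 1889."
)
pruned, score = prune_and_score(question, passage)
print(pruned)
print(round(score, 2))
```

Because the pruning decision and the ranking score come from the same pass, a pipeline that already reranks retrieved passages gets the shortened context without a second model call, which is the sense of "zero-cost" described in the abstract.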