🤖 AI Summary
This study addresses the degradation of multilingual representation capability in dual-encoder retrieval systems when fine-tuning only the query encoder on English data. We propose “adiabatic fine-tuning”—a novel paradigm that applies supervised fine-tuning at an extremely low learning rate to preserve and enhance the model’s inherent cross-lingual representational capacity. Our method leverages high-quality multilingual embedding models, couples low-learning-rate optimization with cross-lingual quality evaluation, and requires no translation or additional multilingual supervision. Experiments demonstrate that, after English-only fine-tuning, our approach maintains original non-English retrieval performance while achieving an average +1.2% MRR gain on multilingual benchmarks (e.g., MIRACL, XQuAD) and heterogeneous-domain data—validating effective implicit knowledge transfer. The core contribution is the first adaptation of the adiabatic principle to multilingual representation fine-tuning, offering a scalable, translation-free, and lightweight optimization strategy for cross-lingual retrieval under resource constraints.
📝 Abstract
The query encoder of a dual-encoder passage retrieval system can be tuned for specific types of queries or domains while the precomputed and stored document representations are kept intact. Switching from one query encoder to another when needed is easy, unlike overhauling the embeddings of a whole knowledge base. In this work we ask: can the generic, original qualities of the encoder be preserved, or at least not degraded too much, when it is tuned on a narrow domain? We conducted experiments on a high-quality multilingual embedding model. Tuning it on a single English-only dataset, we observe that the tuning not only preserves the multilingual qualities but even improves them. The embedding quality on distinctly different data is also improved or at least preserved. Drawing on our observations, we suggest a more general hypothesis: tuning with an intentionally low learning rate can preserve or improve properties that a system acquired in training but that are not specifically targeted by the tuning. We call this adiabatic tuning and provide tentative explanations.
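The setup described in the abstract (update only the query encoder, keep the stored document embeddings fixed, use a deliberately small learning rate) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the encoder is a single linear map `W`, the loss is softmax cross-entropy over query-document dot products, the data are random vectors, and the names `tune`, `drift`, and `mean_loss` are hypothetical. The distance of `W` from its starting point is used only as a crude stand-in for "how much of the original model is disturbed"; it does not measure multilingual quality.

```python
import math
import random

def encode(W, q):
    """Toy linear query encoder (an assumption): returns W @ q."""
    return [sum(w * x for w, x in zip(row, q)) for row in W]

def loss_and_grad(W, q, docs, target):
    """Cross-entropy over query-document dot products; gradient w.r.t. W only."""
    e = encode(W, q)
    scores = [sum(a * b for a, b in zip(e, d)) for d in docs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [x / z for x in exps]
    loss = math.log(z) - (scores[target] - m)  # numerically stable -log softmax
    # dL/de = sum_j (p_j - 1[j == target]) * d_j
    g_e = [sum((probs[j] - float(j == target)) * docs[j][i]
               for j in range(len(docs))) for i in range(len(e))]
    # dL/dW[i][k] = g_e[i] * q[k]; the document vectors receive no update
    grad = [[g * x for x in q] for g in g_e]
    return loss, grad

def tune(W0, pairs, docs, lr, epochs):
    """Plain SGD on the query encoder; docs are read but never modified."""
    W = [row[:] for row in W0]
    for _ in range(epochs):
        for q, target in pairs:
            _, grad = loss_and_grad(W, q, docs, target)
            for i, row in enumerate(grad):
                for k, g in enumerate(row):
                    W[i][k] -= lr * g
    return W

def drift(W, W0):
    """Frobenius distance between tuned and original encoder weights."""
    return math.sqrt(sum((a - b) ** 2
                         for r, r0 in zip(W, W0) for a, b in zip(r, r0)))

def mean_loss(W, pairs, docs):
    return sum(loss_and_grad(W, q, docs, t)[0] for q, t in pairs) / len(pairs)

random.seed(0)
DIM = 4
# Stand-ins for precomputed, stored document embeddings.
docs = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(4)]
docs_before = [d[:] for d in docs]
# Each toy query is a noisy copy of the document it should retrieve.
pairs = [([x + random.gauss(0, 0.1) for x in d], j) for j, d in enumerate(docs)]
W0 = [[1.0 if i == k else 0.0 for k in range(DIM)] for i in range(DIM)]

W_low = tune(W0, pairs, docs, lr=1e-3, epochs=5)   # low learning rate
W_high = tune(W0, pairs, docs, lr=1e-1, epochs=5)  # larger rate for contrast

print("loss before:", mean_loss(W0, pairs, docs))
print("loss after (low lr):", mean_loss(W_low, pairs, docs))
print("drift low lr:", drift(W_low, W0), "drift high lr:", drift(W_high, W0))
```

In this toy setting the low-rate run still reduces the tuning loss while moving the encoder far less from its starting weights than the larger rate does. Whether that carries over to properties such as multilingual quality is the empirical question the abstract addresses; the sketch does not test it.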