Efficient Constant-Space Multi-Vector Retrieval

📅 2025-04-02
🤖 AI Summary
Multi-vector retrieval models (e.g., ColBERT) offer a strong trade-off between effectiveness and latency, but they store one vector per token, which incurs high storage overhead and yields variable-size document representations that are awkward for OS paging. This work proposes the **Fixed-Slot Multi-Vector Encoding Paradigm**, which decouples document representation from input tokenization: a learnable encoder maps documents of arbitrary length into a fixed-size set of slot vectors, with joint optimization of the encoding and ranking modules. Evaluated on the MSMARCO and BEIR benchmarks, the method retains over 98% of the original retrieval effectiveness while significantly reducing the storage footprint. The fixed-size representations also improve disk I/O throughput and cache efficiency, giving a favorable trade-off among retrieval accuracy, storage economy, and system compatibility.

📝 Abstract
Multi-vector retrieval methods, exemplified by the ColBERT architecture, have shown substantial promise for retrieval by providing strong trade-offs in terms of retrieval latency and effectiveness. However, they come at a high cost in terms of storage since a (potentially compressed) vector needs to be stored for every token in the input collection. To overcome this issue, we propose encoding documents to a fixed number of vectors, which are no longer necessarily tied to the input tokens. Beyond reducing the storage costs, our approach has the advantage that document representations become of a fixed size on disk, allowing for better OS paging management. Through experiments using the MSMARCO passage corpus and BEIR with the ColBERT-v2 architecture, a representative multi-vector ranking model architecture, we find that passages can be effectively encoded into a fixed number of vectors while retaining most of the original effectiveness.
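
The idea of scoring a query against a fixed number of document vectors can be illustrated with a minimal sketch. It uses the ColBERT-style late-interaction score (for each query vector, take the maximum dot product over the document vectors, then sum). The `pool_to_fixed` function and its `weights` parameter are hypothetical: the abstract does not say how token embeddings are reduced to a fixed number of vectors, so a weighted sum over token positions is assumed here purely for illustration.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: Sequence[Vector], doc_vecs: Sequence[Vector]) -> float:
    """Late-interaction score: sum over query vectors of the best-matching
    document vector's dot product."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def pool_to_fixed(token_vecs: Sequence[Vector],
                  weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """ASSUMPTION (not from the paper): build len(weights) output vectors,
    each a weighted sum of the token vectors. weights[c][t] is the weight of
    token position t for output vector c. Tokens beyond the weight row length
    are ignored; token_vecs must be non-empty."""
    dim = len(token_vecs[0])
    pooled = []
    for row in weights:
        n = min(len(row), len(token_vecs))
        pooled.append([
            sum(row[t] * token_vecs[t][k] for t in range(n))
            for k in range(dim)
        ])
    return pooled


# Three token vectors are reduced to two document vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]  # hypothetical, stands in for learned parameters
doc_vecs = pool_to_fixed(tokens, weights)
query_vecs = [[1.0, 0.0], [0.0, 2.0]]
print(len(doc_vecs), maxsim_score(query_vecs, doc_vecs))
```

Whatever the document length, the output always has `len(weights)` vectors, so the scoring cost and the stored size per document no longer depend on the token count.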

Problem

Research questions and friction points this paper is trying to address.

Reducing storage costs in multi-vector retrieval methods
Encoding documents to fixed-size vector representations
Maintaining retrieval effectiveness with constant-space vectors

Innovation

Methods, ideas, or system contributions that make the work stand out.

Encodes each document into a fixed number of vectors, independent of its token count
Reduces storage costs relative to storing one vector per token
Retains most of the original retrieval effectiveness
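
The storage benefit of fixed-size representations can be sketched as follows. When every document occupies the same number of bytes, the record for document `i` starts at `i * RECORD_BYTES`, so no offset table is needed and reads fall on predictable boundaries. The constants `NUM_VECS` and `DIM`, the float32 layout, and the function names are assumptions chosen for illustration, not the paper's index format.

```python
import os
import struct
import tempfile

NUM_VECS = 4                      # hypothetical number of vectors per document
DIM = 3                           # hypothetical vector dimension
RECORD_FLOATS = NUM_VECS * DIM
RECORD_BYTES = RECORD_FLOATS * 4  # 4 bytes per little-endian float32


def write_index(path, docs):
    """Write each document as one fixed-size record of NUM_VECS * DIM float32 values."""
    with open(path, "wb") as f:
        for doc in docs:
            flat = [x for vec in doc for x in vec]
            if len(flat) != RECORD_FLOATS:
                raise ValueError("every document must have NUM_VECS vectors of size DIM")
            f.write(struct.pack(f"<{RECORD_FLOATS}f", *flat))


def read_doc(path, doc_id):
    """Read one document with a single seek; the offset is plain arithmetic."""
    with open(path, "rb") as f:
        f.seek(doc_id * RECORD_BYTES)
        flat = struct.unpack(f"<{RECORD_FLOATS}f", f.read(RECORD_BYTES))
    return [list(flat[v * DIM:(v + 1) * DIM]) for v in range(NUM_VECS)]


# Small integer-valued floats round-trip exactly through float32.
docs = [[[float(d * 100 + v * 10 + k) for k in range(DIM)]
         for v in range(NUM_VECS)] for d in range(3)]
with tempfile.TemporaryDirectory() as tmp:
    index_path = os.path.join(tmp, "index.bin")
    write_index(index_path, docs)
    print(os.path.getsize(index_path), read_doc(index_path, 2)[0])
```

With per-token vectors, each record's length depends on the document, so an index would additionally need per-document offsets or lengths to locate a record.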