Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R

📅 2026-05-31
🤖 AI Summary
This work addresses incomplete candidate recall and unstable ranking in composed video retrieval with a dual-route architecture that separates candidate recall from final selection. Parallel text and visual routes are merged into a top-10 candidate set, after which a vision-language model (VLM) performs conservative 1v1 comparisons between the current top-1 and each challenger. Rather than relying on direct multi-candidate VLM classification or broad text reranking, the approach combines a VLM slot selector that refines the reasoning/text seed, DFN-H/DFN-L contact-sheet embeddings for the visual route, and a merge of the two routes. On the hidden test split, the final system reaches R@1 of 95.28, R@5 of 97.47, R@10 of 98.48, and R@50 of 99.66.
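The R@K figures quoted here are recall-at-K percentages: the share of queries whose target video appears among the first K ranked candidates. A minimal sketch of that computation follows, assuming a single ground-truth target per query; the function name, the dictionary layout, and the toy IDs are illustrative assumptions, not taken from the challenge toolkit.

```python
from typing import Dict, Sequence


def recall_at_k(
    rankings: Dict[str, Sequence[str]],
    targets: Dict[str, str],
    k: int,
) -> float:
    """Percentage of queries whose target is among the first k ranked candidates."""
    if not rankings:
        raise ValueError("rankings must not be empty")
    hits = sum(
        1 for qid, ranked in rankings.items() if targets[qid] in ranked[:k]
    )
    return 100.0 * hits / len(rankings)


# Toy data with hypothetical IDs (not from the challenge).
rankings = {
    "q1": ["v3", "v1", "v7"],  # target at rank 1
    "q2": ["v2", "v9", "v4"],  # target at rank 2
    "q3": ["v8", "v5", "v6"],  # target at rank 3
    "q4": ["v1", "v2", "v3"],  # target missing
}
targets = {"q1": "v3", "q2": "v9", "q3": "v6", "q4": "v9"}

r1 = recall_at_k(rankings, targets, 1)
r3 = recall_at_k(rankings, targets, 3)
print(f"R@1={r1:.2f}  R@3={r3:.2f}")
```

Because R@K can only count a target that is present in the candidate list, any reranker that works on a top-10 set is bounded by how complete that set is, which is why recall and selection are treated as separate concerns.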
📝 Abstract
We describe Dual-Route Top-K Retrieval with 1v1 VLM Reranking for the CoVR-R challenge. The method treats composed video retrieval as two coupled problems: finding a sufficiently complete top-k candidate set, and then safely deciding whether any candidate should replace a strong current top-1. We first improve the reasoning/text seed with a VLM slot selector over existing candidates, without introducing DFN visual retrieval. We then add a visual route from contact-sheet embeddings using DFN-H/DFN-L. The routes are merged into a top-10 candidate set, after which a VLM final reranker performs conservative 1v1 comparisons between the current top-1 and each challenger. On the hidden test split, the final system reaches 95.28 R@1, 97.47 R@5, 98.48 R@10, and 99.66 R@50. The main lesson is that CoVR-R benefits more from recall-selection decoupling than from broad text reranking or direct multi-candidate VLM classification.
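The merge-then-compare flow described in the abstract can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the fusion rule, the comparison prompt, or the replacement criterion, so the rank interleaving in `merge_routes`, the `judge` callback (a stand-in for a VLM call), and the `threshold` value are all hypothetical.

```python
from typing import Callable, List, Sequence


def merge_routes(
    text_route: Sequence[str], visual_route: Sequence[str], k: int = 10
) -> List[str]:
    """Interleave two ranked lists by rank, drop duplicates, keep the first k.

    Assumption: the text route is listed first at each rank, so its top-1
    stays the incumbent. The source does not state the actual fusion rule.
    """
    merged: List[str] = []
    seen = set()
    for i in range(max(len(text_route), len(visual_route))):
        for route in (text_route, visual_route):
            if i < len(route) and route[i] not in seen:
                seen.add(route[i])
                merged.append(route[i])
    return merged[:k]


def conservative_rerank(
    candidates: Sequence[str],
    judge: Callable[[str, str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Promote a challenger only if it beats the incumbent with high confidence.

    judge(incumbent, challenger) is a hypothetical stand-in for a 1v1 VLM
    comparison returning the confidence that the challenger is the better match.
    """
    if not candidates:
        return []
    incumbent = candidates[0]
    best, best_p = incumbent, threshold
    for challenger in candidates[1:]:
        p = judge(incumbent, challenger)
        if p > best_p:  # must clear the threshold and beat earlier winners
            best, best_p = challenger, p
    if best == incumbent:
        return list(candidates)  # no confident winner: keep the order as is
    return [best] + [c for c in candidates if c != best]


# Toy example with hypothetical IDs and a stubbed judge.
text_route = ["a", "b", "c"]
visual_route = ["b", "d", "a"]
merged = merge_routes(text_route, visual_route)

stub_scores = {"b": 0.6, "d": 0.9, "c": 0.3}
judge = lambda incumbent, challenger: stub_scores[challenger]

reranked = conservative_rerank(merged, judge, threshold=0.8)
print(merged, reranked)
```

The threshold is what makes the reranker conservative in this sketch: a challenger that wins only narrowly leaves the current top-1 in place, so a strong incumbent is not displaced by a noisy comparison.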
Problem

Research questions and friction points this paper is trying to address.

composed video retrieval
top-k retrieval
video-language model
candidate reranking
CoVR-R
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Route Retrieval
1v1 VLM Reranking
CoVR-R
Recall-Selection Decoupling
Contact-Sheet Embeddings