VQ-Jarvis: Retrieval-Augmented Video Restoration Agent with Sharp Vision and Fast Thought

📅 2026-03-24
🤖 AI Summary
This work addresses two weaknesses of existing video restoration agents when faced with the heterogeneous degradations of real-world video: insufficient quality perception and inefficient search over restoration operators. The authors propose VQ-Jarvis, a retrieval-augmented, all-in-one video restoration agent that chooses a restoration trajectory using a degradation perception model and a multiple operator judge model that compares paired restoration results. Key contributions include VSR-Compare, described as the first large-scale video paired enhancement dataset (20K comparison pairs covering 7 degradation types and 11 enhancement operators); the two models trained on it; and a hierarchical operator scheduling strategy that retrieves a trajectory in one step from a retrieval-augmented generation (RAG) library for easy cases and falls back to step-by-step greedy search for harder ones. The authors report that VQ-Jarvis consistently outperforms existing methods on complex degraded videos.

📝 Abstract
Video restoration in real-world scenarios is challenged by heterogeneous degradations, where static architectures and fixed inference pipelines often fail to generalize. Recent agent-based approaches offer dynamic decision making, yet existing video restoration agents remain limited by insufficient quality perception and inefficient search strategies. We propose VQ-Jarvis, a retrieval-augmented, all-in-one intelligent video restoration agent with sharper vision and faster thought. VQ-Jarvis is designed to accurately perceive degradations and subtle differences among paired restoration results, while efficiently discovering optimal restoration trajectories. To enable sharp vision, we construct VSR-Compare, the first large-scale video paired enhancement dataset with 20K comparison pairs covering 7 degradation types, 11 enhancement operators, and diverse content domains. Based on this dataset, we train a multiple operator judge model and a degradation perception model to guide agent decisions. To achieve fast thought, we introduce a hierarchical operator scheduling strategy that adapts to video difficulty: for easy cases, optimal restoration trajectories are retrieved in a one-step manner from a retrieval-augmented generation (RAG) library; for harder cases, a step-by-step greedy search is performed to balance efficiency and accuracy. Extensive experiments demonstrate that VQ-Jarvis consistently outperforms existing methods on complex degraded videos.
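
The hierarchical scheduling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `schedule`, the `easy_threshold` cutoff, the dictionary-based library keyed by degradation set, the toy `(noise, blur)` video representation, and the pairwise `toy_judge` are all hypothetical stand-ins chosen for illustration. In the paper, the judge and the degradation perception are learned models, and the abstract does not specify how difficulty is measured or how the RAG library is queried.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Toy stand-in for a video: (noise_level, blur_level). Purely illustrative.
Video = Tuple[float, float]
Operator = Callable[[Video], Video]
# Hypothetical pairwise judge: True if the first result is better than the second.
Judge = Callable[[Video, Video], bool]


def schedule(
    video: Video,
    degradations: Set[str],
    difficulty: float,
    library: Dict[FrozenSet[str], List[str]],
    operators: Dict[str, Operator],
    judge: Judge,
    easy_threshold: float = 0.5,  # assumed cutoff, not from the paper
    max_steps: int = 2,           # assumed search budget, not from the paper
) -> List[str]:
    """Return a list of operator names (a restoration trajectory)."""
    key = frozenset(degradations)
    # Easy case: look up a stored trajectory in one step.
    if difficulty < easy_threshold and key in library:
        return list(library[key])

    # Harder case: greedy search, one operator per step.
    trajectory: List[str] = []
    current = video
    for _ in range(max_steps):
        best_name, best_out = None, current
        for name, op in operators.items():
            candidate = op(current)
            # Keep the candidate only if the judge prefers it to the best so far.
            if judge(candidate, best_out):
                best_name, best_out = name, candidate
        if best_name is None:  # no operator improves the video; stop early
            break
        trajectory.append(best_name)
        current = best_out
    return trajectory


# Toy operators and judge, assumed only for this example.
TOY_OPERATORS: Dict[str, Operator] = {
    "denoise": lambda v: (v[0] * 0.2, v[1]),
    "deblur": lambda v: (v[0], v[1] * 0.2),
}


def toy_judge(a: Video, b: Video) -> bool:
    # Lower total degradation counts as better in this toy setting.
    return sum(a) < sum(b)


library = {frozenset({"noise"}): ["denoise"]}
easy = schedule((1.0, 0.0), {"noise"}, 0.2, library, TOY_OPERATORS, toy_judge)
hard = schedule((1.0, 0.8), {"noise", "blur"}, 0.9, library, TOY_OPERATORS, toy_judge)
print(easy, hard)
```

In this toy setting the easy video takes the stored trajectory without running any operator, while the hard video costs one operator application per candidate per step. That difference is the efficiency trade-off the abstract attributes to the hierarchical strategy.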

Problem

Research questions and friction points this paper is trying to address.

video restoration
heterogeneous degradations
quality perception
search strategy
real-world scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

retrieval-augmented generation
video restoration agent
degradation perception
hierarchical operator scheduling
VSR-Compare dataset