AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Video Frame Interpolation (VFI) aims to synthesize high-fidelity intermediate frames between existing ones. Its core challenges include large motion, occlusion, illumination variation, and non-linear motion. This work surveys over 250 publications and organizes them under a methodological taxonomy that distinguishes Center-Time Frame Interpolation (CTFI) from Arbitrary-Time Frame Interpolation (ATFI). It covers the major technical paradigms, including optical flow estimation, kernel prediction, generative adversarial networks (GANs), Transformers, Mamba architectures, and diffusion models. It also reviews datasets, loss functions, evaluation metrics, and cross-domain applications. The authors present the survey as the most comprehensive reference on VFI to date, intended to give newcomers and experts a unified view of the field and its likely future directions.

📝 Abstract
Video Frame Interpolation (VFI) is a fundamental Low-Level Vision (LLV) task that synthesizes intermediate frames between existing ones while maintaining spatial and temporal coherence. VFI techniques have evolved from classical motion compensation-based approaches to deep learning-based approaches, including kernel-, flow-, hybrid-, phase-, GAN-, Transformer-, Mamba-, and, more recently, diffusion model-based approaches. We introduce AceVFI, the most comprehensive survey on VFI to date, covering over 250 papers across these approaches. We systematically organize and describe VFI methodologies, detailing the core principles, design assumptions, and technical characteristics of each approach. We categorize the learning paradigms of VFI methods into two groups, namely Center-Time Frame Interpolation (CTFI) and Arbitrary-Time Frame Interpolation (ATFI). We analyze key challenges of VFI such as large motion, occlusion, lighting variation, and non-linear motion. In addition, we review standard datasets, loss functions, and evaluation metrics. We examine applications of VFI, including event-based, cartoon, and medical image VFI, as well as joint VFI with other LLV tasks. We conclude by outlining promising future research directions to support continued progress in the field. This survey aims to serve as a unified reference for both newcomers and experts seeking a deep understanding of the modern VFI landscape.
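The CTFI/ATFI distinction in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not taken from the survey: `blend` is a hypothetical stand-in that linearly mixes two frames, where a real VFI method would estimate motion or kernels. CTFI is modeled as always producing the frame at t = 0.5, and ATFI as accepting any t in (0, 1). PSNR is included as one commonly used evaluation metric; the abstract does not name specific metrics, so that choice is an assumption.

```python
import math

# Assumption: a frame is a grayscale image stored as rows of floats in [0, 1].
Frame = list[list[float]]


def blend(frame0: Frame, frame1: Frame, t: float) -> Frame:
    """Naive linear blend at time t; a placeholder for a real VFI model."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between the two input frames")
    return [
        [(1.0 - t) * a + t * b for a, b in zip(row0, row1)]
        for row0, row1 in zip(frame0, frame1)
    ]


def interpolate_center(frame0: Frame, frame1: Frame) -> Frame:
    """CTFI-style interface: the target time is fixed at the midpoint."""
    return blend(frame0, frame1, 0.5)


def interpolate_arbitrary(frame0: Frame, frame1: Frame, timesteps: list[float]) -> list[Frame]:
    """ATFI-style interface: the caller chooses any target times in (0, 1)."""
    return [blend(frame0, frame1, t) for t in timesteps]


def psnr(pred: Frame, target: Frame, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    sq_errors = [
        (p - q) ** 2
        for row_p, row_q in zip(pred, target)
        for p, q in zip(row_p, row_q)
    ]
    mse = sum(sq_errors) / len(sq_errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)
```

Under this sketch, a CTFI-style model needs recursive application to reach times such as t = 0.25, whereas an ATFI-style model takes t as a direct input; that difference in interface is the point of the example, not the blending itself.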
Problem

Research questions and friction points this paper is trying to address.

Surveying advances in Video Frame Interpolation techniques
Analyzing challenges like large motion and occlusion
Reviewing applications and future research directions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Comprehensive survey covering 250+ VFI papers
Systematic categorization of VFI methodologies
Analysis of key challenges and future directions
Dahyeon Kye
Department of Imaging Science, GSAIM, Chung-Ang University, Seoul, South Korea
Changhyun Roh
Department of Imaging Science, GSAIM, Chung-Ang University, Seoul, South Korea
Sukhun Ko
Chung-Ang University
Chanho Eom
Assistant Professor @Chung-Ang University
Computer Vision, Machine Learning, Artificial Intelligence
Jihyong Oh
Assistant Prof. @ Chung-Ang Univ. (CAU), PhD/MS/BS @ KAIST
Computer Vision, Image/Video Processing, Deep Learning, Gen AI