AI Summary
Current research on AI-generated video detection is hindered by limited-scale datasets, outdated generative models, insufficient semantic diversity, and the absence of a systematic benchmark. To address these limitations, this work introduces AIGVDBench, the first large-scale, high-quality, and technologically representative benchmark, encompassing 31 state-of-the-art generative models and over 440,000 videos. The authors conduct more than 1,500 evaluations across 33 detectors spanning four major categories, employing multidimensional metrics and eight in-depth analyses. This comprehensive study reveals four novel findings, identifies critical performance bottlenecks and generalization patterns, and publicly releases all data and code to foster systematic progress in the field.
Abstract
Recent advances in generative modeling can create remarkably realistic synthetic videos, making it increasingly difficult for humans to distinguish them from real ones and necessitating reliable detection methods. However, two key limitations hinder the development of this field. \textbf{From the dataset perspective}, existing datasets are often limited in scale and constructed using outdated or narrowly scoped generative models, making it difficult to capture the diversity and rapid evolution of modern generative techniques. Moreover, the dataset construction process frequently prioritizes quantity over quality, neglecting essential aspects such as semantic diversity, scenario coverage, and technological representativeness. \textbf{From the benchmark perspective}, current benchmarks largely stop at dataset creation, leaving many fundamental issues unexamined and in-depth analyses yet to be systematically carried out. To address these gaps, we propose AIGVDBench, a benchmark designed to be comprehensive and representative, covering \textbf{31} state-of-the-art generation models and over \textbf{440,000} videos. By executing more than \textbf{1,500} evaluations on \textbf{33} existing detectors belonging to four distinct categories, this work presents \textbf{8 in-depth analyses} from multiple perspectives and identifies \textbf{4 novel findings} that offer valuable insights for future research. We hope this work provides a solid foundation for advancing the field of AI-generated video detection. Our benchmark is open-sourced at https://github.com/LongMa-2025/AIGVDBench.