🤖 AI Summary
Existing video data distillation methods operate solely in pixel space and neglect the latent-space representations leveraged by text-to-video models, resulting in suboptimal compression efficiency and semantic fidelity. This work pioneers latent-space data distillation for videos, introducing the first end-to-end latent video distillation framework: (1) compact and semantically rich video latent representations are extracted via a variational autoencoder (VAE); (2) a diversity-aware sample selection strategy is designed to preserve representational coverage; and (3) a training-free, gradient-free latent data recompression method is developed to further reduce latent code size without performance degradation. Evaluated on HMDB51 (IPC 1) and MiniUCF (IPC 5), our approach achieves absolute accuracy gains of +2.6% and +7.8%, respectively, establishing new state-of-the-art performance in video data distillation across all standard benchmarks.
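The three-step framework above can be illustrated with a minimal sketch. The helper names (`encode_to_latents`, `diversity_aware_select`) and the greedy farthest-point criterion are assumptions for illustration, not the paper's exact algorithm; the actual selection strategy and VAE are described in the full text.

```python
import numpy as np

def encode_to_latents(videos, encoder):
    # Step (1): a hypothetical VAE encoder maps pixel-space videos
    # to compact latent codes (assumed interface).
    return np.stack([encoder(v) for v in videos])

def diversity_aware_select(latents, ipc):
    """Step (2), sketched as greedy farthest-point selection:
    pick `ipc` latents that cover the latent space broadly.
    This is a plausible stand-in for the paper's diversity-aware
    strategy, not its exact method."""
    n = len(latents)
    flat = latents.reshape(n, -1)
    # Seed with the most representative sample (closest to centroid).
    centroid = flat.mean(axis=0)
    selected = [int(np.argmin(np.linalg.norm(flat - centroid, axis=1)))]
    min_dist = np.linalg.norm(flat - flat[selected[0]], axis=1)
    while len(selected) < ipc:
        nxt = int(np.argmax(min_dist))  # farthest from all current picks
        selected.append(nxt)
        min_dist = np.minimum(min_dist,
                              np.linalg.norm(flat - flat[nxt], axis=1))
    return selected
```

Seeding with the centroid keeps the subset representative, while each greedy step maximizes coverage, matching the summary's twin goals of representativeness and diversity.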
📝 Abstract
Dataset distillation has demonstrated remarkable effectiveness in high-compression scenarios for image datasets. While video datasets inherently contain even greater redundancy than images, existing video dataset distillation methods primarily focus on compression in the pixel space, overlooking advances in the latent space that have been widely adopted in modern text-to-image and text-to-video models. In this work, we bridge this gap by introducing a novel video dataset distillation approach that operates in the latent space of a state-of-the-art variational autoencoder. Furthermore, we employ a diversity-aware data selection strategy to select samples that are both representative and diverse. Additionally, we introduce a simple, training-free method to further compress the distilled latent dataset. By combining these techniques, our approach achieves new state-of-the-art performance in dataset distillation, outperforming prior methods on all datasets, e.g., a 2.6% accuracy gain on HMDB51 at IPC 1 and a 7.8% gain on MiniUCF at IPC 5.
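The abstract's "simple, training-free" latent compression step could look like the sketch below, which uses uniform 8-bit quantization as one plausible gradient-free way to shrink latent codes. The quantization scheme is an assumption for illustration; the paper's actual recompression method may differ.

```python
import numpy as np

def quantize_latents(latents, bits=8):
    """Training-free, gradient-free recompression via uniform
    quantization: store latents as low-bit integers plus a
    per-tensor (min, max) range. Assumes max > min."""
    lo, hi = float(latents.min()), float(latents.max())
    levels = 2 ** bits - 1
    q = np.round((latents - lo) / (hi - lo) * levels).astype(np.uint8)
    return q, (lo, hi)

def dequantize_latents(q, lo_hi, bits=8):
    """Recover approximate float latents before decoding/training."""
    lo, hi = lo_hi
    levels = 2 ** bits - 1
    return q.astype(np.float32) / levels * (hi - lo) + lo
```

Because no optimization is involved, this step adds negligible cost while cutting latent storage from 32-bit floats to 8-bit codes, in the spirit of the summary's claim of reduced latent code size without retraining.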