AI Summary
This work addresses the high computational complexity of existing neural video codecs, which hinders their practical deployment. The authors propose DCVC-UF, a chunk-based joint coding framework that encodes a chunk of multiple frames into a single latent representation and decodes the frames simultaneously, using cross-frame interaction modules for joint spatiotemporal modeling and frame-specific decoders for parallel reconstruction. This chunk-level paradigm improves coding throughput and long-term temporal modeling, while a streamlined entropy coding mechanism consolidates bitstream interactions into a single step to reduce decoding overhead. Experimental results demonstrate that DCVC-UF achieves superior rate-distortion performance while encoding and decoding significantly faster than previous leading neural video codecs.
Abstract
While neural video codecs (NVCs) have demonstrated superior compression ratios, their prohibitive computational complexity remains a critical barrier to real-world deployment. This paper introduces a chunk-based coding framework designed to significantly improve the rate-distortion-complexity trade-off. Instead of processing frames sequentially, our approach encodes a chunk of multiple frames into a single compact latent representation and decodes them simultaneously. This is enabled by cross-frame interaction modules for joint spatial-temporal modeling and frame-specific decoders for parallel reconstruction. This paradigm not only dramatically enhances coding throughput but also facilitates more effective modeling of long-term temporal correlations. To further boost speed, we propose a streamlined entropy coding mechanism that consolidates bit-stream interactions into a single step, substantially reducing decoding overhead. Building on these innovations, we present DCVC-UF (Ultra-Fast), an NVC that sets a new state of the art (SOTA) in performance. Our experiments show that DCVC-UF can achieve ultra-fast encoding and decoding speeds, significantly outperforming previous leading codecs. DCVC-UF serves as a notable landmark in the journey of NVC evolution. The code is at https://github.com/microsoft/DCVC.
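To make the chunking idea concrete, the following is a minimal sketch of how a frame sequence might be grouped into chunks, and how the number of coding passes compares with frame-by-frame processing. It is an illustration only, not the DCVC-UF implementation: the function names `split_into_chunks` and `coding_passes`, the chunk size of 4, and the handling of a shorter trailing chunk are all assumptions, since the abstract does not specify them.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def split_into_chunks(frames: Sequence[Frame], chunk_size: int) -> List[List[Frame]]:
    """Group consecutive frames into chunks (hypothetical helper).

    The last chunk may be shorter than chunk_size; how the actual codec
    handles a partial chunk is not stated in the abstract.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [list(frames[i:i + chunk_size]) for i in range(0, len(frames), chunk_size)]


def coding_passes(num_frames: int, chunk_size: int) -> int:
    """Number of encode (or decode) passes when one pass covers one chunk."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return -(-num_frames // chunk_size)  # ceiling division


# Integers stand in for frames; chunk_size=4 is an arbitrary assumption.
frames = list(range(10))
chunks = split_into_chunks(frames, chunk_size=4)

sequential_passes = coding_passes(len(frames), chunk_size=1)  # one pass per frame
chunked_passes = coding_passes(len(frames), chunk_size=4)     # one pass per chunk
```

Under these assumptions, each chunk would be mapped to one latent representation and one entropy-coding step, so the pass count drops roughly in proportion to the chunk size. Actual speedups depend on the per-pass cost of the real model, which this sketch does not capture.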