🤖 AI Summary
To address the low sub-pixel motion compensation accuracy, high computational overhead, and inferior compression performance of learned video codecs relative to HEVC/VVC, this paper proposes three complementary optimizations: (1) replacing bilinear interpolation with a learnable high-order interpolation filter; (2) parameterizing motion information at the block level to reduce motion field redundancy; and (3) introducing a finite-precision motion vector modeling mechanism to minimize quantization error while preserving compensation accuracy. Evaluated within the Cool-chic framework, the proposed method achieves an average BD-rate reduction of 10.2% and reduces motion-compensation-related decoding computation from 391 to 214 MACs per pixel (a 45.3% decrease), significantly narrowing the performance gap with conventional codecs. The implementation is publicly available.
📝 Abstract
Motion compensation is a key component of video codecs. Conventional codecs (HEVC and VVC) have carefully refined this coding step, with an important focus on sub-pixel motion compensation. Learned codecs, on the other hand, achieve sub-pixel motion compensation through simple bilinear filtering. This paper proposes to improve learned codec motion compensation by drawing inspiration from conventional codecs. We show that using more advanced interpolation filters, block-based motion information, and finite motion accuracy leads to better compression performance and lower decoding complexity. Experimental results are provided on the Cool-chic video codec, where we demonstrate a rate decrease of more than 10% and a reduction of motion-related decoding complexity from 391 MACs per pixel to 214 MACs per pixel. All contributions are made open-source at https://github.com/Orange-OpenSource/Cool-Chic
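To make the contrast between bilinear filtering and a longer interpolation filter concrete, here is a minimal 1-D sketch in plain Python. It is not the Cool-chic implementation: the function names (`quantize_mv`, `warp_1d`, and so on) are hypothetical, the 4-tap Catmull-Rom cubic is only a stand-in for the "more advanced interpolation filters" mentioned above, and quarter-pixel precision is an assumed example of finite motion accuracy.

```python
import math

def quantize_mv(mv, steps_per_pixel=4):
    """Round a motion component to finite precision (assumed: 1/4 pixel)."""
    return round(mv * steps_per_pixel) / steps_per_pixel

def _sample(signal, i):
    """Read signal[i], replicating the edge samples outside the borders."""
    return signal[min(max(i, 0), len(signal) - 1)]

def bilinear_weights(frac):
    """2-tap linear filter for a fractional offset frac in [0, 1)."""
    return [1.0 - frac, frac]

def catmull_rom_weights(frac):
    """4-tap Catmull-Rom cubic filter (illustrative stand-in, not the paper's)."""
    t, t2, t3 = frac, frac ** 2, frac ** 3
    return [
        0.5 * (-t3 + 2 * t2 - t),
        0.5 * (3 * t3 - 5 * t2 + 2),
        0.5 * (-3 * t3 + 4 * t2 + t),
        0.5 * (t3 - t2),
    ]

def warp_1d(reference, mv, weights_fn, first_tap):
    """Predict sample x from the reference signal at position x + mv.

    first_tap is the offset of the first filter tap relative to the
    integer part of the displaced position (0 for 2 taps, -1 for 4 taps).
    """
    base = math.floor(mv)          # integer part of the displacement
    frac = mv - base               # sub-pixel part, in [0, 1)
    weights = weights_fn(frac)     # one multiply-accumulate per tap
    out = []
    for x in range(len(reference)):
        acc = 0.0
        for k, w in enumerate(weights):
            acc += w * _sample(reference, x + base + first_tap + k)
        out.append(acc)
    return out

# Usage: shift a quadratic ramp by a motion of 0.3 pixel, stored at 1/4 precision.
reference = [float(x * x) for x in range(16)]
mv = quantize_mv(0.3)
pred_bilinear = warp_1d(reference, mv, bilinear_weights, first_tap=0)
pred_cubic = warp_1d(reference, mv, catmull_rom_weights, first_tap=-1)
```

On this quadratic ramp the 4-tap filter reproduces the shifted signal exactly at interior samples, while the 2-tap bilinear filter leaves a constant interpolation error. Restricting the motion to a finite set of fractional positions also means the filter weights could be precomputed once per position, as fixed-filter conventional codecs do; whether and how Cool-chic exploits this is not stated in the abstract.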