AI Summary
This work addresses the coding efficiency limitation in Versatile Video Coding (VVC) caused by the high bit overhead of geometric partitioning (GEO) side information. To mitigate this issue, the authors propose a spatio-temporal correlation guided geometric partitioning (STGEO) scheme that predicts and codes the partitioning mode and motion information more cheaply. Specifically, high-probability STGEO modes are predicted from edge information and the history modes of adjacent STGEO-coded blocks, and the merge candidate list is adaptively inferred from offline-trained merge candidate selection probabilities. These predictions guide the entropy coding so that the high-probability modes and motion candidates are represented with fewer bits. Experimental results show that, compared to VTM-8.0 without GEO, the proposed approach achieves average bit-rate savings of 0.95% under the Random Access configuration and 1.98% under the Low-Delay B configuration.
Abstract
Geometric partitioning has attracted increasing attention for its remarkable motion field description capability in the hybrid video coding framework. However, the existing geometric partitioning (GEO) scheme in Versatile Video Coding (VVC) imposes a non-negligible burden for signaling the side information, which limits the coding efficiency. In view of this, we propose a spatio-temporal correlation guided geometric partitioning (STGEO) scheme to efficiently describe the object information in the motion field of video coding. The proposed method reduces the bits consumed for side information signaling, including the partitioning mode and motion information. We first analyze the statistical characteristics of partitioning mode decision and motion vector selection. Based on the observed spatio-temporal correlation, we design a mode prediction and coding method to reduce the overhead for representing the above-mentioned side information. The main idea is to predict the STGEO modes and motion candidates that have higher selection probabilities, which can guide the entropy coding, i.e., representing the predicted high-probability modes and motion candidates with fewer bits. In particular, the high-probability STGEO modes are predicted based on the edge information and history modes of adjacent STGEO-coded blocks. The corresponding motion information is represented by the index in a merge candidate list, which is adaptively inferred based on the offline-trained merge candidate selection probability. Simulation results show that the proposed approach achieves 0.95% and 1.98% bit-rate savings on average compared to VTM-8.0 without GEO for Random Access and Low-Delay B configurations, respectively.
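The idea of spending fewer bits on predicted high-probability modes can be illustrated with a minimal sketch. It is not the binarization used in the paper or in VTM: the flag-plus-index scheme, the function names `index_bits`, `build_codebook` and `expected_bits`, the mode count of 64, the predicted mode list, and the probabilities are all assumptions chosen for illustration.

```python
def index_bits(index, count):
    """Fixed-length binary string for `index` among `count` entries."""
    width = (count - 1).bit_length()  # 0 bits when there is a single entry
    return format(index, "b").zfill(width) if width else ""


def build_codebook(num_modes, predicted_modes):
    """Assign short codes to predicted modes and longer codes to the rest.

    Hypothetical scheme: a 1-bit flag says whether the mode is in the
    predicted list, followed by a fixed-length index into that list
    (flag 1) or into the remaining modes (flag 0).
    """
    predicted = list(dict.fromkeys(predicted_modes))  # dedupe, keep order
    rest = [m for m in range(num_modes) if m not in predicted]
    codebook = {}
    for i, mode in enumerate(predicted):
        codebook[mode] = "1" + index_bits(i, len(predicted))
    for i, mode in enumerate(rest):
        codebook[mode] = "0" + index_bits(i, len(rest))
    return codebook


def expected_bits(codebook, probabilities):
    """Average code length under a given mode probability table."""
    return sum(p * len(codebook[m]) for m, p in probabilities.items())


if __name__ == "__main__":
    NUM_MODES = 64               # illustrative value, not taken from the paper
    predicted = [12, 13, 40, 5]  # e.g. modes suggested by edges and neighbors
    codebook = build_codebook(NUM_MODES, predicted)
    print("predicted mode 12:", codebook[12])
    print("other mode 0:    ", codebook[0])
```

With these assumed values, a predicted mode costs 3 bits and any other mode costs 7 bits, against 6 bits for a plain fixed-length index over 64 modes. The scheme only pays off when the prediction is right often enough, which is why the abstract ties it to the observed spatio-temporal correlation.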