🤖 AI Summary
Image/video coding research lacks high-quality test datasets whose content is both diverse and controllable. Method: This paper introduces USTC-TD, a test dataset of 40 4K images and 10 1080p videos spanning diverse scenes, textures, motion patterns, and imaging conditions. The dataset is characterized quantitatively along four dimensions (spatial, temporal, color, and lightness features) and compared against earlier test datasets on the same features. Conventional standardized codecs and learned codecs are both evaluated on it with objective metrics (PSNR, MS-SSIM, VMAF) and a subjective metric, the Mean Opinion Score (MOS). Contribution/Results: USTC-TD was adopted in the end-to-end image/video coding challenge of IEEE VCIP in 2022 and 2023, and the paper reports an extensive benchmark of the evaluated codecs on it. The dataset and benchmark are intended as a reference for future image/video coding research, development, and standardization.
📝 Abstract
Image/video coding has been an active research area in both academia and industry for many years. Test datasets, especially high-quality image/video datasets, are needed for the sound evaluation of coding-related research, practical applications, and standardization activities. We put forward a test dataset named USTC-TD, which has been adopted in the practical end-to-end image/video coding challenge of the IEEE International Conference on Visual Communications and Image Processing (VCIP) in 2022 and 2023. USTC-TD contains 40 images at 4K spatial resolution and 10 video sequences at 1080p spatial resolution, featuring varied content due to diverse environmental factors (e.g. scene type, texture, motion, view) and designed imaging factors (e.g. illumination, lens, shadow). We quantitatively evaluate USTC-TD on different image/video features (spatial, temporal, color, lightness) and compare it with previous image/video test datasets, showing that it compensates for shortcomings of the existing datasets. We also evaluate both classic standardized and recent learned image/video coding schemes on USTC-TD using objective quality metrics (PSNR, MS-SSIM, VMAF) and a subjective quality metric (MOS), providing an extensive benchmark for the evaluated schemes. Based on the characteristics and specific design of the proposed test dataset, we analyze the benchmark performance and shed light on the future research and development of image/video coding. All the data are released online: https://esakak.github.io/USTC-TD.
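For readers unfamiliar with the objective metrics named above, the following is a minimal sketch of PSNR, the simplest of them, using its standard definition from mean squared error. It is not the paper's evaluation code: the function name `psnr`, the flat-list input format, and the 8-bit peak value of 255 are assumptions made for illustration only.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Return PSNR in dB between two equal-length sequences of pixel values.

    Hypothetical helper for illustration; assumes 8-bit samples by default.
    """
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    if not reference:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        # Identical signals: PSNR is unbounded.
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)


# Toy example: four luma samples before and after lossy coding.
ref = [52, 55, 61, 66]
rec = [54, 55, 60, 65]
print(f"PSNR: {psnr(ref, rec):.2f} dB")
```

In practice PSNR is computed per image or per frame (often per color channel) and then averaged. MS-SSIM and VMAF need dedicated implementations, and MOS comes from human ratings rather than computation.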