USTC-TD: A Test Dataset and Benchmark for Image and Video Coding in 2020s

📅 2024-09-13
🏛️ arXiv.org
📈 Citations: 5
Influential: 0
🤖 AI Summary
The 2020s have seen a shortage of high-quality, controllable test datasets for image/video coding research. Method: This paper introduces USTC-TD, a curated benchmark of 40 images at 4K resolution and 10 video sequences at 1080p, spanning diverse scenes, textures, motion patterns, and imaging conditions. The dataset is characterized quantitatively along four dimensions (spatial complexity, temporal dynamics, color, lightness), improving both content diversity and imaging controllability over prior datasets. A unified subjective/objective evaluation framework combines PSNR, MS-SSIM, VMAF, and mean opinion score (MOS), and covers both conventional standardized and learned codecs. Contribution/Results: USTC-TD was adopted as the test dataset for the practical end-to-end image/video coding challenges at IEEE VCIP 2022 and 2023, where it enabled performance validation of state-of-the-art codecs. It serves as an assessment resource and technical reference for next-generation video coding standardization and algorithm development.

📝 Abstract
Image/video coding has been a remarkable research area for both academia and industry for many years. Test datasets, especially high-quality image/video datasets, are desirable for the justified evaluation of coding-related research, practical applications, and standardization activities. We put forward a test dataset, namely USTC-TD, which has been successfully adopted in the practical end-to-end image/video coding challenge of the IEEE International Conference on Visual Communications and Image Processing (VCIP) in 2022 and 2023. USTC-TD contains 40 images at 4K spatial resolution and 10 video sequences at 1080p spatial resolution, featuring varied content due to diverse environmental factors (e.g., scene type, texture, motion, view) and designed imaging factors (e.g., illumination, lens, shadow). We quantitatively evaluate USTC-TD on different image/video features (spatial, temporal, color, lightness) and compare it with previous image/video test datasets, which verifies that it compensates well for the shortcomings of existing datasets. We also evaluate both classic standardized and recently learned image/video coding schemes on USTC-TD using objective quality metrics (PSNR, MS-SSIM, VMAF) and a subjective quality metric (MOS), providing an extensive benchmark for these evaluated schemes. Based on the characteristics and specific design of the proposed test dataset, we analyze the benchmark performance and shed light on the future research and development of image/video coding. All the data are released online: https://esakak.github.io/USTC-TD.
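The objective metrics named in the abstract are standard signal-fidelity measures. As a hedged illustration only (not the paper's exact evaluation pipeline), PSNR between a reference and a distorted image can be sketched as:

```python
import numpy as np

def psnr(ref: np.ndarray, dist: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between reference and distorted images."""
    mse = np.mean((ref.astype(np.float64) - dist.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy example (hypothetical data): a reference frame vs. a noisy copy.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noisy = np.clip(ref.astype(np.int16) + rng.integers(-5, 6, size=ref.shape),
                0, 255).astype(np.uint8)
print(f"PSNR: {psnr(ref, noisy):.2f} dB")
```

MS-SSIM and VMAF are perceptually oriented and considerably more involved; in practice they are usually computed with established tools rather than reimplemented.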
Problem

Research questions and friction points this paper is trying to address.

High-quality test datasets for image/video coding evaluation are scarce.
Existing datasets insufficiently cover diverse content and imaging conditions.
A unified objective/subjective benchmark for both classic and learned codecs is needed.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed USTC-TD for image/video coding evaluation
Includes 4K images and 1080p videos with diverse content
Benchmarked using PSNR, MS-SSIM, VMAF, and MOS metrics
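The spatial/temporal content characterization mentioned above is commonly done with the ITU-T P.910 spatial information (SI) and temporal information (TI) measures. The sketch below illustrates that general approach under stated assumptions; it is not claimed to be the paper's exact procedure:

```python
import numpy as np

def _conv3x3_valid(img: np.ndarray, k: np.ndarray) -> np.ndarray:
    """3x3 cross-correlation with 'valid' borders (sufficient for Sobel filtering)."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def spatial_information(frame: np.ndarray) -> float:
    """Per-frame SI: std-dev of the Sobel gradient magnitude (ITU-T P.910 style)."""
    f = frame.astype(np.float64)
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    gx = _conv3x3_valid(f, kx)
    gy = _conv3x3_valid(f, kx.T)
    return float(np.std(np.hypot(gx, gy)))

def temporal_information(prev: np.ndarray, curr: np.ndarray) -> float:
    """TI for a frame pair: std-dev of the pixel-wise difference (ITU-T P.910 style)."""
    return float(np.std(curr.astype(np.float64) - prev.astype(np.float64)))

# A flat frame has zero spatial detail; identical frames have zero temporal change.
flat = np.full((32, 32), 128, dtype=np.uint8)
print(spatial_information(flat), temporal_information(flat, flat))  # → 0.0 0.0
```

P.910 takes the maximum of these per-frame values over a sequence, which gives a single SI/TI point per clip for the kind of dataset-coverage plots such comparisons rely on.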
Zhuoyuan Li
University of Science and Technology of China (USTC)
Video Coding, Inter/Intra Prediction, In-Loop Filtering, Learned Compression
Junqi Liao
University of Science and Technology of China
video coding, reinforcement learning
Chuanbo Tang
University of Science and Technology of China
video compression, image compression
Haotian Zhang
MOE Key Laboratory of Brain-Inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei 230027, China
Yuqi Li
MOE Key Laboratory of Brain-Inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei 230027, China
Yifan Bian
University of Science and Technology of China
Deep learning, end-to-end image/video compression
Xihua Sheng
University of Science and Technology of China → City University of Hong Kong
Video coding, Image coding, Point Cloud coding
Xinmin Feng
University of Science and Technology of China
video compression
Yao Li
MOE Key Laboratory of Brain-Inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei 230027, China
Changsheng Gao
Nanyang Technological University
video coding, feature coding, coding for machines
Li Li
MOE Key Laboratory of Brain-Inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei 230027, China
Dong Liu
MOE Key Laboratory of Brain-Inspired Intelligent Perception and Cognition, University of Science and Technology of China, Hefei 230027, China
Feng Wu
National University of Singapore
Machine Learning, Medical Time Series