🤖 AI Summary
This work addresses the limitations of existing real-time super-resolution methods in compressed video streaming scenarios, where performance is often degraded and benchmarking resources are scarce. To bridge this gap, we introduce StreamSR, a new dataset derived from YouTube videos encompassing diverse content and resolutions, and present a systematic evaluation of eleven state-of-the-art real-time super-resolution models. Furthermore, we propose EfRLFN, an efficient model that integrates hyperbolic tangent activation with a lightweight channel attention mechanism, enhanced by a composite loss function and architectural refinements. EfRLFN achieves significant visual quality improvements while maintaining real-time inference speeds. Experimental results demonstrate that fine-tuning on StreamSR consistently yields notable performance gains across multiple benchmarks. The dataset, code, and evaluation framework are publicly released to support future research.
📝 Abstract
Recent advancements in real-time super-resolution have enabled higher-quality video streaming, yet existing methods struggle with the unique challenges of compressed video content. Commonly used datasets do not accurately reflect the characteristics of streaming media, limiting the relevance of current benchmarks. To address this gap, we introduce a comprehensive dataset, StreamSR, sourced from YouTube and covering a wide range of video genres and resolutions representative of real-world streaming scenarios. We benchmark 11 state-of-the-art real-time super-resolution models to evaluate their performance for the streaming use case. Furthermore, we propose EfRLFN, an efficient real-time model that integrates Efficient Channel Attention and a hyperbolic tangent activation function, a novel design choice in the context of real-time super-resolution. We extensively optimized the architecture to maximize efficiency and designed a composite loss function that improves training convergence. EfRLFN combines the strengths of existing architectures while improving both visual quality and runtime performance. Finally, we show that fine-tuning other models on our dataset results in significant performance gains that generalize well across various standard benchmarks. We make the dataset, the code, and the benchmark available at https://github.com/EvgeneyBogatyrev/EfRLFN.
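The abstract describes EfRLFN's core design only at a high level: an Efficient Channel Attention (ECA) gate combined with a hyperbolic tangent activation. As a rough illustration, here is a minimal NumPy sketch of what such a block could look like. The kernel size, the placement of the tanh, and the convolution weights (uniform here; learned in practice) are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def eca_tanh_block(x, k=3):
    """Hypothetical sketch: Efficient Channel Attention followed by tanh.

    x: feature map of shape (C, H, W).
    k: 1-D kernel size for cross-channel interaction (assumed; ECA derives
       it adaptively from the channel count).
    """
    C, H, W = x.shape
    # 1. Channel descriptor via global average pooling -> shape (C,).
    y = x.mean(axis=(1, 2))
    # 2. Local cross-channel interaction: 1-D convolution of size k over
    #    the channel dimension (edge padding keeps the length at C).
    pad = k // 2
    y_pad = np.pad(y, pad, mode="edge")
    kernel = np.full(k, 1.0 / k)  # placeholder weights; learned in a real model
    attn = np.array([np.dot(y_pad[i:i + k], kernel) for i in range(C)])
    # 3. Sigmoid gate, broadcast over spatial dims to rescale each channel.
    gate = 1.0 / (1.0 + np.exp(-attn))
    x = x * gate[:, None, None]
    # 4. tanh nonlinearity in place of the usual ReLU/GELU.
    return np.tanh(x)
```

ECA's appeal for real-time models is its cost: the gate needs only a global average pool and a k-tap 1-D convolution, avoiding the fully connected bottleneck of squeeze-and-excitation attention, while tanh keeps activations bounded in (-1, 1).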