ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering

📅 2025-06-14
🤖 AI Summary
To address high inference latency in neural networks for real-time rendering—where balancing image quality and frame rate remains challenging—this paper proposes a hierarchical feature caching mechanism that leverages temporal coherence. It is the first work to systematically integrate encoder-decoder intermediate-layer feature caching into real-time rendering pipelines. The authors design three core components: cache hit prediction, adaptive refresh, and multi-policy scheduling. Evaluated on denoising, super-resolution, and frame extrapolation tasks, the method achieves an average 1.4× speedup while incurring less than 0.1 dB PSNR degradation and preserving perceptual quality. The approach is architecture-agnostic—compatible with mainstream neural rendering frameworks—and requires no modification to network topology. By eliminating redundant computation across frames, it establishes an efficient, general-purpose paradigm for reusing intermediate representations in real-time neural rendering.

📝 Abstract
Graphics rendering applications increasingly leverage neural networks in tasks such as denoising, supersampling, and frame extrapolation to improve image quality while maintaining frame rates. The temporal coherence inherent in these tasks presents an opportunity to reuse intermediate results from previous frames and avoid redundant computations. Recent work has shown that caching intermediate features to be reused in subsequent inferences is an effective method to reduce latency in diffusion models. We extend this idea to real-time rendering and present ReFrame, which explores different caching policies to optimize trade-offs between quality and performance in rendering workloads. ReFrame can be applied to a variety of encoder-decoder style networks commonly found in rendering pipelines. Experimental results show that we achieve 1.4x speedup on average with negligible quality loss in three real-time rendering tasks. Code available: https://ubc-aamodt-group.github.io/reframe-layer-caching/
Problem

Research questions and friction points this paper is trying to address.

Optimize layer caching for real-time rendering acceleration
Reduce redundant computations in neural network-based rendering tasks
Balance quality and performance in encoder-decoder rendering networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Layer caching for accelerated inference
Reuses intermediate results from previous frames
Optimizes quality-performance trade-offs
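One way the quality-performance trade-off could be managed is an adaptive refresh policy that recomputes cached features only when the input changes beyond a threshold. This is a hedged sketch of the general idea; the `AdaptiveCache` name, L1 change metric, and threshold value are assumptions, not ReFrame's implementation.

```python
# Hypothetical adaptive refresh: reuse cached features while the input
# stays close to the one that produced them; recompute on a large change.
# Metric and threshold are illustrative assumptions.

def l1_change(prev, cur):
    # Mean absolute difference between two flat feature/input vectors.
    return sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)

class AdaptiveCache:
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.last_input = None
        self.cached = None

    def get_or_compute(self, compute_fn, x):
        stale = (self.cached is None
                 or l1_change(self.last_input, x) > self.threshold)
        if stale:
            self.cached = compute_fn(x)
            self.last_input = x
        return self.cached


calls = {"n": 0}

def expensive_layer(x):
    calls["n"] += 1
    return [v + 1.0 for v in x]

cache = AdaptiveCache(threshold=0.5)
# Small jitter between frames reuses the cache; a scene cut recomputes.
for x in ([0.0], [0.1], [0.2], [2.0], [2.1]):
    out = cache.get_or_compute(expensive_layer, x)

# Only the first frame and the jump to 2.0 trigger recomputation: 2 calls.
```

A fixed interval bounds staleness predictably, while change-driven refresh spends computation only where temporal coherence breaks; a scheduler can mix such policies per task.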