🤖 AI Summary
Existing implicit neural representation (INR)-based image compression methods suffer from weak expressiveness of latent variables, underperforming end-to-end (E2E) approaches; conversely, E2E methods rely on entropy coding to transmit latent codes, incurring high decoding complexity. To address this trade-off, we propose a novel compression paradigm that eliminates the need to transmit latent codes. Our method leverages a shared random seed to deterministically generate multi-scale Gaussian noise tensors. A learnable Gaussian parameter prediction module—combined with the reparameterization trick—enables deterministic reconstruction of image-specific latent variables, which are then fed into an INR network for image synthesis. This work is the first to introduce noise-driven, learned latent variable generation into image compression. Evaluated on the Kodak and CLIC benchmarks, our approach achieves state-of-the-art rate-distortion performance while significantly reducing decoding overhead.
📝 Abstract
Recent implicit neural representation (INR)-based image compression methods have shown competitive performance by overfitting image-specific latent codes. However, they remain inferior to end-to-end (E2E) compression approaches due to the absence of expressive latent representations. On the other hand, E2E methods must transmit latent codes and require complex entropy models, leading to increased decoding complexity. Inspired by the normalization strategy in E2E codecs, where latents are transformed into Gaussian noise to demonstrate the removal of spatial redundancy, we explore the inverse direction: generating latents directly from Gaussian noise. In this paper, we propose a novel image compression paradigm that reconstructs image-specific latents from a multi-scale Gaussian noise tensor, deterministically generated using a shared random seed. A Gaussian Parameter Prediction (GPP) module estimates the distribution parameters, enabling one-shot latent generation via the reparameterization trick. The predicted latent is then passed through a synthesis network to reconstruct the image. Our method eliminates the need to transmit latent codes while preserving latent-based benefits, achieving competitive rate-distortion performance on the Kodak and CLIC datasets. To the best of our knowledge, this is the first work to explore Gaussian latent generation for learned image compression.
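The core pipeline described above — seeded noise generation, GPP parameter estimation, and reparameterized latent reconstruction — can be sketched as follows. This is a minimal illustration of the data flow only: the function names, tensor shapes, and the placeholder GPP (which in the paper is a learned network predicting per-element means and scales) are all assumptions, not the authors' implementation.

```python
import numpy as np

SEED = 42  # shared random seed, known to both encoder and decoder


def multiscale_noise(seed, shapes):
    """Deterministically generate multi-scale Gaussian noise tensors
    from a shared seed; both sides reproduce identical tensors."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal(s).astype(np.float32) for s in shapes]


def gpp(noise):
    """Stand-in for the Gaussian Parameter Prediction (GPP) module.
    A learned network would predict (mu, sigma) conditioned on the
    transmitted model weights; fixed values here just show the flow."""
    mu = np.zeros_like(noise)
    sigma = np.ones_like(noise)
    return mu, sigma


def generate_latent(seed, shape):
    """One-shot latent generation via the reparameterization trick:
    z = mu + sigma * eps, with eps drawn deterministically from seed,
    so no latent code ever needs to be transmitted."""
    eps = multiscale_noise(seed, [shape])[0]
    mu, sigma = gpp(eps)
    return mu + sigma * eps


# The decoder regenerates the identical latent from the seed alone;
# z would then be fed to the synthesis (INR) network.
z = generate_latent(SEED, (16, 16, 192))
```

Because the noise is a deterministic function of the seed, the rate cost covers only the (overfitted) network parameters, which is what removes entropy coding of latents from the decoding path.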