🤖 AI Summary
Existing vector quantization (VQ) methods for generative super-resolution suffer from two key limitations: (1) coarse-grained quantization of all visual features, leading to significant texture degradation and high quantization distortion; and (2) index predictors trained solely with codebook-level supervision, neglecting end-to-end reconstruction fidelity. To address these, we propose Texture Vector Quantization (TVQ) and Reconstruction-Aware Prediction (RAP). TVQ selectively models only high-frequency missing textures—bypassing low-frequency content—thereby substantially reducing quantization artifacts. RAP incorporates a straight-through estimator to enable end-to-end, image-level supervision, directly optimizing index prediction for reconstruction quality. Our approach achieves marked improvements in texture realism and structural consistency of super-resolved images with minimal computational overhead. Extensive experiments demonstrate state-of-the-art performance over prevailing VQ-based methods across multiple benchmarks, validating the synergistic enhancement of prior modeling accuracy and final reconstruction quality.
📝 Abstract
Vector-quantization based models have recently demonstrated strong potential for visual prior modeling. However, existing VQ-based methods simply encode visual features with the nearest codebook items and train the index predictor with code-level supervision. Due to the richness of visual signals, such VQ encoding often leads to large quantization errors. Furthermore, training the predictor with code-level supervision does not take the final reconstruction error into consideration, resulting in sub-optimal prior modeling accuracy. In this paper, we address the above two issues and propose a Texture Vector-Quantization strategy and a Reconstruction-Aware Prediction strategy. The texture vector-quantization strategy leverages the task characteristics of super-resolution and introduces a codebook only to model the prior of the missing textures, while the reconstruction-aware prediction strategy makes use of the straight-through estimator to directly train the index predictor with image-level supervision. Our proposed generative SR model (TVQ&RAP) delivers photo-realistic SR results at a small computational cost.
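As an illustrative sketch (not the paper's implementation), the two mechanisms the abstract refers to can be written compactly: nearest-codebook encoding is an argmin over codebook distances, and the straight-through estimator (STE) passes gradients through that non-differentiable argmin so an image-level loss can reach the index predictor. The function and variable names below are hypothetical.

```python
import numpy as np

def vq_encode(z, codebook):
    """Nearest-codebook quantization: replace each feature vector in z
    (shape (N, D)) with its closest entry in codebook (shape (K, D))."""
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    indices = dists.argmin(axis=1)
    return codebook[indices], indices

# The straight-through estimator cannot be shown with plain NumPy, since it
# is a gradient rule: the forward pass uses the quantized features, while
# the backward pass treats quantization as the identity, i.e. in an
# autograd framework one would write
#     z_q = z + stop_gradient(vq_encode(z, codebook)[0] - z)
# so that an image-level reconstruction loss can supervise the index
# predictor end to end, as the abstract describes.

rng = np.random.default_rng(0)
codebook = rng.standard_normal((16, 4))  # 16 codebook entries, 4-dim features
z = rng.standard_normal((8, 4))          # 8 feature vectors to quantize
z_q, idx = vq_encode(z, codebook)        # z_q rows are exact codebook entries
```

The quantization error the abstract mentions is simply `((z - z_q) ** 2).sum()`; restricting the codebook to high-frequency texture residuals (as TVQ does) shrinks the signal the codebook must cover, and hence this error.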