Beyond Stationarity: Rethinking Codebook Collapse in Vector Quantization

📅 2026-02-21

🤖 AI Summary
This work addresses codebook collapse in vector quantization, where a large fraction of code vectors becomes inactive during training. The paper attributes the phenomenon to the non-stationarity of encoder updates: as the encoder drifts, unselected code vectors receive no updates and fall out of use. To mitigate this, it proposes two methods: Non-Stationary Vector Quantization (NSVQ), a kernel-based update rule that propagates encoder drift to non-selected codes, and Transformer-based Vector Quantization (TransVQ), a lightweight mapping that transforms the entire codebook while preserving convergence to the k-means solution. Experiments on the CelebA-HQ dataset show that both methods achieve near-complete codebook utilization and better reconstruction quality than baseline VQ variants.

📝 Abstract
Vector Quantization (VQ) underpins many modern generative frameworks such as VQ-VAE, VQ-GAN, and latent diffusion models. Yet, it suffers from the persistent problem of codebook collapse, where a large fraction of code vectors remains unused during training. This work provides a new theoretical explanation by identifying the nonstationary nature of encoder updates as the fundamental cause of this phenomenon. We show that as the encoder drifts, unselected code vectors fail to receive updates and gradually become inactive. To address this, we propose two new methods: Non-Stationary Vector Quantization (NSVQ), which propagates encoder drift to non-selected codes through a kernel-based rule, and Transformer-based Vector Quantization (TransVQ), which employs a lightweight mapping to adaptively transform the entire codebook while preserving convergence to the k-means solution. Experiments on the CelebA-HQ dataset demonstrate that both methods achieve near-complete codebook utilization and superior reconstruction quality compared to baseline VQ variants, providing a principled and scalable foundation for future VQ-based generative models. The code is available at: https://github.com/CAIR-LAB-WFUSM/NSVQ-TransVQ.git
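
The mechanism described in the abstract can be sketched on a toy codebook. The following is a minimal illustration, not the authors' implementation: every function name is hypothetical, and the Gaussian kernel with normalized weights in `propagate_drift` is an assumption chosen for simplicity, since the abstract does not spell out the actual NSVQ rule. It shows that a plain k-means style step leaves an unselected code where it was, and how a kernel-weighted rule could carry the drift of the selected codes over to it.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(codes, latents):
    """Nearest-code index for every encoder output (standard VQ lookup)."""
    return [min(range(len(codes)), key=lambda k: sq_dist(codes[k], z))
            for z in latents]

def utilization(assignments, n_codes):
    """Fraction of the codebook that was selected at least once."""
    return len(set(assignments)) / n_codes

def kmeans_step(codes, latents, assignments):
    """Move each selected code to the mean of its assigned latents.
    Unselected codes are returned unchanged, which is how they go stale."""
    new_codes = [c[:] for c in codes]
    for k in set(assignments):
        members = [z for z, a in zip(latents, assignments) if a == k]
        new_codes[k] = [sum(col) / len(members) for col in zip(*members)]
    return new_codes

def propagate_drift(old_codes, new_codes, selected, bandwidth=1.0):
    """ASSUMPTION: illustrative kernel rule, not the paper's NSVQ update.
    Each unselected code is shifted by a Gaussian-kernel-weighted average
    of the displacements of the selected codes."""
    out = [c[:] for c in new_codes]
    selected = sorted(set(selected))
    for j in range(len(old_codes)):
        if j in selected:
            continue
        weights = [math.exp(-sq_dist(old_codes[j], old_codes[k])
                            / (2 * bandwidth ** 2)) for k in selected]
        total = sum(weights)
        if total == 0.0:  # kernel underflowed: leave this code untouched
            continue
        for d in range(len(out[j])):
            shift = sum(w * (new_codes[k][d] - old_codes[k][d])
                        for w, k in zip(weights, selected))
            out[j][d] += shift / total
    return out

# Toy batch: the encoder outputs have drifted to the right of codes 0 and 1,
# and code 2 sits far enough away that nothing selects it.
codes = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
latents = [[0.4, 0.0], [0.6, 0.0], [2.4, 0.0], [2.6, 0.0]]

assignments = assign(codes, latents)
stepped = kmeans_step(codes, latents, assignments)
moved = propagate_drift(codes, stepped, assignments, bandwidth=1.0)

print("utilization:", utilization(assignments, len(codes)))
print("code 2 after k-means step:", stepped[2])
print("code 2 after drift propagation:", moved[2])
```

In this toy batch only two of the three codes are selected, so the k-means step never touches the third. The propagation step moves it by the same offset as its neighbours, which keeps it near the drifting encoder outputs. The normalized weighting used here shifts even distant codes by a full average displacement; an unnormalized kernel would instead damp the shift with distance, and which behaviour the paper adopts is not stated in the abstract.
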
Problem

Research questions and friction points this paper is trying to address.

codebook collapse
vector quantization
nonstationarity
encoder drift
generative models
Innovation

Methods, ideas, or system contributions that make the work stand out.

codebook collapse
nonstationary vector quantization
encoder drift
Transformer-based VQ
codebook utilization