Cat: Post-training quantization error reduction via cluster-based affine transformation

📅 2025-09-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
Low-bit post-training quantization (PTQ) often incurs substantial accuracy degradation, which conventional uniform affine transformations can actually worsen. To address this, we propose Cluster-based Affine Transformation (CAT), a method that learns cluster-specific affine parameters for distinct output clusters, thereby aligning quantized and full-precision output distributions with near-zero parameter overhead. CAT operates as a plug-and-play module requiring no fine-tuning or retraining, enabling seamless integration into existing PTQ pipelines. On ImageNet-1K, CAT achieves 53.18% Top-1 accuracy for W2A2 ResNet-18, outperforming prior PTQ methods, and improves existing PTQ baselines by more than 3% when used as a plug-in. It demonstrates consistent robustness across diverse architectures and quantization configurations. The core innovation lies in coupling clustering analysis with cluster-level affine calibration, effectively mitigating distribution mismatch, a critical challenge in low-bit PTQ.

📝 Abstract
Post-Training Quantization (PTQ) reduces the memory footprint and computational overhead of deep neural networks by converting full-precision (FP) values into quantized and compressed data types. While PTQ is more cost-efficient than Quantization-Aware Training (QAT), it is highly susceptible to accuracy degradation under a low-bit quantization (LQ) regime (e.g., 2-bit). Affine transformation is a classical technique used to reduce the discrepancy between the information processed by a quantized model and that processed by its full-precision counterpart; however, we find that using plain affine transformation, which applies a uniform affine parameter set for all outputs, worsens the results in low-bit PTQ. To address this, we propose Cluster-based Affine Transformation (CAT), an error-reduction framework that employs cluster-specific parameters to align LQ outputs with FP counterparts. CAT refines LQ outputs with only a negligible number of additional parameters, without requiring fine-tuning of the model or quantization parameters. We further introduce a novel PTQ framework integrated with CAT. Experiments on ImageNet-1K show that this framework consistently outperforms prior PTQ methods across diverse architectures and LQ settings, achieving up to 53.18% Top-1 accuracy on W2A2 ResNet-18. Moreover, CAT enhances existing PTQ baselines by more than 3% when used as a plug-in. We plan to release our implementation alongside the publication of this paper.
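The abstract describes fitting cluster-specific affine parameters so that low-bit outputs better match their full-precision counterparts. Below is a minimal NumPy sketch of that idea, not the paper's exact algorithm: it clusters calibration samples with a tiny k-means over full-precision outputs, then fits one scalar (scale, shift) pair per cluster by least squares. All function names and the scalar-per-cluster parameterization are illustrative assumptions.

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Tiny k-means for illustration; x has shape (n_samples, dim)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(x[:, None, :] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = x[labels == c].mean(axis=0)
    return labels, centers

def fit_cluster_affine(q_out, fp_out, labels, k):
    """Per cluster, fit (scale, shift) minimizing ||scale*q + shift - fp||^2."""
    params = []
    for c in range(k):
        qc = q_out[labels == c].ravel()
        fc = fp_out[labels == c].ravel()
        A = np.stack([qc, np.ones_like(qc)], axis=1)
        scale, shift = np.linalg.lstsq(A, fc, rcond=None)[0]
        params.append((scale, shift))
    return params

def apply_cat(q_out, labels, params):
    """Refine quantized outputs with the cluster-specific affine maps."""
    out = np.empty_like(q_out, dtype=float)
    for c, (scale, shift) in enumerate(params):
        out[labels == c] = scale * q_out[labels == c] + shift
    return out
```

Because the identity map (scale 1, shift 0) lies inside the affine family, the least-squares fit can never increase the per-cluster error, which is consistent with the paper's framing of CAT as a pure error-reduction step that adds only a negligible number of parameters (two scalars per cluster in this sketch).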
Problem

Research questions and friction points this paper is trying to address.

Reducing accuracy loss in low-bit post-training quantization
Improving output alignment between quantized and full-precision models
Enhancing quantization performance without fine-tuning parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Cluster-based affine transformation reduces quantization error
Employs cluster-specific parameters for low-bit quantization
Enhances PTQ without fine-tuning, adding only a negligible number of parameters