Explaining How Quantization Disparately Skews a Model

📅 2025-09-08
🤖 AI Summary
Post-training quantization (PTQ) exacerbates model unfairness toward minority groups, manifesting as reduced logit variance, increased group-wise loss, and degraded accuracy. Method: We provide the first optimization-theoretic analysis of PTQ’s disparate impact across groups—specifically, its differential effects on gradient norms and Hessian eigenvalue spectra—and elucidate a cascading fairness degradation mechanism induced by weight and activation perturbations. Building on this, we propose a fairness-aware quantization framework integrating mixed-precision quantization-aware training (QAT), group-aware data resampling, and weighted loss optimization. Contribution/Results: Leveraging Hessian-based theoretical analysis and extensive experiments, our method significantly narrows inter-group performance gaps while preserving overall accuracy, thereby enhancing fairness in quantized model deployment.

📝 Abstract
Post-Training Quantization (PTQ) is widely adopted for its high compression ratio and speed with minimal impact on overall accuracy. However, we observe that quantization exacerbates disparate impacts, especially for minority groups. Our analysis explains how quantization sets off a chain of factors that produce disparate impact across groups during the forward and backward passes. We explore how quantization-induced changes in weights and activations cascade through the network, resulting in logits with lower variance, increased loss, and compromised group accuracies. We extend our study to verify the influence of these effects on group gradient norms and the eigenvalues of the Hessian matrix, providing insight into the state of the network from an optimization point of view. To mitigate these effects, we propose integrating mixed-precision Quantization-Aware Training (QAT) with dataset sampling methods and weighted loss functions, thereby enabling fairer deployment of quantized neural networks.
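The cascade the abstract describes starts with a small weight perturbation that propagates into the logits. A minimal NumPy sketch of uniform symmetric weight-only PTQ on a toy linear head (all shapes, values, and the `quantize` helper are illustrative, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits=4):
    # Uniform symmetric post-training quantization: snap weights onto a
    # (2**bits - 1)-level grid scaled to the tensor's max magnitude.
    scale = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    return np.round(w / scale) * scale

# Toy linear head: logits = x @ W.
W = rng.normal(0.0, 0.1, size=(64, 10))
x = rng.normal(size=(256, 64))

logits_fp = x @ W                    # full-precision logits
logits_q = x @ quantize(W, bits=4)   # logits after weight quantization

# The weight perturbation propagates to every logit; the paper traces how
# such cascaded perturbations lower logit variance and raise group-wise loss.
print("mean |logit shift|:", np.mean(np.abs(logits_q - logits_fp)))
print("logit variance (fp vs quantized):", np.var(logits_fp), np.var(logits_q))
```

Comparing the two logit sets per demographic group, rather than in aggregate as here, is what surfaces the disparate impact the paper analyzes.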
Problem

Research questions and friction points this paper is trying to address.

Quantization exacerbates disparate impacts on minority groups
Changes in weights and activations reduce logit variance and increase loss
Quantization compromises group accuracies and affects gradient optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Mixed-precision QAT with group-aware dataset resampling
Weighted loss functions for fairness
Analysis of group gradient norms and Hessian eigenvalue spectra
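The weighted-loss idea above can be sketched as a group-weighted cross-entropy, where each sample's loss is scaled by a weight attached to its group (the function name, the toy batch, and the weight values are hypothetical, for illustration only):

```python
import numpy as np

def group_weighted_ce(logits, labels, groups, group_weights):
    # Numerically stable log-softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(labels)), labels]
    w = group_weights[groups]  # per-sample weight looked up by group id
    return float(np.sum(w * nll) / np.sum(w))

# Illustrative batch: two samples from group 0 (majority), one from group 1
# (minority); the minority sample is the one the model gets wrong.
logits = np.array([[2.0, 0.0], [1.5, 0.5], [0.0, 2.0]])
labels = np.array([0, 0, 0])
groups = np.array([0, 0, 1])

uniform = group_weighted_ce(logits, labels, groups, np.array([1.0, 1.0]))
upweighted = group_weighted_ce(logits, labels, groups, np.array([1.0, 3.0]))
print(uniform, upweighted)
```

Upweighting the minority group raises the cost of its errors during QAT, pushing the optimizer to close the inter-group gap rather than minimize average loss alone.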