On Low-Bit Quantization Errors in Speaker Verification: Diagnostic and Mitigation

📅 2026-06-06

🤖 AI Summary

This work addresses the significant performance degradation commonly observed in speaker verification under low-bit quantization, a phenomenon whose underlying mechanisms remain poorly understood. Through layer-wise weight analysis and score-level error tracing, the study reveals, for the first time, a performance knee point at 2-bit precision, with harmful decision flips concentrated near the FP32 decision threshold. Building on these insights, the authors propose a calibrated multi-precision cascade that resolves most trials at 2 bits and raises quantization precision only for ambiguous trials, combining uniform K-means quantization-aware training with an efficient inference mechanism. This approach substantially reduces compute and memory costs while keeping verification accuracy close to that of FP32 models, enabling efficient and reliable low-bit speaker verification.
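The score-level error tracing mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function name `trace_score_errors`, the accept rule `score >= threshold`, and the definition of a harmful flip (the FP32 decision was correct and the quantized decision is wrong) are assumptions made for illustration, and the scores are toy values.

```python
def trace_score_errors(fp32_scores, quant_scores, labels, threshold):
    """Compare FP32 and quantized decisions trial by trial.

    labels: 1 for a target (same-speaker) trial, 0 for a non-target trial.
    Assumption: a trial is accepted when its score is >= threshold.
    """
    harmful = 0      # FP32 was right, quantized model is wrong
    helpful = 0      # FP32 was wrong, quantized model is right
    total_drift = 0.0
    for s_fp, s_q, y in zip(fp32_scores, quant_scores, labels):
        total_drift += abs(s_q - s_fp)
        fp_ok = (s_fp >= threshold) == bool(y)
        q_ok = (s_q >= threshold) == bool(y)
        if fp_ok and not q_ok:
            harmful += 1
        elif q_ok and not fp_ok:
            helpful += 1
    return {
        "harmful_flips": harmful,
        "helpful_flips": helpful,
        "mean_abs_drift": total_drift / len(labels),
    }


# Toy scores (hypothetical): the two trials near the threshold flip.
fp32 = [0.80, 0.52, 0.48, 0.10]
low_bit = [0.75, 0.47, 0.55, 0.12]
labels = [1, 1, 0, 0]
report = trace_score_errors(fp32, low_bit, labels, threshold=0.5)
```

In this toy data the drift is small everywhere, yet both trials whose FP32 scores sit close to the threshold change decision, which is the pattern the summary describes.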

📝 Abstract

Although low-bit quantization provides a practical means to deploy speaker verification on resource-constrained devices, its effects on speaker verification performance remain poorly understood. In this paper, we study uniform K-means quantization-aware training of ResNet-36 and ResNet-200 through joint layer-wise and score-level analyses. Our layer-wise analysis highlights fragile components and shows that score degradation is not fully explained by weight distortion alone. We identify a clear knee point at 2 bits, with larger score drift and harmful decision flips concentrated near the FP32 threshold. Our score-level analysis reveals where and how score errors emerge under extreme quantization. Building on these findings, we propose a calibrated multi-precision cascade that resolves most trials at 2 bits and escalates only ambiguous cases, achieving performance close to FP32 while preserving the efficiency benefits of low-bit inference with substantially lower compute and memory costs.
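The cascade idea in the abstract can be sketched as follows. This is a hedged illustration, not the paper's implementation: the function `cascade_verify`, the per-stage ambiguity margins, the choice of a 4-bit middle stage, and the assumption that all stages produce scores already calibrated to a common threshold are hypothetical, and the scorers are stand-in lookups in place of real quantized models.

```python
def cascade_verify(trial, stages, threshold, margins):
    """Decide a trial with the cheapest stage that is confident enough.

    stages:  list of (bits, score_fn), ordered from lowest to highest precision.
    margins: one half-width per non-final stage; a score within
             `margin` of the threshold counts as ambiguous (assumption).
    Returns (accept, bits_used). The last stage always decides.
    """
    for (bits, score_fn), margin in zip(stages[:-1], margins):
        score = score_fn(trial)
        if abs(score - threshold) > margin:
            return score >= threshold, bits   # confident: stop here
    bits, score_fn = stages[-1]
    return score_fn(trial) >= threshold, bits


# Stand-in scorers (hypothetical values, not real model outputs).
scores_2bit = {"a": 0.90, "b": 0.52, "c": 0.45}
scores_4bit = {"a": 0.88, "b": 0.30, "c": 0.55}
scores_fp32 = {"a": 0.89, "b": 0.31, "c": 0.51}
stages = [
    (2, scores_2bit.get),
    (4, scores_4bit.get),
    (32, scores_fp32.get),
]

decisions = {
    t: cascade_verify(t, stages, threshold=0.5, margins=[0.1, 0.1])
    for t in ("a", "b", "c")
}
```

Here trial "a" is resolved at 2 bits, "b" escalates once, and only "c" reaches full precision, so the average cost stays near the 2-bit cost when most trials are unambiguous. How the ambiguity band and calibration are actually chosen is not specified in the abstract.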

Problem

Research questions and friction points this paper is trying to address.

low-bit quantization

speaker verification

quantization error

score degradation

decision flip

Innovation

Methods, ideas, or system contributions that make the work stand out.

low-bit quantization

speaker verification

quantization-aware training