🤖 AI Summary
Existing evaluations of low-bit quantization (e.g., INT4/INT2) on large language models (LLMs) lack fine-grained analysis of its impact on mathematical reasoning, in particular the distinction between numerical computation and reasoning planning capabilities.
Method: We propose the first multi-dimensional evaluation framework specifically for quantization effects on mathematical reasoning, decoupling these two core capabilities. Our approach integrates layer-wise sensitivity analysis, step-level reasoning trajectory comparison, and quantitative tracking across capability dimensions.
Contribution/Results: Experiments on benchmarks such as MATH reveal that reasoning planning degrades significantly (up to −38%), whereas numerical computation remains comparatively robust. Critical vulnerability points are identified in intermediate attention layers and MLP output representations. The framework provides an interpretable, capability-aware diagnostic tool for quantization-robustness optimization, enabling targeted mitigation strategies for mathematically demanding tasks.
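The layer-wise sensitivity analysis mentioned in the method can be sketched as follows: quantize one layer's weights at a time and measure how much the model's output deviates from the full-precision baseline. This is a minimal illustration, not the paper's actual procedure; `quantize_int4`, the toy MLP, and the relative-error metric are all hypothetical stand-ins.

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-tensor INT4 round trip (hypothetical helper).

    Weights are scaled into the signed 4-bit range, rounded, and
    dequantized, simulating the precision loss of INT4 storage.
    """
    scale = np.abs(w).max() / 7.0  # map max magnitude to 7 (INT4: [-8, 7])
    if scale == 0:
        return w.copy()
    q = np.clip(np.round(w / scale), -8, 7)
    return q * scale

def forward(layers, x):
    """Toy stand-in for an LLM: a small MLP with ReLU activations."""
    h = x
    for w in layers:
        h = np.maximum(h @ w, 0)
    return h

def layerwise_sensitivity(layers, x):
    """Quantize one layer at a time; return relative output error per layer."""
    baseline = forward(layers, x)
    sens = []
    for i in range(len(layers)):
        perturbed = [quantize_int4(w) if j == i else w
                     for j, w in enumerate(layers)]
        out = forward(perturbed, x)
        sens.append(np.linalg.norm(out - baseline)
                    / (np.linalg.norm(baseline) + 1e-12))
    return sens

rng = np.random.default_rng(0)
layers = [rng.standard_normal((8, 8)) * 0.3 for _ in range(3)]
x = rng.standard_normal((4, 8))
print(layerwise_sensitivity(layers, x))
```

Layers whose quantization produces the largest relative error would be flagged as vulnerability points; the paper's analysis suggests intermediate attention layers and MLP outputs play this role in real models.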
📝 Abstract
Large language models have achieved significant advances on complex mathematical reasoning benchmarks such as MATH. However, their substantial computational requirements present challenges for practical deployment. Model quantization has emerged as an effective strategy to reduce memory usage and computational cost by employing lower-precision, lower-bit-width representations. In this study, we systematically evaluate the impact of quantization on mathematical reasoning tasks. We introduce a multidimensional evaluation framework that qualitatively assesses specific capability dimensions, and we conduct quantitative analyses of the step-by-step outputs produced under various quantization methods. Our results demonstrate that quantization differentially affects numerical computation and reasoning planning abilities, identifying key areas where quantized models experience performance degradation.
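The step-by-step output comparison described in the abstract can be sketched as a diff between the full-precision and quantized models' reasoning traces: find the first step where the traces diverge and score their overall similarity. This is a minimal sketch of the idea; `step_divergence`, the example traces, and the similarity metric are hypothetical, not the paper's actual evaluation code.

```python
import difflib

def step_divergence(full_precision_steps, quantized_steps):
    """Return (index of first diverging step, overall trace similarity).

    Hypothetical metric: steps are compared verbatim, and similarity is
    difflib's ratio over the concatenated traces.
    """
    first_diff = next(
        (i for i, (a, b) in enumerate(zip(full_precision_steps,
                                          quantized_steps)) if a != b),
        min(len(full_precision_steps), len(quantized_steps)),
    )
    similarity = difflib.SequenceMatcher(
        None,
        "\n".join(full_precision_steps),
        "\n".join(quantized_steps),
    ).ratio()
    return first_diff, similarity

fp = ["Let x be the unknown.", "Then 2x + 3 = 11.", "So x = 4."]
q  = ["Let x be the unknown.", "Then 2x + 3 = 11.", "So x = 7."]
print(step_divergence(fp, q))  # first divergence at step index 2
```

A capability-aware analysis would then classify each diverging step, e.g. as a numerical-computation error versus a planning error, to attribute degradation to a specific capability dimension.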