Standalone 16-bit Neural Network Training: Missing Study for Hardware-Limited Deep Learning Practitioners

📅 2023-05-18
📈 Citations: 1
Influential: 0
🤖 AI Summary
Whether 16-bit floating-point (FP16) arithmetic alone can robustly support end-to-end neural network training under hardware resource constraints remains a long-standing, systematically unverified question. Method: This paper introduces the first unified framework for classification-tolerant error analysis, integrating theoretical floating-point error modeling with large-scale empirical validation, and implements fully FP16 forward and backward propagation without any FP32 "master weights" or loss scaling. Contribution/Results: Evaluation on CIFAR-10, CIFAR-100, and ImageNet shows that pure FP16 training matches FP32 and mixed-precision baselines in accuracy (deviation ≤ ±0.1%) while accelerating training by 1.8×; the approach runs on mainstream GPUs without code modification. This work bridges a critical gap between the theoretical guarantees and the practical feasibility of low-precision training.
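To see why training without FP32 "master weights" is usually considered risky, consider FP16 rounding at the weight-update step: any update smaller than half the gap between adjacent FP16 values is rounded away entirely. The toy sketch below (not the paper's code; it uses only Python's stdlib `struct` half-precision format) demonstrates this effect.

```python
import struct

def fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE-754 binary16 value."""
    return struct.unpack('e', struct.pack('e', x))[0]

# The spacing between FP16 values just above 1.0 is 2**-10 ~= 9.77e-4,
# so an update smaller than half that gap is rounded away: the weight stalls.
w = fp16(1.0)
tiny_update = 1e-4
assert fp16(w + tiny_update) == w   # update swallowed in pure FP16

# A sufficiently large update does register.
big_update = 1e-2
assert fp16(w + big_update) != w
```

This swallowed-update effect is the standard argument for keeping FP32 master weights; the paper's claim is that, for classification, such errors stay within a tolerance that leaves final accuracy unaffected.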
📝 Abstract
With the increasing complexity of machine learning models, managing computational resources like memory and processing power has become a critical concern. Mixed precision techniques, which leverage different numerical precisions during model training and inference to optimize resource usage, have been widely adopted. However, access to hardware that supports lower precision formats (e.g., FP8 or FP4) remains limited, especially for practitioners with hardware constraints. For many with limited resources, the available options are restricted to using 32-bit, 16-bit, or a combination of the two. While it is commonly believed that 16-bit precision can achieve results comparable to full (32-bit) precision, this study is the first to systematically validate this assumption through both rigorous theoretical analysis and extensive empirical evaluation. Our theoretical formalization of floating-point errors and classification tolerance provides new insights into the conditions under which 16-bit precision can approximate 32-bit results. This study fills a critical gap, proving for the first time that standalone 16-bit precision neural networks match 32-bit and mixed-precision in accuracy while boosting computational speed. Given the widespread availability of 16-bit across GPUs, these findings are especially valuable for machine learning practitioners with limited hardware resources to make informed decisions.
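The abstract's notion of classification tolerance can be sketched concretely: per-value FP16 rounding error is bounded by roughly a relative 2**-11, so whenever the margin between the top two logits exceeds the accumulated rounding error, quantizing the logits to FP16 cannot flip the argmax. The logit values below are invented for illustration, using only Python's stdlib half-precision format.

```python
import struct

def fp16(x: float) -> float:
    """Round to nearest IEEE-754 binary16 via the stdlib 'e' format."""
    return struct.unpack('e', struct.pack('e', x))[0]

def argmax(xs):
    return max(range(len(xs)), key=xs.__getitem__)

# Hypothetical logits: the top-2 margin (2.37 - 2.31 = 0.06) dwarfs the
# FP16 rounding error at this magnitude (about 2**-9 ~= 2e-3 absolute).
logits32 = [2.37, -0.52, 2.31, 0.94]
logits16 = [fp16(v) for v in logits32]

assert argmax(logits16) == argmax(logits32)  # predicted class unchanged
```

This is the intuition behind "classification-tolerant" error analysis: classification accuracy depends only on the argmax, which is robust to bounded numerical error whenever logit margins are large enough.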
Problem

Research questions and friction points this paper is trying to address.

Validate 16-bit precision matches 32-bit accuracy
Explore conditions for 16-bit precision approximation
Enable efficient neural network training on limited hardware
Innovation

Methods, ideas, or system contributions that make the work stand out.

Standalone 16-bit precision matches 32-bit accuracy
Theoretical analysis of floating-point errors and tolerance
Empirical validation boosts computational speed significantly
Juyoung Yun
Department of Computer Science, Stony Brook University
Sol Choi
Francois Rameau
Department of Computer Science, State University of New York, Korea
Byungkon Kang
Department of Computer Science, State University of New York, Korea
Zhoulai Fu
State University of New York (SUNY), Korea
Formal Methods · Automated Testing · Scientific Computing · Machine Learning