Exploration of Unary Arithmetic-Based Matrix Multiply Units for Low Precision DL Accelerators

📅 2024-07-01
🏛️ IEEE Computer Society Annual Symposium on VLSI
📈 Citations: 3
Influential: 0
🤖 AI Summary
This work evaluates the energy efficiency and performance potential of unary arithmetic for matrix multiplication (GEMM) in low-precision deep learning accelerators. It presents the first rigorous post-synthesis hardware assessment of three state-of-the-art unary GEMM architectures—uGEMM, tuGEMM, and tubGEMM—systematically analyzing their behavior across varying bit widths, matrix dimensions, and realistic weight sparsity patterns from actual models, including CNNs and LLaMA2. The results demonstrate that, under specific configurations, unary GEMM can significantly outperform conventional binary designs, offering a promising high-efficiency computing paradigm for edge AI inference and clearly delineating its optimal application scenarios.

📝 Abstract
General matrix multiplication (GEMM) is a fundamental operation in deep learning (DL). With DL moving increasingly toward low precision, recent works have proposed novel unary GEMM designs as an alternative to conventional binary GEMM hardware. A rigorous evaluation of recent unary and binary GEMM designs is needed to assess the potential of unary hardware for future DL compute. This paper focuses on unary GEMM designs for integer-based DL inference and performs a detailed evaluation of three of the latest unary design proposals, namely uGEMM, tuGEMM, and tubGEMM, by comparing them to a conventional binary GEMM. Rigorous post-synthesis evaluations beyond prior works are performed across varying bit widths and matrix sizes to assess the designs' tradeoffs and determine optimal sweet spots. Further, we perform weight sparsity analysis across eight pretrained convolutional neural networks (CNNs) and the LLaMA2 large language model (LLM). In this work we demonstrate how unary GEMM can be effectively used for energy-efficient compute in future edge AI accelerators.
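As a rough illustration of the unary paradigm these designs build on (a hedged sketch, not the paper's actual hardware schemes): in a temporal-unary encoding, an integer v is represented as a bitstream of v consecutive 1s, and multiplication reduces to counting pulse coincidences rather than shifting and adding partial products. A minimal Python sketch under that assumed encoding:

```python
def to_unary(v, width):
    """Temporal-unary encode integer v as a bitstream of `width` slots:
    v ones followed by (width - v) zeros."""
    return [1] * v + [0] * (width - v)

def unary_multiply(a, b, width):
    """Multiply two unary-encoded values by counting: each pulse of
    stream `a` gates in all pulses of stream `b` (a serial count scheme)."""
    sa, sb = to_unary(a, width), to_unary(b, width)
    acc = 0
    for bit_a in sa:
        if bit_a:
            acc += sum(sb)  # each 1 in `a` contributes b pulses
    return acc

print(unary_multiply(3, 5, 8))  # 15 == 3 * 5
```

Such counter-based datapaths avoid binary multipliers entirely, which is the source of the area and energy savings the paper evaluates; the tradeoff is latency that grows with the encoded values, which is why sparsity and low bit widths matter.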
Problem

Research questions and friction points this paper is trying to address.

unary GEMM
low precision
DL accelerators
matrix multiplication
energy-efficient compute
Innovation

Methods, ideas, or system contributions that make the work stand out.

unary arithmetic
GEMM
low-precision DL accelerators
energy-efficient computing
post-synthesis evaluation
Prabhu Vellaisamy
Electrical and Computer Engineering Department, Carnegie Mellon University
Harideep Nair
Electrical and Computer Engineering Department, Carnegie Mellon University
Di Wu
University of Central Florida
computer architecture, domain-specific acceleration, emerging computing, fun research, gpu system
Shawn Blanton
Electrical and Computer Engineering Department, Carnegie Mellon University
John Paul Shen
Carnegie Mellon University
Computer Architecture