Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification

📅 2025-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying Transformer models for time-series classification on edge devices is hindered by high energy consumption and computational overhead. Method: the paper proposes an energy-efficiency-oriented co-optimization framework, presenting the first systematic study of the combined effect of L1-norm structured pruning and 8-bit static quantization on time-series Transformers. The approach integrates structured pruning, static quantization, and architecture-level lightweight adaptation, and is evaluated on three benchmark datasets: RefrigerationDevices, ElectricDevices, and PLAID. Results: static quantization reduces inference energy consumption by 29.14% with no accuracy loss, and L1 pruning improves inference speed by 1.63% with negligible accuracy degradation (≤0.17%). Combined, the two techniques substantially improve the energy-efficiency ratio, demonstrating the method's feasibility and practicality for resource-constrained edge environments.
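The paper's exact pipeline is not reproduced here, but the two core operations it combines can be sketched in a few lines. The snippet below is a minimal, framework-free illustration (function names and the toy weight matrix are ours, not the authors'): L1-norm structured pruning zeroes whole output rows with the smallest L1 norms, and symmetric 8-bit quantization maps float weights to int8 with a single scale factor.

```python
import numpy as np

def l1_structured_prune(weight, amount=0.5):
    """Zero out the rows (output channels) with the smallest L1 norms.

    Structured pruning removes whole rows rather than scattered weights,
    which is why it can translate into real inference speedups.
    """
    norms = np.abs(weight).sum(axis=1)        # L1 norm of each row
    k = int(round(amount * weight.shape[0]))  # number of rows to prune
    pruned = weight.copy()
    if k > 0:
        idx = np.argsort(norms)[:k]           # indices of smallest-norm rows
        pruned[idx, :] = 0.0
    return pruned

def quantize_int8(weight):
    """Symmetric per-tensor 8-bit quantization: one scale for the tensor."""
    scale = np.abs(weight).max() / 127.0
    q = np.clip(np.round(weight / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([[0.10, -0.20],
              [2.00,  1.50],
              [0.05,  0.00]], dtype=np.float32)
wp = l1_structured_prune(w, amount=1 / 3)     # prunes the lowest-L1 row
q, s = quantize_int8(wp)
w_hat = q.astype(np.float32) * s              # dequantized approximation
```

In a real deployment these steps would be applied with a framework's pruning and static-quantization utilities (which also calibrate activation ranges), but the arithmetic above is the essence of both techniques.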

📝 Abstract
The increasing computational demands of transformer models in time series classification necessitate effective optimization strategies for energy-efficient deployment. This paper presents a systematic investigation of optimization techniques, focusing on structured pruning and quantization methods for transformer architectures. Through extensive experimentation on three distinct datasets (RefrigerationDevices, ElectricDevices, and PLAID), we quantitatively evaluate model performance and energy efficiency across different transformer configurations. Our experimental results demonstrate that static quantization reduces energy consumption by 29.14% while maintaining classification performance, and L1 pruning achieves a 1.63% improvement in inference speed with minimal accuracy degradation. These findings provide valuable insights into the effectiveness of optimization strategies for transformer-based time series classification, establishing a foundation for efficient model deployment in resource-constrained environments.
Problem

Research questions and friction points this paper is trying to address.

Optimize transformer energy efficiency
Enhance time series classification
Implement pruning and quantization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured pruning enhances inference speed
Static quantization reduces energy consumption
Optimization maintains classification performance efficiently
Arshia Kermani
Department of Computer Science, Texas State University
Ehsan Zeraatkar
Department of Computer Science, Texas State University
Machine Learning · Deep Learning · Computer Vision · Artificial Intelligence
Habib Irani
Department of Computer Science, Texas State University