Energy-Efficient Transformer Inference: Optimization Strategies for Time Series Classification

📅 2025-02-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deploying Transformer models for time-series classification on edge devices is hindered by high energy consumption and computational overhead. Method: the paper proposes an energy-efficiency-oriented co-optimization framework, presenting the first systematic study of the combined effect of L1-norm structured pruning and 8-bit static quantization on time-series Transformers. The approach integrates structured pruning, static quantization, and architecture-level lightweight adaptation, and is evaluated on three benchmark datasets: RefrigerationDevices, ElectricDevices, and PLAID. Results: static quantization reduces inference energy consumption by 29.14% with no accuracy loss, and L1 pruning improves inference speed by 1.63% with negligible accuracy degradation (≤0.17%). Combined, the two techniques substantially improve the energy-efficiency ratio, demonstrating the method's feasibility and practicality for resource-constrained edge environments.
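The paper's exact pipeline is not reproduced here, but the two core operations it combines can be sketched in a few lines. The snippet below is a minimal, framework-free illustration (function names and the toy weight matrix are ours, not the authors'): L1-norm structured pruning zeroes whole output rows with the smallest L1 norms, and symmetric 8-bit quantization maps float weights to int8 with a single scale factor.

```python
import numpy as np

def l1_structured_prune(weight, amount=0.5):
    """Zero out the rows (output channels) with the smallest L1 norms.

    Structured pruning removes whole rows rather than scattered weights,
    which is why it can translate into real inference speedups.
    """
    norms = np.abs(weight).sum(axis=1)        # L1 norm of each row
    k = int(round(amount * weight.shape[0]))  # number of rows to prune
    pruned = weight.copy()
    if k > 0:
        idx = np.argsort(norms)[:k]           # indices of smallest-norm rows
        pruned[idx, :] = 0.0
    return pruned

def quantize_int8(weight):
    """Symmetric per-tensor 8-bit quantization: one scale for the tensor."""
    scale = np.abs(weight).max() / 127.0
    q = np.clip(np.round(weight / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([[0.10, -0.20],
              [2.00,  1.50],
              [0.05,  0.00]], dtype=np.float32)
wp = l1_structured_prune(w, amount=1 / 3)     # prunes the lowest-L1 row
q, s = quantize_int8(wp)
w_hat = q.astype(np.float32) * s              # dequantized approximation
```

In a real deployment these steps would be applied with a framework's pruning and static-quantization utilities (which also calibrate activation ranges), but the arithmetic above is the essence of both techniques.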

📝 Abstract
The increasing computational demands of transformer models in time series classification necessitate effective optimization strategies for energy-efficient deployment. This paper presents a systematic investigation of optimization techniques, focusing on structured pruning and quantization methods for transformer architectures. Through extensive experimentation on three distinct datasets (RefrigerationDevices, ElectricDevices, and PLAID), we quantitatively evaluate model performance and energy efficiency across different transformer configurations. Our experimental results demonstrate that static quantization reduces energy consumption by 29.14% while maintaining classification performance, and L1 pruning achieves a 1.63% improvement in inference speed with minimal accuracy degradation. These findings provide valuable insights into the effectiveness of optimization strategies for transformer-based time series classification, establishing a foundation for efficient model deployment in resource-constrained environments.
Problem

Research questions and friction points this paper is trying to address.

Optimize transformer energy efficiency
Enhance time series classification
Implement pruning and quantization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured pruning enhances inference speed
Static quantization reduces energy consumption
Optimization maintains classification performance efficiently
Arshia Kermani
Department of Computer Science, Texas State University
Ehsan Zeraatkar
Department of Computer Science, Texas State University
Machine Learning · Deep Learning · Computer Vision · Artificial Intelligence
Habib Irani
Department of Computer Science, Texas State University