Comprehensive and Efficient Distillation for Lightweight Sentiment Analysis Models

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address two limitations of knowledge distillation for lightweight sentiment analysis, namely insufficient knowledge coverage caused by human-dependent instruction design and the prohibitively high computational cost of processing large-scale user texts, this paper proposes COMPEFFDIST, a comprehensive and efficient distillation framework. Methodologically, it introduces (1) an attribute-based automatic instruction construction mechanism that improves instruction diversity and semantic coverage, and (2) a difficulty-based data filtering module that discards easy, redundant examples before distillation. The method is applied across multiple teacher model series (Llama-3, Qwen-3, and Gemma-3). Experimental results show that the distilled 3B student models match 20x larger teacher models on most mainstream sentiment analysis benchmarks, and that the approach reaches the full-data baseline's performance with only 10% of the data, exhibiting strong generalization and high deployment efficiency.
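The difficulty-based filtering module can be sketched as follows: score each candidate text by how uncertain the lightweight student model already is about it, and forward only the hardest fraction to the expensive teacher. This is an illustrative reconstruction, not the paper's implementation; the entropy-based score and the `student_predict` interface are assumptions, since this page does not detail the actual filtering criterion.

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def filter_by_difficulty(texts, student_predict, keep_ratio=0.1):
    """Keep the fraction of texts the student is least confident about.

    `student_predict(text)` stands in for the student model's predicted
    label distribution. Only the retained subset would then need to be
    processed by the (much larger) teacher model.
    """
    scored = sorted(texts, key=lambda t: entropy(student_predict(t)), reverse=True)
    k = max(1, int(len(scored) * keep_ratio))
    return scored[:k]
```

Under this sketch, a `keep_ratio` of 0.1 is what would correspond to the reported result of matching full-data performance while distilling on only 10% of the data.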

📝 Abstract
Recent efforts leverage knowledge distillation techniques to develop lightweight and practical sentiment analysis models. These methods are grounded in human-written instructions and large-scale user texts. Despite the promising results, two key challenges remain: (1) manually written instructions are limited in diversity and quantity, making them insufficient to ensure comprehensive coverage of distilled knowledge; (2) large-scale user texts incur high computational cost, hindering the practicality of these methods. To this end, we introduce COMPEFFDIST, a comprehensive and efficient distillation framework for sentiment analysis. Our framework consists of two key modules: attribute-based automatic instruction construction and difficulty-based data filtering, which correspondingly tackle the aforementioned challenges. Applying our method across multiple model series (Llama-3, Qwen-3, and Gemma-3), we enable 3B student models to match the performance of 20x larger teacher models on most tasks. In addition, our approach greatly outperforms baseline methods in data efficiency, attaining the same performance level with only 10% of the data.
Problem

Research questions and friction points this paper is trying to address.

Addresses limited diversity in manual sentiment analysis instructions
Reduces computational costs of large-scale user text processing
Enables small models to match performance of larger teacher models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Attribute-based automatic instruction construction for knowledge distillation
Difficulty-based data filtering to reduce computational costs
Distillation applied across multiple teacher model series (Llama-3, Qwen-3, Gemma-3)
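As a rough sketch of what attribute-based automatic instruction construction might look like, one can define attribute pools (task type, text domain, output format) and instantiate an instruction template over their combinations, replacing hand-written prompts with systematic coverage of the attribute space. The specific attributes and template below are hypothetical, since the page does not enumerate the paper's actual attribute set.

```python
import itertools
import random

# Hypothetical attribute pools; the paper's real attribute taxonomy is not
# given on this page, so these values are purely illustrative.
ATTRIBUTES = {
    "task": ["sentiment polarity classification",
             "aspect-based sentiment analysis",
             "emotion detection"],
    "domain": ["product reviews", "social media posts", "news comments"],
    "style": ["a single label", "a label plus a one-sentence rationale"],
}

TEMPLATE = ("Perform {task} on the following {domain}. "
            "Respond with {style}.\n\nText: {{text}}")

def generate_instructions(n, seed=0):
    """Enumerate attribute combinations, shuffle, and fill the template."""
    combos = list(itertools.product(*ATTRIBUTES.values()))
    random.Random(seed).shuffle(combos)
    keys = list(ATTRIBUTES)
    return [TEMPLATE.format(**dict(zip(keys, combo))) for combo in combos[:n]]
```

Each generated instruction keeps a `{text}` slot to be filled with a user text before querying the teacher; varying attributes rather than hand-writing each prompt is what broadens instruction diversity and coverage.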
👥 Authors

Guangyu Xie
Harbin Institute of Technology, Shenzhen, China

Yice Zhang
Harbin Institute of Technology, Shenzhen, China

Jianzhu Bao
Nanyang Technological University
NLP, Computational Argumentation, Large Language Models, Sentiment Analysis

Qianlong Wang
Harbin Institute of Technology, Shenzhen, China

Yang Sun
Harbin Institute of Technology, Shenzhen, China

Bingbing Wang
Harbin Institute of Technology, Shenzhen
Natural Language Processing

Ruifeng Xu
Professor, Harbin Institute of Technology at Shenzhen
Natural Language Processing, Affective Computing, Argumentation Mining, LLMs, Bioinformatics