Multi-Attribute Multi-Grained Adaptation of Pre-Trained Language Models for Text Understanding from Bayesian Perspective

📅 2025-03-08
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
The impact mechanism of non-IID text data on pre-trained language models (PLMs) remains poorly understood. Method: This paper introduces, for the first time, a Bayesian inference framework for modeling PLM adaptability under non-IID conditions, proposing the Multi-Attribute Multi-Granularity Adaptive (M2A) framework. M2A jointly models semantic, stylistic, and domain-specific attributes alongside lexical, sentential, and document-level granularities, leveraging hierarchical attention, gradient-isolated fine-tuning, and uncertainty calibration to achieve lightweight, uncertainty-aware adaptation. Contribution/Results: Extensive experiments demonstrate that M2A significantly outperforms strong baselines across mainstream text understanding tasks. Under implicit non-IID settings and with large-parameter PLMs, it yields consistent gains of 2.1–4.7 percentage points while increasing inference overhead by less than 3%.

Technology Category

Application Category

📝 Abstract
Current neural networks often employ multi-domain-learning or attribute-injecting mechanisms to incorporate non-independent and identically distributed (non-IID) information for text understanding tasks by capturing individual characteristics and the relationships among samples. However, the extent of the impact of non-IID information and how these methods affect pre-trained language models (PLMs) remains unclear. This study revisits the assumption that non-IID information enhances PLMs to achieve performance improvements from a Bayesian perspective, which unearths and integrates non-IID and IID features. Furthermore, we proposed a multi-attribute multi-grained framework for PLM adaptations (M2A), which combines multi-attribute and multi-grained views to mitigate uncertainty in a lightweight manner. We evaluate M2A through prevalent text-understanding datasets and demonstrate its superior performance, mainly when data are implicitly non-IID, and PLMs scale larger.
Problem

Research questions and friction points this paper is trying to address.

Impact of non-IID information on pre-trained language models.
Bayesian perspective for integrating non-IID and IID features.
Multi-attribute multi-grained framework for PLM adaptation.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian perspective integrates non-IID and IID features
Multi-attribute multi-grained framework (M2A) for PLM adaptations
Lightweight method mitigates uncertainty in text understanding
🔎 Similar Papers
No similar papers found.
Y
You Zhang
School of Information Science and Engineering, Yunnan University, Yunnan, P.R.China
J
Jin Wang
School of Information Science and Engineering, Yunnan University, Yunnan, P.R.China
Liang-Chih Yu
Liang-Chih Yu
Yuan Ze university
Natural Language ProcessingSentiment AnalysisText MiningLearning Technology
D
Dan Xu
School of Information Science and Engineering, Yunnan University, Yunnan, P.R.China
X
Xuejie Zhang
School of Information Science and Engineering, Yunnan University, Yunnan, P.R.China