Information-Theoretic Complementary Prompts for Improved Continual Text Classification

📅 2025-05-27
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
To address catastrophic forgetting and cross-task knowledge transfer in continual text classification (CTC), this paper proposes a dual-prompt space architecture: private prompts (P-Prompts) capture task-specific knowledge, while shared prompts (S-Prompts) encode task-invariant representations. The authors introduce an information-theoretic complementary prompting mechanism, instantiated via two mutual-information maximization objectives: one to mitigate forgetting and the other to enhance forward transfer. The method integrates prompt learning, contrastive representation learning, and dual-stream parameterized prompt modeling, enabling sequential learning without data replay. Evaluated on multiple CTC benchmarks, the approach significantly outperforms state-of-the-art methods, effectively alleviating catastrophic forgetting and improving generalization to novel tasks.

๐Ÿ“ Abstract
Continual Text Classification (CTC) aims to continuously classify new text data over time while minimizing catastrophic forgetting of previously acquired knowledge. However, existing methods often focus on task-specific knowledge, overlooking the importance of shared, task-agnostic knowledge. Inspired by the complementary learning systems theory, which posits that humans learn continually through the interaction of two systems -- the hippocampus, responsible for forming distinct representations of specific experiences, and the neocortex, which extracts more general and transferable representations from past experiences -- we introduce Information-Theoretic Complementary Prompts (InfoComp), a novel approach for CTC. InfoComp explicitly learns two distinct prompt spaces: P(rivate)-Prompt and S(hared)-Prompt. These respectively encode task-specific and task-invariant knowledge, enabling models to sequentially learn classification tasks without relying on data replay. To promote more informative prompt learning, InfoComp uses an information-theoretic framework that maximizes mutual information between different parameters (or encoded representations). Within this framework, we design two novel loss functions: (1) to strengthen the accumulation of task-specific knowledge in P-Prompt, effectively mitigating catastrophic forgetting, and (2) to enhance the retention of task-invariant knowledge in S-Prompt, improving forward knowledge transfer. Extensive experiments on diverse CTC benchmarks show that our approach outperforms previous state-of-the-art methods.
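The page does not give the paper's exact loss functions, but a standard way to maximize a mutual information lower bound between two sets of representations (such as P-Prompt and S-Prompt encodings) is the InfoNCE estimator. The sketch below is a minimal, framework-free illustration of that general technique, not the paper's actual objective; the function name and the dot-product critic are assumptions.

```python
import math

def dot(u, v):
    """Dot product of two vectors given as lists of floats."""
    return sum(a * b for a, b in zip(u, v))

def info_nce_lower_bound(xs, ys, temperature=1.0):
    """InfoNCE variational lower bound on the mutual information I(X; Y).

    xs[i] and ys[i] form a positive pair (e.g. two representations of the
    same input); every other ys[j] serves as a negative. The bound is
    log K plus the mean log-softmax score of the positives, and is always
    at most log K.
    """
    k = len(xs)
    total = 0.0
    for i in range(k):
        # Critic scores of xs[i] against every candidate in ys.
        scores = [dot(xs[i], y) / temperature for y in ys]
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(scores)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        total += scores[i] - log_norm
    return math.log(k) + total / k
```

Maximizing this bound (e.g. by gradient ascent over the encoders producing `xs` and `ys`) pushes paired representations to score higher than mismatched ones, which is one common instantiation of the contrastive, mutual-information-based training the summary describes.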
Problem

Research questions and friction points this paper is trying to address.

Improves continual text classification by minimizing catastrophic forgetting
Learns task-specific and task-invariant knowledge using complementary prompts
Enhances knowledge transfer and retention via information-theoretic loss functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual prompt spaces for task-specific and invariant knowledge
Information-theoretic framework maximizes mutual information
Novel loss functions mitigate catastrophic forgetting
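Mechanically, a dual-prompt design of the kind described above amounts to prepending a shared prompt and a per-task private prompt to the input embeddings before they reach the frozen backbone. The helper below is a hypothetical, framework-free sketch of that assembly step (all names are illustrative; prompts and tokens are modeled as lists of embedding vectors):

```python
def assemble_prompted_input(task_id, private_prompts, shared_prompt, token_embeds):
    """Build the prompted input sequence [S-Prompt; P-Prompt(task); tokens].

    private_prompts: dict mapping task_id -> list of prompt vectors (task-specific)
    shared_prompt:   list of prompt vectors reused across all tasks (task-invariant)
    token_embeds:    list of embedding vectors for the input text
    """
    return list(shared_prompt) + list(private_prompts[task_id]) + list(token_embeds)
```

Because only the prompt vectors are trained while the backbone stays frozen, keeping private prompts per task isolates task-specific updates (limiting forgetting), while the shared prompt accumulates knowledge that transfers forward.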
🔎 Similar Papers
No similar papers found.