AI Summary
To address catastrophic forgetting in continual relation extraction (CRE), this paper proposes Error-Case-guided Instructional Contrastive Tuning (EC-ICT). EC-ICT is the first method to systematically leverage erroneous predictions on historical tasks, constructing an instruction-guided dual-objective fine-tuning framework: (i) instruction tuning to enhance relational discrimination, and (ii) contrastive learning to pull representations of correct samples closer while pushing those of erroneous ones farther apart. Integrated with memory replay and separate training on erroneous and correct samples, EC-ICT dynamically rectifies the model's cognitive biases. Evaluated on TACRED and FewRel, EC-ICT substantially outperforms existing state-of-the-art methods. The results demonstrate that error cases play a pivotal role in mitigating knowledge forgetting and bridging representational gaps between old and new tasks. This work establishes a novel paradigm for continual relation learning with large language models.
Abstract
Continual Relation Extraction (CRE) aims to continually learn newly emerging relations while avoiding catastrophic forgetting. Existing CRE methods mainly use memory replay and contrastive learning to mitigate catastrophic forgetting. However, these methods overlook error cases, which reveal the model's cognitive biases more directly than correct ones. To address this issue, we propose an instruction-based continual contrastive tuning approach for Large Language Models (LLMs) in CRE. Unlike existing CRE methods, which typically handle the training and memory data in a unified manner, our approach splits the training and memory data of each task into two parts based on the correctness of the model's initial responses and treats them differently through dual-task fine-tuning. In addition, leveraging LLMs' instruction-following ability, we propose a novel instruction-based contrastive tuning strategy that continuously corrects the model's current cognitive biases under the guidance of previous data in an instruction-tuning manner, narrowing the gap between old and new relations in a way better suited to LLMs. We experimentally evaluate our model on TACRED and FewRel, and the results show that it achieves new state-of-the-art CRE performance with significant improvements, demonstrating the value of explicitly exploiting error cases.
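The two core ideas of the abstract, partitioning each task's data by the correctness of the model's initial response and a contrastive objective that pulls correct samples together while pushing erroneous ones away, can be sketched as follows. This is a minimal illustration only: the function names (`split_by_correctness`, `info_nce`), the plain-list representations, and the InfoNCE-style loss are assumptions for exposition, not the paper's actual implementation.

```python
import math

def split_by_correctness(samples, predict):
    """Partition samples into (correct, erroneous) based on whether the
    model's initial prediction matches the gold relation label.
    `predict` stands in for an initial LLM inference pass (assumed API)."""
    correct, erroneous = [], []
    for s in samples:
        (correct if predict(s["text"]) == s["label"] else erroneous).append(s)
    return correct, erroneous

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss: small when the anchor is close to the
    positive (a correctly handled sample of the same relation) and far from
    the negatives (representations of erroneous samples)."""
    logits = [cosine(anchor, positive) / tau]
    logits += [cosine(anchor, n) / tau for n in negatives]
    m = max(logits)  # shift for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)
```

In a real pipeline the vectors would be LLM hidden states and `predict` an inference call; here small hand-made vectors are enough to see that the loss decreases as the anchor aligns with the correct sample and rises when it drifts toward an erroneous one.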