Federated Incremental Named Entity Recognition

📅 2024-11-18
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address dual heterogeneous forgetting—across clients and within clients—arising from dynamic entity-type expansion and client growth in Federated Named Entity Recognition (FNER), this paper proposes a novel Federated Incremental NER paradigm. To mitigate performance degradation on previously learned entity types, we design a Local-Global Forgetting Defense (LGFD) framework: (i) a structured knowledge distillation loss preserves historical knowledge; (ii) a pseudo-label-guided cross-type contrastive loss enhances type discriminability; and (iii) a privacy-preserving task-switching monitoring mechanism enables adaptive client participation. Integrating federated learning, incremental learning, knowledge distillation, and contrastive learning, our method achieves state-of-the-art performance across multiple standard benchmarks, improving joint accuracy on both old and newly introduced entities by an average of 4.2%. It is the first work to systematically resolve heterogeneous forgetting in dynamic FNER scenarios.

Technology Category

Application Category

📝 Abstract
Federated Named Entity Recognition (FNER) boosts model training within each local client by aggregating the model updates of decentralized local clients, without sharing their private data. However, existing FNER methods assume fixed entity types and local clients in advance, leading to their ineffectiveness in practical applications. In a more realistic scenario, local clients receive new entity types continuously, while new local clients collecting novel data may irregularly join the global FNER training. This challenging setup, referred to here as Federated Incremental NER, renders the global model suffering from heterogeneous forgetting of old entity types from both intra-client and inter-client perspectives. To overcome these challenges, we propose a Local-Global Forgetting Defense (LGFD) model. Specifically, to address intra-client forgetting, we develop a structural knowledge distillation loss to retain the latent space's feature structure and a pseudo-label-guided inter-type contrastive loss to enhance discriminative capability over different entity types, effectively preserving previously learned knowledge within local clients. To tackle inter-client forgetting, we propose a task switching monitor that can automatically identify new entity types under privacy protection and store the latest old global model for knowledge distillation and pseudo-labeling. Experiments demonstrate significant improvement of our LGFD model over comparison methods.
Problem

Research questions and friction points this paper is trying to address.

FNER lacks adaptability to new entity types and clients
Global model suffers from forgetting old entity types
Need privacy-preserving methods for incremental learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structural knowledge distillation for intra-client forgetting
Pseudo-label-guided contrastive loss for entity types
Task switching monitor for inter-client forgetting
🔎 Similar Papers
No similar papers found.
Duzhen Zhang
Duzhen Zhang
Institute of Automation, Chinese Academy of Sciences
Natural Language ProcessingMultimodalLarge Language ModelsContinual LearningAI4Science
Yahan Yu
Yahan Yu
Kyoto University
Multimodal LLMContinual Learning
C
Chenxing Li
Tencent, AI Lab, Beijing, China
J
Jiahua Dong
Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China
D
Dong Yu
Tencent, AI Lab, Bellevue, WA 98004 USA