Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning

📅 2024-04-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the high computational cost and the large number of negative sample spans incurred by token-level and span-level metric learning in few-shot named entity recognition (NER), this paper proposes MsFNER, a two-stage framework that first detects candidate entity spans and then classifies their types. The method uses a hybrid multi-stage decoder, adds an entity-aware contrastive learning module to sharpen the entity representations used for classification, and combines the classification model's predictions with KNN at inference time. The span detection model and the entity classification model are trained separately on the source domain via meta-learning, then finetuned on the support set of the target domain. In experiments on the FewNERD dataset, the authors report improvements over prior approaches.

📝 Abstract
Few-shot named entity recognition aims to identify new types of named entities from a few labeled examples. Previous methods employing token-level or span-level metric learning suffer from a heavy computational burden and a large number of negative sample spans. In this paper, we propose Hybrid Multi-stage Decoding for Few-shot NER with Entity-aware Contrastive Learning (MsFNER), which splits general NER into two stages: entity-span detection and entity classification. MsFNER involves three processes: training, finetuning, and inference. In the training process, we separately train the entity-span detection model and the entity classification model on the source domain using meta-learning and keep the best of each; for entity classification, we add a contrastive learning module to enhance entity representations. During finetuning, we finetune both models on the support dataset of the target domain. In the inference process, for unlabeled data, we first detect the entity spans; the type of each span is then jointly determined by the entity classification model and KNN. We conduct experiments on the open FewNERD dataset, and the results demonstrate the advantages of MsFNER.
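The abstract does not state the form of the contrastive objective. As a rough illustration only, the sketch below uses a generic supervised contrastive loss (entities of the same type are pulled together, all others pushed apart). The function names, the cosine similarity, and the `temperature` value are assumptions for illustration, not the paper's formulation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss over entity embeddings.

    Hypothetical helper: for each anchor, entities with the same label are
    positives and every other entity in the batch is a contrast sample.
    Anchors with no positive in the batch are skipped.
    """
    total, counted = 0.0, 0
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        others = [j for j in range(len(embeddings)) if j != i]
        positives = [j for j in others if labels[j] == label]
        if not positives:
            continue
        logits = {j: cosine(anchor, embeddings[j]) / temperature for j in others}
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability of picking a positive.
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0

# Toy entity embeddings: two tight clusters.
embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
aligned = supervised_contrastive_loss(embeddings, ["PER", "PER", "LOC", "LOC"])
scrambled = supervised_contrastive_loss(embeddings, ["PER", "LOC", "PER", "LOC"])
```

When labels agree with the clusters (`aligned`), the loss is lower than when they cut across them (`scrambled`), which is the behaviour a contrastive module relies on to separate entity types.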
Problem

Research questions and friction points this paper is trying to address.

Identifies new entity types using minimal labeled examples
Reduces computational burden from excessive negative span samples
Separates entity detection and classification via hybrid decoding
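To make the negative-span problem concrete, the sketch below counts the candidate spans a span-level method would have to score for one sentence. The helper names, the `max_len` cap, and the example sentence length and gold spans are illustrative assumptions, not values from the paper.

```python
def enumerate_spans(n_tokens, max_len):
    """All (start, end) token spans, end inclusive, with width <= max_len."""
    return [
        (start, end)
        for start in range(n_tokens)
        for end in range(start, min(start + max_len, n_tokens))
    ]

def count_candidates_and_negatives(n_tokens, max_len, gold_spans):
    """Return (number of candidate spans, number that are not gold entities)."""
    candidates = enumerate_spans(n_tokens, max_len)
    gold = set(gold_spans)
    negatives = [span for span in candidates if span not in gold]
    return len(candidates), len(negatives)

# A 10-token sentence with two gold entities and spans of up to 4 tokens.
n_candidates, n_negatives = count_candidates_and_negatives(
    n_tokens=10, max_len=4, gold_spans=[(0, 1), (5, 5)]
)
print(n_candidates, n_negatives)  # 34 32
```

Even in this small case, 32 of the 34 candidate spans are negatives. Detecting spans first and classifying only the detected ones avoids scoring most of them against every entity type.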
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid multi-stage decoding splits NER into two stages
Entity-aware contrastive learning enhances entity representations
Joint entity classification combines model predictions with KNN
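The page does not say how the classification model and KNN are combined. One common choice, shown here purely as an assumption, is to interpolate the classifier's probabilities with a distribution built from the labels of the nearest support-set entities. The function names, the Euclidean distance, and the weight `lam` are hypothetical.

```python
import math
from collections import Counter

def knn_distribution(query, datastore, k, labels):
    """Label distribution from the k nearest (vector, label) entries.

    `datastore` stands in for support-set entity representations.
    """
    nearest = sorted(datastore, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: votes.get(label, 0) / len(nearest) for label in labels}

def joint_predict(model_probs, knn_probs, lam=0.5):
    """Interpolate classifier and KNN probabilities; return (label, scores)."""
    combined = {
        label: lam * knn_probs[label] + (1 - lam) * model_probs[label]
        for label in model_probs
    }
    return max(combined, key=combined.get), combined

# Toy example: the classifier slightly prefers LOC, the neighbours say PER.
datastore = [((0.0, 0.0), "PER"), ((0.0, 1.0), "PER"), ((5.0, 5.0), "LOC")]
model_probs = {"PER": 0.4, "LOC": 0.6}
knn_probs = knn_distribution((0.1, 0.2), datastore, k=2, labels=["PER", "LOC"])
label, scores = joint_predict(model_probs, knn_probs, lam=0.5)
print(label)  # PER
```

In this toy case the two nearest support entities are both `PER`, so the combined score overrides the classifier's weak preference for `LOC`. With only a few labeled examples per type, the retrieval side can correct a classifier that has not yet adapted to the target domain.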
Authors
Peipei Liu
Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Gao-Shan Wang
Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences
Ying Tong
Jiangsu Provincial Department of Public Security, China
Jian Liang
Kuaishou Inc.
transfer learning, graph learning
Zhenquan Ding
Institute of Information Engineering, Chinese Academy of Sciences
Hongsong Zhu
Institute of Information Engineering, Chinese Academy of Sciences
cybersecurity, internet measurement