Small Language Model Makes an Effective Long Text Extractor

📅 2025-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenges of extracting long entity spans (e.g., awards) from lengthy documents, namely low accuracy, redundant computation, and excessive GPU memory consumption, this paper proposes SeNER, a lightweight span-based NER model. Methodologically, it introduces a bidirectional arrow attention mechanism with LogN-Scaling on the [CLS] token to embed long texts efficiently, and designs a bidirectional sliding-window plus-shaped attention (BiSPA) mechanism that jointly models interactions between token-pair spans while substantially pruning the candidate spans. Built on a compact pre-trained language model, SeNER achieves state-of-the-art performance on three long-text NER benchmarks, with an average F1-score improvement of 1.2%, a 37-52% reduction in GPU memory usage, and a 2.1x inference speedup, demonstrating superior accuracy, efficiency, and memory economy.
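The arrow attention mentioned above is not specified in detail on this page; a plausible sketch, assuming it combines a local sliding window with a few global tokens such as [CLS] that attend to and are attended by every position (the function name, window size, and number of global tokens are illustrative assumptions, not the paper's API):

```python
import numpy as np

def arrow_attention_mask(seq_len: int, window: int, num_global: int = 1) -> np.ndarray:
    """Boolean attention mask (True = may attend). Hypothetical sketch:
    a local band of width `window` plus `num_global` leading global tokens
    (e.g. [CLS]) whose rows and columns are fully unmasked, giving the
    'arrow' shape while keeping cost roughly linear in seq_len."""
    idx = np.arange(seq_len)
    # local sliding window: positions i and j attend iff |i - j| <= window
    mask = np.abs(idx[:, None] - idx[None, :]) <= window
    # global tokens see everything and are seen by everything
    mask[:num_global, :] = True
    mask[:, :num_global] = True
    return mask

m = arrow_attention_mask(8, window=1)
```

With `window=1`, position 5 can attend to its neighbors 4 and 6 and to the global [CLS] token at position 0, but not to distant position 2, which is the intended sparsity.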

📝 Abstract
Named Entity Recognition (NER) is a fundamental problem in natural language processing (NLP). However, the task of extracting longer entity spans (e.g., awards) from extended texts (e.g., homepages) is barely explored. Current NER methods predominantly fall into two categories: span-based methods and generation-based methods. Span-based methods require the enumeration of all possible token-pair spans, followed by classification on each span, resulting in substantial redundant computations and excessive GPU memory usage. In contrast, generation-based methods involve prompting or fine-tuning large language models (LLMs) to adapt to downstream NER tasks. However, these methods struggle with the accurate generation of longer spans and often incur significant time costs for effective fine-tuning. To address these challenges, this paper introduces a lightweight span-based NER method called SeNER, which incorporates a bidirectional arrow attention mechanism coupled with LogN-Scaling on the [CLS] token to embed long texts effectively, and comprises a novel bidirectional sliding-window plus-shaped attention (BiSPA) mechanism to reduce redundant candidate token-pair spans significantly and model interactions between token-pair spans simultaneously. Extensive experiments demonstrate that our method achieves state-of-the-art extraction accuracy on three long NER datasets and is capable of extracting entities from long texts in a GPU-memory-friendly manner. Code: https://github.com/THUDM/scholar-profiling/tree/main/sener
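One way to see the memory saving the abstract refers to: span-based NER ordinarily enumerates all O(n^2) token-pair spans, while capping the span width at W, as sliding-window schemes like BiSPA effectively do, leaves only about n*W candidates. A minimal sketch (the function name and width cap are illustrative, not the paper's implementation):

```python
def candidate_spans(n: int, max_width: int):
    """Enumerate (start, end) token-pair spans with inclusive end,
    keeping only spans of width <= max_width. Illustrative sketch of
    how a width cap prunes the quadratic candidate set."""
    return [(i, j) for i in range(n)
            for j in range(i, min(i + max_width, n))]

all_pairs = candidate_spans(512, 512)  # full enumeration: n(n+1)/2 spans
pruned = candidate_spans(512, 32)      # width-capped enumeration: ~n*W spans
```

For n = 512, full enumeration yields 131,328 candidate spans versus roughly 16,000 after capping the width at 32, which is where the GPU-memory-friendly behavior comes from.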
Problem

Research questions and friction points this paper is trying to address.

Long text entity extraction
Redundant computation reduction
GPU memory efficiency optimization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bidirectional arrow attention mechanism
LogN-Scaling on [CLS] token
BiSPA reduces redundant token-pair spans
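The LogN-Scaling listed above follows the known attention-entropy scaling trick: attention logits (or queries) are multiplied by log(n) / log(L_train) so that attention entropy stays stable when the inference length n exceeds the training length. A minimal sketch (the training length of 512 is an assumption, not taken from the paper):

```python
import math

def logn_scale(seq_len: int, train_len: int = 512) -> float:
    """LogN-Scaling factor for attention logits: log(n) / log(L_train),
    clipped below at 1.0 so sequences shorter than the training length
    are left unscaled. train_len=512 is an assumed default."""
    return max(1.0, math.log(seq_len) / math.log(train_len))
```

At the training length the factor is exactly 1.0; at 4096 tokens it grows to 4/3, sharpening the attention distribution to compensate for the longer context.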
Authors
Yelin Chen (School of Computer Science and Technology, Xinjiang University, Urumqi 830049, China)
Fanjin Zhang (Tsinghua University; data mining, machine learning)
Jie Tang (Tsinghua University)