NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach

📅 2025-02-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Named entity recognition (NER) on historical texts faces high linguistic and genre heterogeneity, limited standardisation of spellings, scarce annotated training data, and a strong dependence on domain-specific historical knowledge. Method: The paper proposes a lightweight, prompting-based approach using readily available large language models (LLMs), built around a humanities-informed prompting strategy that combines historical context and persona modelling; it requires no fine-tuning, coding, or specialized computational resources. Results: On historical documents, the approach achieves F1 scores seven to twenty-two percent higher than the established NLP frameworks spaCy and flair for NER. Contrary to the common assumption that more examples yield better performance, increasing the number of few-shot examples does not improve recall or precision below a threshold of 16-shot. Deployable without code on consumer-grade hardware, the method offers an efficient, accessible, and domain-adapted NER workflow for historical scholarship.

📝 Abstract
Named entity recognition (NER) is a core task for historical research in automatically establishing all references to people, places, events and the like. Yet, due to the high linguistic and genre diversity of sources, the limited canonisation of spellings, the level of required historical domain knowledge, and the scarcity of annotated training data, established approaches to natural language processing (NLP) have been both extremely expensive and yielded only unsatisfactory results in terms of recall and precision. Our paper introduces a new approach. We demonstrate how readily available, state-of-the-art LLMs significantly outperform two leading NLP frameworks, spaCy and flair, for NER in historical documents by seven to twenty-two percent higher F1 scores. Our ablation study shows how providing historical context to the task and a bit of persona modelling that turns focus away from a purely linguistic approach are core to a successful prompting strategy. We also demonstrate that, contrary to our expectations, providing increasing numbers of examples in few-shot approaches does not improve recall or precision below a threshold of 16-shot. In consequence, our approach democratises access to NER for all historians by removing the barrier of scripting languages and computational skills required for established NLP tools and instead leveraging natural language prompts and consumer-grade tools and frontends.
Problem

Research questions and friction points this paper is trying to address.

Enhance NER in historical texts
Utilize LLMs for better performance
Simplify NER access for historians
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs enhance NER precision
Historical context improves performance
Few-shot examples have threshold limits
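The prompting strategy the paper describes (persona modelling plus historical context plus a small number of annotated examples) can be sketched as plain prompt construction. The function below is a hypothetical illustration, not the authors' actual prompt template; the wording, the `[PER …]`/`[LOC …]` annotation style, and the sample data are invented for demonstration.

```python
def build_ner_prompt(text, context, examples):
    """Assemble an NER prompt from a persona instruction, historical
    context, and few-shot examples, ready to paste into any
    consumer-grade LLM chat frontend."""
    # Persona modelling: address the model as a domain expert rather
    # than framing NER as a purely linguistic tagging task.
    persona = (
        "You are a historian with deep expertise in the period and region "
        "described below. Mark every person (PER) and place (LOC) mentioned "
        "in the text, using the annotation style of the examples."
    )
    # Few-shot block: one source/annotation pair per example.
    shots = "\n".join(f"Text: {src}\nAnnotated: {tgt}" for src, tgt in examples)
    return (
        f"{persona}\n\n"
        f"Historical context: {context}\n\n"
        f"Examples:\n{shots}\n\n"
        f"Text: {text}\nAnnotated:"
    )

# Example use with invented data (not from the paper):
prompt = build_ner_prompt(
    text="Der Kaiser traf Bismarck in Versailles.",
    context="Franco-Prussian War, 1871; German-language press reports.",
    examples=[
        ("Napoleon zog nach Moskau.",
         "[PER Napoleon] zog nach [LOC Moskau]."),
    ],
)
print(prompt)
```

Per the paper's ablation findings, the context and persona lines carry most of the performance gain, while growing the `examples` list beyond a handful yields no further improvement below the 16-shot threshold.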
Torsten Hiltmann
Humboldt-Universität zu Berlin
Martin Dröge
Humboldt-Universität zu Berlin, AI-Skills
Nicole Dresselhaus
Humboldt-Universität zu Berlin, NFDI4Memory
Till Grallert
Humboldt-Universität zu Berlin
Melanie Althage
Humboldt-Universität zu Berlin
Paul Bayer
Humboldt-Universität zu Berlin
Sophie Eckenstaler
Humboldt-Universität zu Berlin, Kompetenzwerkstatt Digital Humanities
Koray Mendi
Humboldt-Universität zu Berlin
Philipp Schneider
CISPA - Helmholtz Center for Information Security
Wiebke Sczeponik
Humboldt-Universität zu Berlin
Anica Skibba
Humboldt-Universität zu Berlin, NFDI4Memory