🤖 AI Summary
This paper presents NameTag 3, a unified framework for multilingual, multi-dataset, and multi-tagset named entity recognition (NER) that covers both flat and nested entities. Methodologically, it fine-tunes Transformer-based models with joint training across multiple datasets and tagsets, relying on cross-lingual transfer. Its key contributions are: (i) a single 355M-parameter model that provides flat NER for 17 languages, trained jointly on 21 corpora and three NE tagsets; (ii) a smaller 126M-parameter model for Czech nested NER; and (iii) an open-source command-line tool and a cloud-based web service with a REST API, which can be used without local installation. The framework achieves state-of-the-art results on 21 test datasets in 15 languages and remains competitive on the rest, even against larger models. It has been integrated into the LINDAT platform, serving thousands of requests daily.
📝 Abstract
We introduce NameTag 3, an open-source tool and cloud-based web service for multilingual, multidataset, and multitagset named entity recognition (NER), supporting both flat and nested entities. NameTag 3 achieves state-of-the-art results on 21 test datasets in 15 languages and remains competitive on the rest, even against larger models. It is available as a command-line tool and as a cloud-based service, enabling use without local installation. The NameTag 3 web service currently provides flat NER for 17 languages, trained on 21 corpora and three NE tagsets, all powered by a single 355M-parameter fine-tuned model, and nested NER for Czech, powered by a 126M-parameter fine-tuned model. The source code is licensed under the open-source MPL 2.0, while the models are distributed under the non-commercial CC BY-NC-SA 4.0. Documentation is available at https://ufal.mff.cuni.cz/nametag, source code at https://github.com/ufal/nametag3, and trained models via https://lindat.cz. The REST service and the web application can be found at https://lindat.mff.cuni.cz/services/nametag/. A demonstration video is available at https://www.youtube.com/watch?v=-gaGnP0IV8A.
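As a rough illustration of using the REST service without a local installation, the sketch below builds a request URL and reads the JSON reply. Only the base URL comes from the abstract. The `api/recognize` path, the `data`, `model`, and `output` parameter names, and the `result` field of the response are assumptions here and should be checked against the documentation at https://ufal.mff.cuni.cz/nametag before use. The helper names `build_recognize_url` and `recognize` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL as given in the abstract.
BASE_URL = "https://lindat.mff.cuni.cz/services/nametag/"


def build_recognize_url(text, model=None, output="xml"):
    """Build a GET URL for the (assumed) api/recognize endpoint.

    The parameter names "data", "model", and "output" are assumptions;
    verify them against the service documentation.
    """
    params = {"data": text, "output": output}
    if model is not None:
        # Omitting the model is assumed to select the service default.
        params["model"] = model
    return BASE_URL + "api/recognize?" + urlencode(params)


def recognize(text, model=None, output="xml", timeout=30):
    """Send the request and return the (assumed) "result" field of the JSON reply."""
    url = build_recognize_url(text, model=model, output=output)
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload["result"]
```

With network access, calling `recognize("Václav Havel was born in Prague.")` would be expected to return the input text annotated with entity markup, provided the assumptions above match the deployed service.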