Natural Language Processing for Electronic Health Records in Scandinavian Languages: Norwegian, Swedish, and Danish

📅 2025-03-24
🤖 AI Summary
Clinical natural language processing (NLP) for the mainland Scandinavian languages has lacked a systematic, cross-lingual assessment of progress, resource availability, and methodological trends. Method: The authors conducted a systematic review of 113 articles published in English between 2010 and 2024, retrieved from databases including PubMed, ScienceDirect, Google Scholar, ACM Digital Library, and IEEE Xplore, and covering Norwegian, Swedish, and Danish clinical text processing. Results: Swedish dominates the field (64% of studies, n=72), followed by Norwegian (18%, n=21) and Danish (10%, n=11); the remaining 8% (n=9) cover more than one language. The review finds substantial disparities in the adoption of transformer-based models, and significantly less research on Norwegian and Danish than on Swedish text in essential tasks such as de-identification. Sharing of data, experimentation code, and pre-trained models is low, as is the rate of adaptation and transfer learning in the region. The review highlights the barriers and challenges that hinder the advancement of clinical NLP for EHR text in the mainland Scandinavian languages.

📝 Abstract
Background: Clinical natural language processing (NLP) refers to the use of computational methods for extracting, processing, and analyzing unstructured clinical text data, and holds great potential to transform healthcare across a range of clinical tasks. Objective: This study performs a systematic review to comprehensively assess and analyze state-of-the-art NLP methods for clinical text in the mainland Scandinavian languages. Method: A literature search was conducted between December 2022 and February 2024 in online databases including PubMed, ScienceDirect, Google Scholar, ACM Digital Library, and IEEE Xplore. Relevant references of the included articles were also screened to strengthen the search. The final pool includes articles that conducted clinical NLP in the mainland Scandinavian languages and were published in English between 2010 and 2024. Results: Of the 113 articles, 18% (n=21) focus on Norwegian clinical text, 64% (n=72) on Swedish, 10% (n=11) on Danish, and 8% (n=9) on more than one language. Overall, the review identified positive developments across the region, despite observable gaps and disparities between the languages. There are substantial disparities in the adoption of transformer-based models. In essential tasks such as de-identification, there is significantly less research activity on Norwegian and Danish than on Swedish text. The review also identified low levels of resource sharing (data, experimentation code, and pre-trained models) and a low rate of adaptation and transfer learning in the region. Conclusion: The review presents a comprehensive assessment of state-of-the-art clinical NLP for electronic health record (EHR) text in the mainland Scandinavian languages and highlights the potential barriers and challenges that hinder the rapid advancement of the field in the region.
Problem

Research questions and friction points this paper is trying to address.

Assessing NLP methods for Scandinavian clinical texts
Identifying disparities in NLP adoption across languages
Evaluating resource sharing in clinical NLP research
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review of Scandinavian clinical NLP
Transformer-based models adoption disparities
Low resource sharing in clinical NLP
Ashenafi Zebene Woldaregay
Researcher (Ph.D), Norwegian Centre for Clinical Artificial Intelligence (SPKI)
Deep Learning, NLP, Machine Learning, Health Informatics, Data Science
Jorgen Aarmo Lund
Department of Physics and Technology, UiT The Arctic University of Norway, Hansine Hansens veg 18, 9019 Tromsø, Norway
Phuong Dinh Ngo
Norwegian Centre for E-health Research
Intelligent control, machine learning, health analytics
Mariyam Tayefi
Norwegian Centre for E-health Research, Tromsø, Norway
Joel Burman
Norwegian Centre for Clinical Artificial Intelligence (SPKI), University Hospital of Northern Norway, Tromsø, Norway
Stine Hansen
UiT The Arctic University of Norway
Machine Learning, Deep Learning, Medical Image Analysis, Computer Vision
Martin Hylleholt Sillesen
Department of Surgery and Transplantation C-TX, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
Hercules Dalianis
Professor in Computer and Systems Science, Stockholm University, Sweden
Clinical text mining, information retrieval, text generation, automatic text summarisation
Robert Jenssen
Visual Intelligence, UiT The Arctic University of Norway & Norw. Comp. Center & P1 Centre AI, UCPH
Machine learning, information theoretic learning, kernel methods, deep learning, health data analytics
Lindsetmo Rolf Ole
Clinic of Surgery, Oncology and Women Health, University Hospital of North Norway, Tromso, Norway
Karl Oyvind Mikalsen
Norwegian Centre for Clinical Artificial Intelligence (SPKI), University Hospital of Northern Norway, Tromsø, Norway