Team Fusion@SU @ BC8 SympTEMIST track: transformer-based approach for symptom recognition and linking

📅 2026-04-07
🤖 AI Summary
This study addresses symptom named entity recognition (NER) and entity linking (EL) with a Transformer-based approach. For NER, a RoBERTa-based token-level classifier with BiLSTM and CRF layers is fine-tuned on an augmented training set. For EL, the cross-lingual SapBERT XLMR-Large model generates candidates, which are then ranked by cosine similarity against a knowledge base. On the SympTEMIST tasks, the choice of knowledge base has the largest effect on model accuracy.
📝 Abstract
This paper presents a transformer-based approach to solving the SympTEMIST named entity recognition (NER) and entity linking (EL) tasks. For NER, we fine-tune a RoBERTa-based (1) token-level classifier with BiLSTM and CRF layers on an augmented training set. Entity linking is performed by generating candidates using the cross-lingual SapBERT XLMR-Large (2) and calculating cosine similarity against a knowledge base. The choice of knowledge base proves to have the highest impact on model accuracy.
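The linking step described in the abstract, ranking knowledge-base entries by cosine similarity to a mention embedding, can be illustrated with a minimal sketch. This is not the authors' implementation: the three-dimensional vectors stand in for SapBERT embeddings, and the names `cosine_similarity`, `link_mention`, `kb_vectors`, and the codes `CODE_A` to `CODE_C` are hypothetical placeholders chosen for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_u * norm_v)


def link_mention(mention_vec, kb_vectors, top_k=1):
    """Return the top_k (code, score) pairs, highest similarity first."""
    scored = [
        (code, cosine_similarity(mention_vec, vec))
        for code, vec in kb_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy knowledge base: hypothetical codes mapped to made-up embeddings.
# In a real system each vector would come from an encoder such as SapBERT.
kb_vectors = {
    "CODE_A": [0.9, 0.1, 0.0],
    "CODE_B": [0.1, 0.9, 0.2],
    "CODE_C": [0.0, 0.2, 0.9],
}

# Made-up embedding for a detected symptom mention.
mention_vec = [0.8, 0.2, 0.1]

best = link_mention(mention_vec, kb_vectors, top_k=2)
```

Because the ranking depends entirely on which entries are present in `kb_vectors`, a sketch like this also makes it plausible that the choice of knowledge base would strongly affect linking accuracy: a mention can only be linked to a code that the knowledge base contains.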
Problem

Research questions and friction points this paper is trying to address.

symptom recognition
named entity recognition
entity linking
SympTEMIST
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transformer-based NER
Entity Linking
RoBERTa-BiLSTM-CRF
Cross-lingual SapBERT
Knowledge Base Integration
Georgi Grazhdanski
FMI, Sofia University St. Kliment Ohridski
Sylvia Vassileva
FMI, Sofia University St. Kliment Ohridski
Ivan Koychev
FMI, Sofia University St. Kliment Ohridski
Svetla Boytcheva
Ontotext
Artificial Intelligence
Computational Linguistics
Medical Informatics
Machine Learning