Homa at SemEval-2025 Task 5: Aligning Librarian Records with OntoAligner for Subject Tagging

2025-04-30
AI Summary
This work addresses the automatic subject indexing of German technical literature records from the TIBKAT system, aligning them with the German authority classification scheme Gemeinsame Normdatei (GND). We propose a novel cross-lingual ontology alignment paradigm: for the first time, we adapt the OntoAligner framework to subject indexing by formalizing label assignment as a semantic alignment task between GND concepts and document descriptions. Our approach integrates multilingual semantic embeddings, retrieval-augmented generation (RAG)-enhanced candidate retrieval, and fine-grained similarity matching. Evaluated on SemEval-2025 Task 5, our method achieves significant improvements in GND category matching accuracy. It demonstrates strong robustness on German–English mixed records and high cross-lingual transferability. The framework provides a scalable, language-agnostic solution for automated knowledge organization of multilingual scientific literature.

๐Ÿ“ Abstract
This paper presents our system, Homa, for SemEval-2025 Task 5: Subject Tagging, which focuses on automatically assigning subject labels to technical records from TIBKAT using the Gemeinsame Normdatei (GND) taxonomy. We leverage OntoAligner, a modular ontology alignment toolkit, to address this task by integrating retrieval-augmented generation (RAG) techniques. Our approach formulates the subject tagging problem as an alignment task, where records are matched to GND categories based on semantic similarity. We evaluate OntoAligner's adaptability for subject indexing and analyze its effectiveness in handling multilingual records. Experimental results demonstrate the strengths and limitations of this method, highlighting the potential of alignment techniques for improving subject tagging in digital libraries.
Problem

Research questions and friction points this paper is trying to address.

Automatically assigning subject labels to technical records
Aligning records to GND taxonomy using semantic similarity
Evaluating OntoAligner for multilingual subject indexing
Innovation

Methods, ideas, or system contributions that make the work stand out.

OntoAligner toolkit for ontology alignment
Retrieval-augmented generation (RAG) techniques
Semantic similarity for GND category matching
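
The idea of treating subject tagging as similarity-based alignment can be illustrated with a minimal sketch. This is not the OntoAligner API and not the authors' pipeline: the bag-of-words `embed` function is a toy stand-in for a multilingual sentence encoder, and the subject identifiers, labels, and the `tag_record` helper are hypothetical names invented for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: lowercase bag-of-words counts.

    Assumption: a real system would use a multilingual encoder here.
    """
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_record(record_text, subjects, top_k=2):
    """Rank subject labels by similarity to the record and keep the top_k matches."""
    rec = embed(record_text)
    scored = [(cosine(rec, embed(label)), sid) for sid, label in subjects.items()]
    # Highest score first; ties broken by subject id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(sid, score) for score, sid in scored[:top_k] if score > 0]


# Hypothetical subject labels standing in for GND entries.
SUBJECTS = {
    "subj-1": "machine learning neural networks",
    "subj-2": "organic chemistry synthesis",
    "subj-3": "ontology semantic web knowledge representation",
}

record = "An introduction to neural networks and machine learning methods"
print(tag_record(record, SUBJECTS))
```

In a retrieval-augmented setup, a ranking step like this would only produce candidates; a language model would then decide which of the retrieved subjects actually apply to the record.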
Hadi Bayrami Asl Tekanlou
University of Tabriz, Tabriz, Iran
J. Razmara
University of Tabriz, Tabriz, Iran
Mahsa Sanaei
University of Tabriz, Tabriz, Iran
Mostafa Rahgouy
Auburn University, Alabama, USA
Hamed Babaei Giglou
TIB – Leibniz Information Centre for Science and Technology
NLP, LLMs, Reinforcement Learning, Ontology Engineering, Semantic Web