🤖 AI Summary
This work addresses computational labor market challenges by tackling two core tasks: multilingual job title matching and job-title-based skill prediction, both key to improving candidate-job matching, career path prediction, and labor market analysis. Methodologically, it systematically compares three paradigms (discriminative classification-based fine-tuning, contrastive fine-tuning, and large language model prompting) and enriches semantic representations with language-specific titles and descriptions drawn from the multilingual ESCO occupation-skill taxonomy. Experimentally, prompting performs best for job title matching, with a mean average precision (MAP) of 0.492 on test data averaged over English, Spanish, and German (5th place among unique teams in the competition track), while fine-tuned classification performs best for skill prediction, with an MAP of 0.290 (3rd place). Overall, the largest multilingual language models perform best on both tasks, and the study offers a reproducible baseline and practical framework for cross-lingual occupational semantic understanding.
📝 Abstract
Matching job titles is a highly relevant task in the computational job market domain, as it improves, e.g., automatic candidate matching, career path prediction, and job market analysis. Furthermore, aligning job titles to job skills can be considered an extension of this task, with similar relevance for the same downstream applications. In this report, we outline NLPnorth's submission to TalentCLEF 2025, which comprises both of these tasks: Multilingual Job Title Matching, and Job Title-Based Skill Prediction. For both tasks we compare (fine-tuned) classification-based, (fine-tuned) contrastive-based, and prompting methods. We observe that for Task A, our prompting approach performs best, with a mean average precision (MAP) of 0.492 on test data, averaged over English, Spanish, and German. For Task B, we obtain an MAP of 0.290 on test data with our fine-tuned classification-based approach. Additionally, we made use of extra data by pulling all the language-specific titles and corresponding *descriptions* from ESCO for each job and skill. Overall, we find that the largest multilingual language models perform best for both tasks. Per the provisional results, and counting only unique teams, we rank 5th/20 on Task A and 3rd/14 on Task B.