🤖 AI Summary
This work investigates key factors influencing zero-shot language selection for cross-lingual part-of-speech tagging. Focusing on pre-trained multilingual models (mBERT and XLM-R), we systematically model the interplay among linguistic typological features (from WALS and URIEL), data-driven statistical features (word overlap ratio, type–token ratio, and genealogical distance), and model architecture. To our knowledge, this is the first study to jointly learn the impact of fine-grained typological and data-driven features on transfer ranking within modern multilingual pretraining frameworks, revealing that these feature classes are both complementary and individually effective. Experiments demonstrate that word overlap ratio, type–token ratio, and genealogical distance are the most architecture-robust predictors across mBERT and XLM-R. A ranker integrating both feature types significantly improves zero-shot transfer accuracy; notably, even rankers relying solely on either typological or statistical features achieve strong performance.
📝 Abstract
Cross-lingual transfer learning is an invaluable tool for overcoming data scarcity, yet selecting a suitable transfer language remains a challenge. The precise roles of linguistic typology, training data, and model architecture in transfer language choice are not fully understood. We take a holistic approach, examining how both dataset-specific and fine-grained typological features influence transfer language selection for part-of-speech tagging, considering two different sources of morphosyntactic features. While previous work examines these dynamics in the context of bilingual biLSTMs, we extend the analysis to a more modern transfer learning pipeline: zero-shot prediction with pretrained multilingual models. We train a series of transfer language ranking systems and examine how different feature inputs influence ranker performance across architectures. Word overlap, type–token ratio, and genealogical distance emerge as the top features across all architectures. Our findings reveal that a combination of typological and dataset-dependent features yields the best rankings, and that good performance can be obtained with either feature group alone.
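Two of the dataset-dependent features highlighted above, word overlap ratio and type–token ratio, can be computed directly from corpus vocabularies. The sketch below shows one common way to define them; the paper's exact formulations (e.g., normalization choices) may differ, so treat these definitions as illustrative assumptions:

```python
def type_token_ratio(tokens):
    """Ratio of distinct word types to total tokens in a corpus sample.

    Higher values indicate greater lexical diversity (common for
    morphologically rich languages).
    """
    return len(set(tokens)) / len(tokens)


def word_overlap_ratio(source_tokens, target_tokens):
    """Share of the target language's vocabulary also present in the
    source (transfer) language's vocabulary.

    Note: one of several possible normalizations; the paper may divide
    by the union of vocabularies instead.
    """
    src_vocab, tgt_vocab = set(source_tokens), set(target_tokens)
    return len(src_vocab & tgt_vocab) / len(tgt_vocab)


# Toy example with hypothetical corpora
source = "the cat sat on the mat".split()
target = "the dog sat near the door".split()

print(type_token_ratio(source))              # 5 types / 6 tokens
print(word_overlap_ratio(source, target))    # 2 shared types / 5 target types
```

Features like these are cheap to extract for any candidate transfer language and, per the findings above, remain predictive across both mBERT and XLM-R.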