🤖 AI Summary
This study addresses the UniDive 2025 morphosyntactic parsing shared task by proposing a joint multi-task framework that unifies the modeling of morphological features, syntactic dependency structure, and content-word identification across nine languages. Methodologically, it employs XLM-RoBERTa as a shared encoder coupled with three specialized decoders, one per task, and supports its claims with fine-grained error analysis and ablation studies. The key contributions are: (i) end-to-end joint prediction of morphology, syntax, and content words under the latest Universal Dependencies (UD) annotation scheme, improving cross-lingual consistency; and (ii) empirical validation that content-word identification provides critical performance gains. Experiments show the model achieves an average MSLAS of 78.7%, LAS of 80.1%, and Feats F1 of 90.3% across all nine languages, outperforming competing systems overall.
📝 Abstract
We present a joint multitask model for the UniDive 2025 Morpho-Syntactic Parsing shared task, where systems predict both morphological and syntactic analyses following a novel UD annotation scheme. Our system uses a shared XLM-RoBERTa encoder with three specialized decoders for content-word identification, dependency parsing, and morphosyntactic feature prediction. Our model achieves the best overall performance on the shared task's leaderboard covering nine typologically diverse languages, with an average MSLAS score of 78.7%, LAS of 80.1%, and Feats F1 of 90.3%. Our ablation studies show that matching the task's gold tokenization and content-word identification are crucial to model performance. Error analysis reveals that our model struggles with core grammatical cases (particularly Nom-Acc) and with nominal features across languages.
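The shared-encoder, three-decoder design described above can be sketched schematically. This is a minimal illustration, not the authors' implementation: a random embedding table stands in for the XLM-RoBERTa encoder, the arc scorer is a simple bilinear form rather than the paper's (unspecified) parsing decoder, and all dimensions and label counts are made-up placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(x, w, b):
    """Affine projection: one task-specific decoder head."""
    return x @ w + b

class JointParserSketch:
    """Schematic of a joint multitask parser: one shared encoder feeds
    three task decoders (content-word ID, dependency arcs/relations,
    morphosyntactic features). The real system uses XLM-RoBERTa; here
    a random embedding lookup stands in for the shared encoder."""

    def __init__(self, vocab=1000, d=32, n_deprel=40, n_feats=60):
        self.emb = 0.1 * rng.normal(size=(vocab, d))          # stand-in encoder
        # decoder 1: content-word identification (binary per token)
        self.w_cw, self.b_cw = 0.1 * rng.normal(size=(d, 2)), np.zeros(2)
        # decoder 2: dependency parsing (bilinear arc scorer + relation head)
        self.u_arc = 0.1 * rng.normal(size=(d, d))
        self.w_rel, self.b_rel = 0.1 * rng.normal(size=(d, n_deprel)), np.zeros(n_deprel)
        # decoder 3: morphosyntactic features (multi-label per token)
        self.w_ft, self.b_ft = 0.1 * rng.normal(size=(d, n_feats)), np.zeros(n_feats)

    def forward(self, token_ids):
        h = self.emb[token_ids]                            # (n, d) shared representation
        cw_logits = linear(h, self.w_cw, self.b_cw)        # (n, 2)
        arc_scores = h @ self.u_arc @ h.T                  # (n, n): score of head j for token i
        rel_logits = linear(h, self.w_rel, self.b_rel)     # (n, n_deprel)
        feat_logits = linear(h, self.w_ft, self.b_ft)      # (n, n_feats)
        return cw_logits, arc_scores, rel_logits, feat_logits

model = JointParserSketch()
cw, arcs, rels, feats = model.forward(np.array([5, 17, 42, 7]))
print(cw.shape, arcs.shape, rels.shape, feats.shape)  # (4, 2) (4, 4) (4, 40) (4, 60)
```

Because all three decoders read the same encoder states, gradients from each task shape one shared representation; the ablations in the paper suggest the content-word decoder in particular contributes useful signal to the others.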