Encoder vs Decoder: Comparative Analysis of Encoder and Decoder Language Models on Multilingual NLU Tasks

📅 2024-06-19
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Decoder-only language models (LMs) are increasingly deployed for natural language understanding (NLU), yet their performance remains poorly characterized in multilingual, low-resource settings, particularly within the Germanic language family. Method: This study systematically benchmarks encoder and decoder LMs on eight Germanic languages using a novel prompt-based unified evaluation framework, designed to overcome decoder models' incompatibility with standard NLU protocols; the authors extend the ScandEval benchmark to support decoder evaluation and employ UMAP to visualize how model capabilities cluster geometrically. Results: Encoder models with 1–2 orders of magnitude fewer parameters consistently outperform large decoder models on most NLU tasks, with task type and language-specific data quality emerging as key moderators of the architectural gap. The work establishes the first standardized, multilingual NLU evaluation paradigm tailored for decoder LMs and provides empirical evidence that architectural choice fundamentally constrains low-resource language understanding capabilities.
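The prompt-based evaluation the summary describes can be illustrated with a minimal sketch of one common way to adapt decoder LMs to classification-style NLU tasks: verbalize each candidate label as a continuation of a task prompt and pick the label the model scores highest. This is not the paper's exact protocol; `score_continuation` below is a hypothetical stub standing in for a real LM's log-likelihood of the continuation.

```python
def score_continuation(prompt: str, continuation: str) -> float:
    """Stub scorer: a real implementation would return the decoder LM's
    log-probability of `continuation` given `prompt`."""
    # Toy heuristic so the sketch runs without a model: reward label
    # words that literally appear in the prompt text.
    return float(continuation.lower() in prompt.lower())

def classify_by_prompt(text: str, labels: list[str], template: str) -> str:
    """Pick the label whose verbalization the LM scores highest."""
    prompt = template.format(text=text)
    scores = {label: score_continuation(prompt, label) for label in labels}
    return max(scores, key=scores.get)

# Hypothetical usage on a sentiment example:
template = "Review: {text}\nSentiment:"
pred = classify_by_prompt("The film was wonderful, a positive surprise.",
                          ["positive", "negative"], template)
```

With a real model, the same label-scoring loop makes standard classification benchmarks usable for decoder LMs without any task-specific fine-tuning.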

📝 Abstract
This paper explores the performance of encoder and decoder language models on multilingual Natural Language Understanding (NLU) tasks, with a broad focus on Germanic languages. Building upon the ScandEval benchmark, initially restricted to evaluating encoder models, we extend the evaluation framework to include decoder models. We introduce a method for evaluating decoder models on NLU tasks and apply it to the languages Danish, Swedish, Norwegian, Icelandic, Faroese, German, Dutch, and English. Through a series of experiments and analyses, we also address research questions regarding the comparative performance of encoder and decoder models, the impact of NLU task types, and the variation across language resources. Our findings reveal that encoder models can achieve significantly better NLU performance than decoder models despite having orders of magnitude fewer parameters. Additionally, we investigate the relationship between model type and task performance via a UMAP analysis, shedding light on the distinct capabilities of decoder and encoder models. This study contributes to a deeper understanding of language model paradigms in NLU tasks and provides valuable insights for model selection and evaluation in multilingual settings.
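The UMAP analysis mentioned in the abstract can be sketched as follows: project a models-by-tasks score matrix to 2-D so that models with similar task profiles land near one another. The scores below are invented placeholders, not the paper's results; real inputs would be per-model, per-task benchmark scores, and the fallback branch (PCA via SVD) is an assumption for when `umap-learn` is not installed.

```python
import numpy as np

rng = np.random.default_rng(0)
# Rows: models; columns: NLU tasks (e.g. NER, sentiment, QA, ...).
scores = rng.uniform(0.3, 0.9, size=(12, 8))

# Standardize each task column so no single task dominates the geometry.
standardized = (scores - scores.mean(axis=0)) / scores.std(axis=0)

try:
    import umap  # umap-learn; optional dependency
    embedding = umap.UMAP(n_neighbors=5, random_state=0).fit_transform(standardized)
except ImportError:
    # Fall back to a PCA projection via SVD if umap-learn is unavailable.
    u, s, _ = np.linalg.svd(standardized, full_matrices=False)
    embedding = u[:, :2] * s[:2]

# `embedding` holds one 2-D point per model, ready for plotting.
```

Plotting the embedding and coloring points by architecture (encoder vs. decoder) is the kind of view that lets capability clusters become visible.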
Problem

Research questions and friction points this paper is trying to address.

Multilingual Natural Language Understanding
Encoder-Decoder Language Models
Germanic Languages
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decoder Models Evaluation
Multilingual Environment
UMAP Analysis in NLU