Modular Monolingual Adaptation using Pretrained Language Models

📅 2026-06-04
🤖 AI Summary
This work addresses the cost of full-model fine-tuning when building monolingual language models for low-resource languages. The authors hypothesize that tuning the whole model is often unnecessary and propose a more modular transfer-learning approach: swap in a target-language-specific tokenizer, freeze the corresponding embedding layer, and fine-tune only the remaining parameters. Evaluated on Scottish Gaelic, Irish, and Quechua (the last with only 8.5k training instances), the approach improves performance on mask filling, named entity recognition (NER), and part-of-speech (POS) tagging, which suggests it remains effective under very low-resource conditions.
📝 Abstract
Building monolingual language models (LMs) for low-resource languages typically relies on adapting pretrained language models (PLMs) by fine-tuning the whole model on the target language. This approach is widely favored over training from scratch, as it enables effective knowledge transfer. Additionally, prior work has shown that using a language-specific tokenizer can improve adaptability. In this work, we hypothesize that full model tuning is often unnecessary and propose a more modular approach. Specifically, we replace the tokenizer, freeze the corresponding embeddings, and tune the rest of the model. We use Scottish Gaelic, Irish, and Quechua for our experiments, with Quechua being a very low-resource language (8.5k training instances). Evaluation on natural language understanding (NLU) tasks (mask filling, NER, and POS tagging) shows that our proposed approach improves performance when adapting models to low-resource languages. Additionally, we provide a comprehensive analysis of the effectiveness of training strategies, the choice of pretrained embeddings, and models.
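The core training recipe in the abstract (keep the embeddings fixed, update everything else) can be illustrated with a deliberately tiny, framework-free sketch. This is not the authors' implementation: the two-dimensional "embeddings", the single weight vector standing in for "the rest of the model", the squared-error objective, and the names `train_step` and `freeze_embeddings` are all assumptions made for illustration. The listing also does not say how embeddings for the new vocabulary are obtained, so the sketch simply starts from a given table.

```python
def train_step(embeddings, weights, token_id, target, lr=0.1, freeze_embeddings=True):
    """One SGD step on squared error for a toy model: pred = weights . embeddings[token_id]."""
    vec = embeddings[token_id]
    pred = sum(w * e for w, e in zip(weights, vec))
    err = pred - target

    # Gradients of (pred - target)^2 with respect to each parameter group.
    grad_w = [2 * err * e for e in vec]
    grad_e = [2 * err * w for w in weights]

    # The "rest of the model" (here: one weight vector) is always updated.
    new_weights = [w - lr * g for w, g in zip(weights, grad_w)]

    # The embedding row is only updated when it is NOT frozen.
    if not freeze_embeddings:
        embeddings[token_id] = [e - lr * g for e, g in zip(vec, grad_e)]

    return new_weights, err ** 2


# Hypothetical embedding table for a 3-token target-language vocabulary.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
embeddings_before = [row[:] for row in embeddings]

weights = [0.0, 0.0]                 # trainable part of the toy model
data = [(0, 1.0), (1, -1.0)]         # (token_id, target) pairs, made up for the demo

losses = []
for _ in range(50):
    epoch_loss = 0.0
    for token_id, target in data:
        weights, loss = train_step(embeddings, weights, token_id, target)
        epoch_loss += loss
    losses.append(epoch_loss)

print("embeddings unchanged:", embeddings == embeddings_before)
print("first/last epoch loss:", losses[0], losses[-1])
```

In a deep-learning framework the same effect is usually obtained by excluding the embedding parameters from gradient updates (for example, setting `requires_grad = False` on them in PyTorch) rather than by an explicit flag in the update step.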
Problem

Research questions and friction points this paper is trying to address.

low-resource languages
pretrained language models
monolingual adaptation
natural language understanding
model adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

modular adaptation
low-resource languages
pretrained language models
tokenizer replacement
frozen embeddings