🤖 AI Summary
Existing text-to-text prompting approaches struggle to maintain stable output templates when probing multilingual syntactic knowledge in LLMs. To address this, the authors propose decomposed prompting, which generates a dedicated prompt for each token of the input sentence and directly asks for that token's linguistic label, thereby eliminating reliance on a single sequence-level template. They evaluate the cross-lingual syntactic knowledge of both English-centric and multilingual LLMs on part-of-speech (POS) tagging across 38 languages, using the Universal Dependencies dataset, under zero-shot and few-shot prompting. Results show that decomposed prompting outperforms the iterative prompting baseline in both accuracy and inference efficiency, and the multilingual analysis of English-centric LLMs sheds light on how linguistic knowledge transfers across languages via prompting.
📝 Abstract
Probing the multilingual knowledge of linguistic structure in LLMs, often characterized as sequence labeling, is hampered in current text-to-text prompting strategies by the difficulty of maintaining output templates. To solve this, we introduce a decomposed prompting approach for sequence labeling tasks. Instead of a single text-to-text prompt, our method generates an individual prompt for each token of the input sentence, asking for its linguistic label. We test our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, using both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in efficacy and efficiency under zero- and few-shot settings. Moreover, our analysis of the multilingual performance of English-centric LLMs yields insights into the transferability of linguistic knowledge via multilingual prompting.
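The core idea of decomposed prompting can be sketched as follows: rather than one sequence-level prompt that must emit a well-formed label sequence, each token gets its own prompt whose answer is a single label. The snippet below is a minimal illustration under assumed prompt wording; the exact templates and LLM interface are not specified in the abstract, so `build_token_prompts` and the tag list (the UD universal POS inventory) are illustrative.

```python
# Illustrative sketch of decomposed prompting for POS tagging.
# The prompt wording here is an assumption, not the paper's exact template.

# Universal Dependencies universal POS tag inventory.
UPOS_TAGS = [
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
]

def build_token_prompts(tokens):
    """Build one independent prompt per token (decomposed prompting),
    instead of a single text-to-text prompt for the whole sequence.
    Each prompt shows the full sentence for context and asks for the
    POS tag of one target token."""
    sentence = " ".join(tokens)
    prompts = []
    for tok in tokens:
        prompts.append(
            f'Sentence: "{sentence}"\n'
            f'What is the part-of-speech tag of the word "{tok}"? '
            f"Answer with one of: {', '.join(UPOS_TAGS)}."
        )
    return prompts

# Each prompt can then be sent to the LLM independently (and in
# parallel), and each answer is a single label -- no sequence-level
# output template to maintain or repair.
prompts = build_token_prompts(["The", "cat", "sleeps"])
```

Because every prompt is self-contained, the per-token queries can be batched or parallelized, which is one plausible source of the efficiency gain over iterative prompting, where each step depends on the previous answer.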