🤖 AI Summary
Existing text-to-text prompting approaches struggle to maintain stable output templates when probing multilingual syntactic knowledge in LLMs. To address this, the authors propose decomposed prompting, which generates a dedicated prompt for each token of the input sentence and directly asks for that token's linguistic label, thereby eliminating reliance on a single sequence-level template. They evaluate the cross-lingual syntactic knowledge of both English-centric and multilingual LLMs on part-of-speech (POS) tagging across 38 languages, using the Universal Dependencies dataset, under zero-shot and few-shot prompting. Results show that decomposed prompting outperforms the iterative prompting baseline in both accuracy and inference efficiency, and the multilingual analysis of English-centric LLMs sheds light on how linguistic knowledge transfers across languages via prompting.
📝 Abstract
Probing the multilingual knowledge of linguistic structure in LLMs, often characterized as sequence labeling, is hampered in current text-to-text prompting strategies by the difficulty of maintaining output templates. To solve this, we introduce a decomposed prompting approach for sequence labeling tasks. Instead of a single text-to-text prompt, our method generates an individual prompt for each token of the input sentence, asking for its linguistic label. We test our method on the Universal Dependencies part-of-speech tagging dataset for 38 languages, using both English-centric and multilingual LLMs. Our findings show that decomposed prompting surpasses the iterative prompting baseline in efficacy and efficiency under zero- and few-shot settings. Moreover, our analysis of the multilingual performance of English-centric LLMs yields insights into the transferability of linguistic knowledge via multilingual prompting.
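The core idea of decomposed prompting can be sketched as follows: rather than one sequence-level prompt that must emit a well-formed label sequence, each token gets its own prompt whose answer is a single label. The snippet below is a minimal illustration under assumed prompt wording; the exact templates and LLM interface are not specified in the abstract, so `build_token_prompts` and the tag list (the UD universal POS inventory) are illustrative.

```python
# Illustrative sketch of decomposed prompting for POS tagging.
# The prompt wording here is an assumption, not the paper's exact template.

# Universal Dependencies universal POS tag inventory.
UPOS_TAGS = [
    "ADJ", "ADP", "ADV", "AUX", "CCONJ", "DET", "INTJ", "NOUN", "NUM",
    "PART", "PRON", "PROPN", "PUNCT", "SCONJ", "SYM", "VERB", "X",
]

def build_token_prompts(tokens):
    """Build one independent prompt per token (decomposed prompting),
    instead of a single text-to-text prompt for the whole sequence.
    Each prompt shows the full sentence for context and asks for the
    POS tag of one target token."""
    sentence = " ".join(tokens)
    prompts = []
    for tok in tokens:
        prompts.append(
            f'Sentence: "{sentence}"\n'
            f'What is the part-of-speech tag of the word "{tok}"? '
            f"Answer with one of: {', '.join(UPOS_TAGS)}."
        )
    return prompts

# Each prompt can then be sent to the LLM independently (and in
# parallel), and each answer is a single label -- no sequence-level
# output template to maintain or repair.
prompts = build_token_prompts(["The", "cat", "sleeps"])
```

Because every prompt is self-contained, the per-token queries can be batched or parallelized, which is one plausible source of the efficiency gain over iterative prompting, where each step depends on the previous answer.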