Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set

📅 2025-03-13
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study investigates whether multilingual large language models (LLMs) can generalize discourse relation knowledge across languages and annotation frameworks. To this end, the authors first construct a unified multilingual discourse relation taxonomy and then conduct systematic probing experiments across 23 LLMs of varying sizes and multilingual capabilities. The methodology combines layer-wise feature extraction, cross-lingual label mapping, and discourse relation classification as the evaluation task. Results show that intermediate hidden-layer representations exhibit the highest language invariance and serve as the primary carrier of cross-lingual discourse knowledge; multilingual pretraining significantly enhances the transferability of discourse abstractions; yet semantically complex relations, such as causality and concession, remain difficult to classify accurately. The work provides a verifiable empirical foundation and a principled methodological framework for understanding how LLMs model discourse.
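The probing protocol described above (extract per-layer representations, then train a lightweight classifier per layer to predict discourse relation labels) can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: it uses synthetic features in place of real LLM hidden states, an invented three-relation label set, and an arbitrary "middle layers carry more signal" pattern purely to show the mechanics of layer-wise probing.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_examples, n_layers, hidden = 200, 4, 32
n_relations = 3  # hypothetical unified labels, e.g. CAUSE / CONCESSION / CONJUNCTION

labels = rng.integers(0, n_relations, n_examples)

# Stand-in for per-layer hidden states. Real probing would take these from the
# model (e.g. output_hidden_states=True in transformers); here we inject a
# stronger label signal into the middle layers to mimic the paper's finding.
layer_feats = []
for layer in range(n_layers):
    signal = 1.0 if layer in (1, 2) else 0.2
    feats = rng.normal(size=(n_examples, hidden))
    feats[:, :n_relations] += signal * np.eye(n_relations)[labels]
    layer_feats.append(feats)

# One linear probe per layer; held-out accuracy indicates how linearly
# decodable the discourse relation is from that layer's representation.
layer_acc = {}
for layer, X in enumerate(layer_feats):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.3, random_state=0
    )
    probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    layer_acc[layer] = probe.score(X_te, y_te)

print(layer_acc)
```

Comparing `layer_acc` across layers is what localizes discourse knowledge: in the paper's real setup, probes on intermediate layers score highest, and cross-lingual transfer is measured by training the probe on one language and evaluating on another.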

📝 Abstract
Discourse understanding is essential for many NLP tasks, yet most existing work remains constrained by framework-dependent discourse representations. This work investigates whether large language models (LLMs) capture discourse knowledge that generalizes across languages and frameworks. We address this question along two dimensions: (1) developing a unified discourse relation label set to facilitate cross-lingual and cross-framework discourse analysis, and (2) probing LLMs to assess whether they encode generalizable discourse abstractions. Using multilingual discourse relation classification as a testbed, we examine a comprehensive set of 23 LLMs of varying sizes and multilingual capabilities. Our results show that LLMs, especially those with multilingual training corpora, can generalize discourse information across languages and frameworks. Further layer-wise analyses reveal that language generalization at the discourse level is most salient in the intermediate layers. Lastly, our error analysis provides an account of challenging relation classes.
Problem

Research questions and friction points this paper is trying to address.

Investigates LLMs' ability to generalize discourse knowledge across languages and frameworks.
Develops a unified discourse relation label set for cross-lingual and cross-framework analysis.
Probes LLMs to assess encoding of generalizable discourse abstractions in multilingual contexts.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed unified discourse relation label set
Probed LLMs for cross-lingual discourse generalization
Analyzed layer-wise discourse generalization in LLMs