How Much Do LLMs Know About Chinese Zero Pronouns?

📅 2026-05-29

🤖 AI Summary
This study addresses Chinese zero pronoun (ZP) understanding, a long-standing challenge in natural language processing and an area where the capabilities of current large language models (LLMs) remain unclear. The authors propose the first comprehensive multi-task evaluation framework spanning the full ZP comprehension pipeline. The framework is grounded in linguistic theory and covers five tasks: identification, referentiality classification, referential type classification, resolution, and translation. A systematic evaluation of mainstream LLMs reveals significant limitations: models perform poorly on upstream tasks such as ZP identification and referentiality classification, and even state-of-the-art reasoning models correctly translate fewer than half of Chinese ZPs into English. These findings point to a critical gap in LLMs’ ability to capture the deep semantic structure underlying Chinese zero pronouns.
📝 Abstract
Zero Pronouns (ZPs) are a pervasive linguistic phenomenon in pro-drop languages such as Chinese and have long posed a challenge for natural language processing systems. Although Large Language Models (LLMs) perform well on many Chinese language tasks, their ability to process ZPs remains poorly understood. We conduct a systematic investigation of LLMs' handling of Chinese ZPs through a sequence of linguistically motivated tasks, including identification, referentiality classification, referential type classification, resolution, and translation. A diverse set of LLMs is evaluated across all tasks. Our results show that Chinese ZPs remain highly challenging for current LLMs, particularly for upstream tasks such as identification and referentiality classification. Performance on downstream tasks, such as ZP translation, is also consistently low: even state-of-the-art reasoning-oriented LLMs correctly translate fewer than half of Chinese ZPs into English.
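The five-task sequence described above can be pictured as a small per-task scoring loop. The following is a minimal sketch only, not the authors' evaluation code: the names `ZPTask` and `per_task_accuracy`, the record layout, and the use of exact-match accuracy for every task are illustrative assumptions, and the toy records are invented placeholders rather than data or results from the paper.

```python
from collections import defaultdict
from enum import Enum


class ZPTask(Enum):
    """The five tasks named in the abstract (hypothetical enum, for illustration)."""
    IDENTIFICATION = "identification"
    REFERENTIALITY = "referentiality classification"
    REFERENTIAL_TYPE = "referential type classification"
    RESOLUTION = "resolution"
    TRANSLATION = "translation"


def per_task_accuracy(records):
    """Return exact-match accuracy per task.

    `records` is an iterable of (task, gold, predicted) tuples.
    Exact match is an assumption; the paper may score tasks differently.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for task, gold, predicted in records:
        total[task] += 1
        correct[task] += int(gold == predicted)
    return {task: correct[task] / total[task] for task in total}


# Invented toy records, only to show the expected input shape.
toy_records = [
    (ZPTask.IDENTIFICATION, True, True),
    (ZPTask.IDENTIFICATION, True, False),
    (ZPTask.REFERENTIALITY, "referential", "referential"),
    (ZPTask.TRANSLATION, "he", "it"),
]

scores = per_task_accuracy(toy_records)
for task, accuracy in scores.items():
    print(f"{task.value}: {accuracy:.2f}")
```

Keeping the tasks in one enum makes it easy to report upstream tasks (identification, referentiality classification) separately from downstream ones (resolution, translation), which is the contrast the abstract draws.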
Problem

Research questions and friction points this paper is trying to address.

Zero Pronouns
Chinese
Large Language Models
Natural Language Processing
Pro-drop Languages
Innovation

Methods, ideas, or system contributions that make the work stand out.

Zero Pronouns
Large Language Models
Chinese NLP
Pronoun Resolution
Cross-lingual Translation
Yifei Li
Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, National Language Resources Monitoring and Research Center for Network Media, School of Computer Science, Central China Normal University
Guanyi Chen
Central China Normal University
Computational Linguistics, Natural Language Generation, Computational Pragmatics
Tingting He
Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, National Language Resources Monitoring and Research Center for Network Media, School of Computer Science, Central China Normal University