🤖 AI Summary
This study identifies systematic deficiencies in large language models’ (LLMs) pragmatic reasoning—particularly in detecting presuppositions and conversational implicatures—within political discourse. The authors leverage IMPAQTS, a large corpus of Italian political speeches annotated for manipulative implicit content, and design two evaluation tasks: multiple-choice pragmatic inference and open-ended generation of explanations of implicit meaning. Benchmarking across mainstream LLMs reveals that all tested models struggle to interpret implicit content, with especially poor performance on presupposition detection and implicature resolution. The work makes three key contributions: (1) the first use of the expert-annotated IMPAQTS corpus to probe LLMs’ pragmatic capabilities; (2) empirical evidence of structural pragmatic deficits in LLMs within politically consequential contexts; and (3) a publicly released benchmark with data and code (IMPAQTS-PID) to advance pragmatic modeling and AI-based analysis of political discourse.
📝 Abstract
Implicit content plays a crucial role in political discourse, where speakers systematically employ pragmatic strategies such as implicatures and presuppositions to influence their audiences. Large Language Models (LLMs) have demonstrated strong performance in tasks requiring complex semantic and pragmatic understanding, highlighting their potential for detecting and explaining the meaning of implicit content. However, their ability to do this within political discourse remains largely underexplored. Leveraging, for the first time, the large IMPAQTS corpus, which comprises Italian political speeches annotated for manipulative implicit content, we propose methods to test the effectiveness of LLMs on this challenging problem. Through a multiple-choice task and an open-ended generation task, we demonstrate that all tested models struggle to interpret presuppositions and implicatures. We conclude that current LLMs lack the key pragmatic capabilities necessary for accurately interpreting highly implicit language, such as that found in political discourse. At the same time, we highlight promising trends and future directions for enhancing model performance. We release our data and code at https://github.com/WalterPaci/IMPAQTS-PID.
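To make the multiple-choice setup concrete, here is a minimal sketch of how such an evaluation might be scored. The item fields, the example utterance, and the scoring rule are illustrative assumptions, not the authors' actual protocol or data format; the real evaluation code is in the IMPAQTS-PID repository linked above.

```python
# Hypothetical sketch of a multiple-choice pragmatic inference evaluation.
# Each item pairs a political utterance containing implicit content with
# candidate paraphrases of its implicit meaning; one option is gold.
from dataclasses import dataclass


@dataclass
class MCItem:
    utterance: str        # statement containing implicit content
    options: list[str]    # candidate interpretations of the implicit meaning
    gold: int             # index of the correct interpretation


def accuracy(items: list[MCItem], predictions: list[int]) -> float:
    """Fraction of items where the model chose the gold interpretation."""
    correct = sum(pred == item.gold for item, pred in zip(items, predictions))
    return correct / len(items)


# Invented example: 'stopped' triggers the presupposition that the
# opponent previously misled voters.
items = [
    MCItem(
        utterance="My opponent has finally stopped misleading voters.",
        options=[
            "The opponent previously misled voters.",  # presupposition
            "The opponent never misled voters.",
            "The speaker misled voters.",
        ],
        gold=0,
    ),
]
print(accuracy(items, [0]))  # → 1.0
```

In practice the predictions would come from prompting each LLM with the utterance and options; the benchmark then compares model accuracy against human annotator performance.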