🤖 AI Summary
This study addresses the challenge of guiding large language models to transform abstract Persian proverbs into coherent narratives that faithfully preserve their underlying moral and causal structures, rather than producing fluent but semantically distorted text. To this end, we propose a constrained semantic decompression framework; introduce the PAND dataset, comprising proverbs, their paraphrased interpretations, and human-authored stories; and develop a hybrid evaluation protocol that combines human-calibrated LLM-as-a-Judge assessments with structural metrics, applied across multiple prompting regimes. Our findings reveal a “decompression gap” in how large models concretize culturally grounded knowledge and show that explicit reasoning combined with iterative refinement partially mitigates this gap, improving narrative fidelity to the core semantics of the source proverbs.
📝 Abstract
Transforming a dense, abstract proverb into an engaging and morally faithful narrative requires deep cultural understanding and robust semantic grounding. We frame this problem as a \emph{constrained semantic decompression} task and study proverb-conditioned story generation as a testbed for abstraction-to-realization in large language models (LLMs). Focusing on Persian, we introduce the Proverb Aligned Narrative Dataset (PAND), which pairs proverbs with human-written stories and explicit meanings. Using a hybrid evaluation framework that combines human-calibrated LLM-as-a-Judge assessments with structural metrics, we analyze model behavior across multiple prompting regimes. Our findings reveal a persistent \emph{decompression gap}: current LLMs often achieve strong surface-level fluency while failing to faithfully instantiate the underlying moral and causal structure encoded in proverbs. We further show that explicit reasoning and iterative refinement can partially mitigate these failures, suggesting that many decompression errors arise from difficulties in translating abstract meaning into narrative form rather than from a complete lack of relevant knowledge. The proposed task extends naturally to other forms of compressed cultural knowledge.