🤖 AI Summary
This work addresses the difficulty large language models (LLMs) have in understanding and classifying social norms in Iranian society, a difficulty that stems from cultural specificity and the linguistic complexity of Persian. To this end, we introduce PSN (Persian Social Norms), the first Persian-language social norm dataset of its kind, comprising over 1,700 normative statements annotated with environments, situational contexts, and cultural labels, and accompanied by English translations. Methodologically, we propose a pipeline that combines LLM-assisted generation, culturally informed prompt engineering, native-speaker validation, bilingual alignment, and structured semantic annotation, with the aim of ensuring scalability, cultural authenticity, and ethical compliance. As a publicly available, reusable resource, PSN helps fill a gap in culturally aware AI research and provides a foundation for evaluating, training, and modeling normative adaptation in cross-cultural AI systems.
📝 Abstract
Datasets capturing cultural norms are essential for developing globally aware AI systems. We present Persian Social Norms (PSN), a novel dataset of over 1.7k Persian social norms annotated with environments, contexts, and cultural labels, alongside English translations. Using large language models and prompt-engineering techniques, we generated candidate norms that were then reviewed by native speakers for quality and ethical compliance. As the first Persian dataset of its kind, this resource enables computational modeling of norm adaptation, a crucial challenge for building cross-cultural AI that is informed by diverse cultural perspectives.