🤖 AI Summary
This work addresses the lack of culturally and acoustically nuanced evaluation benchmarks for audio language models in Persian, particularly for classical poetic meter (vazn), traditional music, and code-switching. To bridge this gap, we introduce the first audio-language benchmark for Persian, comprising 16 tasks (10 of them newly proposed) and over 8,000 high-quality human-annotated samples, with a focus on spoken language understanding, paralinguistic analysis, and cultural context modeling. Experimental results show that current models perform near chance level on culturally grounded, prosody-dependent tasks such as vazn detection, and that text-only baselines consistently outperform their audio counterparts, indicating that models make insufficient use of acoustic information. These findings position the benchmark as a testbed for developing culturally aware audio language models.
📝 Abstract
Persian poses unique audio understanding challenges through its classical poetry, traditional music, and pervasive code-switching, none of which are captured by existing benchmarks. We introduce PARSA-Bench (Persian Audio Reasoning and Speech Assessment Benchmark), the first benchmark for evaluating large audio-language models on Persian language and culture, comprising 16 tasks and over 8,000 samples across speech understanding, paralinguistic analysis, and cultural audio understanding. Ten tasks are newly introduced, including poetry meter and style detection, traditional Persian music understanding, and code-switching detection. Text-only baselines consistently outperform their audio counterparts, suggesting that models may not leverage audio-specific information beyond what transcription alone provides. Culturally grounded tasks expose a qualitatively distinct failure mode: all models perform near random chance on vazn detection regardless of scale, suggesting that prosodic perception remains beyond the reach of current models. The dataset is publicly available at https://huggingface.co/datasets/MohammadJRanjbar/PARSA-Bench
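
The claim that models perform "near random chance" can be made concrete with a simple check. The sketch below is illustrative only and is not the evaluation protocol of the benchmark. It assumes a balanced multiple-choice task with a hypothetical number of classes and uses a one-sided exact binomial test to ask whether an observed accuracy is distinguishable from guessing. The function names, the class count, and the sample counts are all assumptions introduced for illustration.

```python
from math import comb


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing on a balanced task."""
    if num_classes < 2:
        raise ValueError("a classification task needs at least 2 classes")
    return 1.0 / num_classes


def binomial_tail(correct: int, total: int, p0: float) -> float:
    """P(X >= correct) for X ~ Binomial(total, p0), computed exactly."""
    return sum(
        comb(total, k) * p0**k * (1.0 - p0) ** (total - k)
        for k in range(correct, total + 1)
    )


def above_chance(correct: int, total: int, num_classes: int,
                 alpha: float = 0.05) -> bool:
    """True if the observed score is significantly better than guessing."""
    p_value = binomial_tail(correct, total, chance_accuracy(num_classes))
    return p_value < alpha


if __name__ == "__main__":
    # Hypothetical numbers, not results from the paper:
    # a 4-way task with 100 samples.
    print(above_chance(correct=27, total=100, num_classes=4))
    print(above_chance(correct=60, total=100, num_classes=4))
```

Under these assumptions, a model scoring only slightly above 1/k on a k-way task would not be flagged as better than guessing, which is the sense in which an accuracy can be described as near chance.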