🤖 AI Summary
Arabic recitation meter (‘Aroud’) identification suffers from a severe low-resource bottleneck, particularly due to the scarcity of annotated audio data for recited poetry. To address this, we propose the first end-to-end cross-modal framework for the task, integrating automatic speech recognition (ASR), rule-enhanced textual prosodic parsing, BERT-based sequence classification, and phoneme–prosody alignment. Crucially, we introduce a high-resource transfer paradigm and construct the first publicly available benchmark dataset for Arabic poetic prosody—directly mitigating the longstanding scarcity of prosodic annotations in Arabic. Evaluated on our curated test set, our model achieves 89.7% accuracy, outperforming the strongest baseline by 23.5 percentage points and significantly surpassing both unimodal approaches and traditional linguistic methods.
📝 Abstract
Arabic poetry is an integral part of Arabic language and culture. The Arabs have used it to shed light on their major events, such as depicting brutal battles and conflicts. As in many other languages, they have also used it for purposes such as romance, pride, and lamentation. Arabic poetry has received major attention from linguists over the decades. One of its main characteristics is its special rhythmic structure, as opposed to prose. This structure is referred to as a meter. Meters, along with other poetic characteristics, are studied intensively in an Arabic linguistic field called "Aroud". Identifying the meter of a verse is a lengthy and complicated process that requires technical knowledge of Aroud. Recited poetry adds an extra layer of processing. Developing systems for the automatic identification of the meters of recited poems requires large amounts of labelled data. In this study, we propose a state-of-the-art framework to identify the meters of recited Arabic poetry, in which we integrate two separate high-resource systems to perform the low-resource task. To ensure generalization of our proposed architecture, we publish a benchmark for this task for future research.
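The idea of chaining two separate systems can be illustrated with a minimal sketch. It assumes the two systems are a speech recognizer and a text-based meter classifier, which the abstract itself does not name. Every identifier below (`RecitedMeterPipeline`, `transcribe`, `classify_meter`, the toy stand-ins, and the placeholder labels `meter_A` and `meter_B`) is hypothetical and is not taken from the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical interfaces: the abstract does not specify either component.
Transcriber = Callable[[bytes], str]      # recited audio -> verse text
MeterClassifier = Callable[[str], str]    # verse text -> meter label


@dataclass
class RecitedMeterPipeline:
    """Cascade of two independently built systems (an assumed design)."""
    transcribe: Transcriber
    classify_meter: MeterClassifier

    def predict(self, audio: bytes) -> Dict[str, str]:
        # Stage 1: turn the recitation into text.
        verse = self.transcribe(audio)
        # Stage 2: identify the meter from the transcribed verse.
        meter = self.classify_meter(verse)
        return {"verse": verse, "meter": meter}


# Toy stand-ins so the sketch runs without any trained model.
def fake_asr(audio: bytes) -> str:
    # Pretends the "audio" bytes are already UTF-8 text.
    return audio.decode("utf-8")


def fake_classifier(verse: str) -> str:
    # Placeholder rule with placeholder labels, not a real prosodic analysis.
    return "meter_A" if len(verse.split()) > 4 else "meter_B"


pipeline = RecitedMeterPipeline(fake_asr, fake_classifier)
result = pipeline.predict("قفا نبك من ذكرى حبيب ومنزل".encode("utf-8"))
```

Because each stage is passed in as a plain callable, either one could be replaced by a real model without changing the pipeline. The trade-off of such a cascade is that transcription errors from the first stage propagate to the second.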