Automatic Classification of Arabic Literature into Historical Eras

📅 2026-01-22
🤖 AI Summary
This study addresses the challenge of automatically dating Arabic texts across historical periods, a task that has received little attention outside poetry. It applies neural networks and deep learning techniques to classify Arabic texts into historical eras, with class setups ranging from binary to 15-class periodization. Experiments on datasets derived from two publicly available corpora, OpenITI and APCD, yield F1 scores of 0.83 and 0.79, respectively, on the binary task. Scores fall as the number of classes grows, reaching 0.20 on the 15-era task (OpenITI) and 0.18 on the 12-era task (APCD); these figures provide baselines for finer-grained historical periodization.

📝 Abstract
The Arabic language has undergone notable transformations over time, including the emergence of new vocabulary, the obsolescence of older words, and shifts in word usage. This evolution is evident in the distinction between the classical and modern Arabic eras. Although historians and linguists have partitioned Arabic literature into multiple eras, relatively little research has explored the automatic classification of Arabic texts by time period, particularly beyond the domain of poetry. This paper addresses this gap by employing neural networks and deep learning techniques to automatically classify Arabic texts into distinct eras and periods. The proposed models are evaluated using two datasets derived from two publicly available corpora, covering texts from the pre-Islamic to the modern era. The study examines class setups ranging from binary to 15-class classification and considers both predefined historical eras and custom periodizations. Results range from F1-scores of 0.83 and 0.79 on the binary-era classification task using the OpenITI and APCD datasets, respectively, to 0.20 on the 15-era classification task using OpenITI and 0.18 on the 12-era classification task using APCD.
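
To make the class setups and the reported metric concrete, here is a minimal sketch of how a text's date could be mapped to an era label under different periodizations, and how an F1 score could be averaged over classes. The boundary years, the names `era_label` and `macro_f1`, and the choice of macro averaging are all assumptions for illustration; the abstract does not state the paper's actual era boundaries or which F1 averaging it uses.

```python
from bisect import bisect_right

# Hypothetical period boundaries (years CE), for illustration only.
# They are NOT the paper's periodization.
BINARY_BOUNDARIES = [1800]              # 2 classes: before / from 1800
FOUR_WAY_BOUNDARIES = [622, 1258, 1800]  # 4 classes


def era_label(year, boundaries):
    """Return the index of the period that contains `year`.

    With k sorted boundaries there are k + 1 periods; a year equal to a
    boundary falls into the later period.
    """
    return bisect_right(boundaries, year)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); define it as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    years = [550, 900, 1300, 1950]
    print([era_label(y, BINARY_BOUNDARIES) for y in years])
    print([era_label(y, FOUR_WAY_BOUNDARIES) for y in years])
    print(macro_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

The same dated texts yield different label sets depending on the boundary list, which is one way to read "custom periodizations"; finer boundaries mean more classes and fewer examples per class.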
Problem

Research questions and friction points this paper is trying to address.

Arabic literature
historical eras
automatic classification
text classification
temporal classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Arabic historical text classification
neural networks
deep learning
temporal text classification
diachronic NLP
Zainab Alhathloul
Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran 31261, Saudi Arabia
Irfan Ahmad
King Fahd University of Petroleum and Minerals
Pattern Recognition, Natural Language Processing, Machine Learning, Document Analysis