A Severity-Based Curriculum Learning Strategy for Arabic Medical Text Generation

📅 2026-04-07

🤖 AI Summary

This study addresses a limitation in existing Arabic medical text generation methods: they overlook variations in symptom severity and consequently struggle with high-risk, complex cases. To bridge this gap, the work introduces clinical severity as a new dimension of the task and proposes a curriculum learning framework grounded in a three-tier severity annotation scheme (mild, moderate, and critical). Severity labels are assigned with a rule-based method, and a staged fine-tuning strategy progressively moves the model from less severe to more critical cases. Evaluated on a subset of the MAQA dataset, the proposed method improves on baseline models by roughly 4% to 7% and on conventional fine-tuning by roughly 3% to 6%, with consistent gains across all tested models.
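
To make the rule-based labeling step concrete, here is a minimal sketch of how a keyword-driven severity labeler could look. The summary does not give the paper's actual rules, so the keyword lists, the `SEVERITY_KEYWORDS` table, and the `label_severity` function are all hypothetical and serve only as illustration.

```python
from typing import Dict, List

# Hypothetical keyword lists (assumption): the actual rules used in the
# paper are not described in this summary.
SEVERITY_KEYWORDS: Dict[str, List[str]] = {
    "critical": ["نزيف", "فقدان الوعي", "ألم في الصدر"],  # bleeding, loss of consciousness, chest pain
    "moderate": ["حمى", "قيء", "دوخة"],                    # fever, vomiting, dizziness
}


def label_severity(question: str) -> str:
    """Return 'critical', 'moderate', or 'mild' for an Arabic question.

    The most severe level is checked first, so a question that mentions
    both a moderate and a critical keyword is labeled 'critical'.
    Questions matching no keyword fall back to 'mild'.
    """
    for level in ("critical", "moderate"):
        if any(keyword in question for keyword in SEVERITY_KEYWORDS[level]):
            return level
    return "mild"
```

Checking the most severe tier first is a design choice of this sketch: it errs toward the higher label when a question matches several tiers.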

📝 Abstract

Arabic medical text generation is increasingly needed to help users interpret symptoms and access general health guidance in their native language. Nevertheless, many existing methods assume uniform importance across training samples, overlooking differences in clinical severity. This simplification can hinder the model's ability to properly capture complex or high-risk cases. To overcome this issue, this work introduces a Severity-based Curriculum Learning Strategy for Arabic Medical Text Generation, where the training process is structured to move gradually from less severe to more critical medical conditions. The approach divides the dataset into ordered stages based on severity and incrementally exposes the model to more challenging cases during fine-tuning, allowing it to first learn basic medical patterns before addressing more complex scenarios. The proposed method is evaluated on a subset of the Medical Arabic Question Answering (MAQA) dataset, which includes Arabic medical questions describing symptoms alongside corresponding responses. In addition, the dataset is annotated with three severity levels (Mild, Moderate, and Critical) using a rule-based method developed in this study. The results demonstrate that incorporating severity-aware curriculum learning leads to consistent performance improvements across all tested models, with gains of around +4% to +7% over baseline models and +3% to +6% compared with conventional fine-tuning approaches.
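
The staged schedule described in the abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' implementation: the `build_stages` helper, the `severity` field name, and the `cumulative` option are hypothetical, and the abstract does not say whether earlier stages are replayed when harder cases are introduced.

```python
from typing import Dict, List

# Curriculum order taken from the abstract: less severe to more critical.
SEVERITY_ORDER = ("mild", "moderate", "critical")


def build_stages(samples: List[Dict[str, str]], cumulative: bool = True) -> List[List[Dict[str, str]]]:
    """Split labeled samples into ordered curriculum stages.

    Each sample is a dict with a 'severity' key (hypothetical field name)
    whose value is one of SEVERITY_ORDER; any other value raises KeyError.
    With cumulative=True (an assumption of this sketch), each stage also
    keeps the samples from all earlier stages.
    """
    buckets: Dict[str, List[Dict[str, str]]] = {level: [] for level in SEVERITY_ORDER}
    for sample in samples:
        buckets[sample["severity"]].append(sample)

    stages: List[List[Dict[str, str]]] = []
    seen: List[Dict[str, str]] = []
    for level in SEVERITY_ORDER:
        if cumulative:
            seen = seen + buckets[level]
            stages.append(list(seen))
        else:
            stages.append(list(buckets[level]))
    return stages


if __name__ == "__main__":
    data = [
        {"question": "q1", "severity": "critical"},
        {"question": "q2", "severity": "mild"},
        {"question": "q3", "severity": "moderate"},
    ]
    for index, stage in enumerate(build_stages(data), start=1):
        # A fine-tuning pass over `stage` would go here.
        print(f"stage {index}: {len(stage)} samples")
```

Replaying earlier stages is one common way to limit forgetting of the simpler cases; passing `cumulative=False` gives strictly disjoint stages instead.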

Problem

Research questions and friction points this paper is trying to address.

Arabic medical text generation

clinical severity

curriculum learning

severity-aware training

medical NLP

Innovation

Methods, ideas, or system contributions that make the work stand out.

Severity-based Curriculum Learning

Arabic Medical Text Generation

Clinical Severity Annotation

Progressive Fine-tuning

MAQA Dataset
