🤖 AI Summary
Existing hospital management systems lack real-time, accurate clinical decision support—particularly for low-resource languages like Arabic—and exhibit limited capability in handling informal, multi-dialect patient–physician dialogues. Method: We introduce the first Arabic-focused medical text generation system: (1) constructing the first large-scale, multi-dialect Arabic patient–physician dialogue dataset sourced from social media platforms; and (2) systematically adapting Mistral-7B-Instruct-v0.2, LLaMA-2-7B, and GPT-2 Medium via data cleaning, dialect-aware preprocessing, and instruction fine-tuning. Contribution/Results: Fine-tuned Mistral-7B achieves state-of-the-art performance on diagnostic suggestion and medication recommendation tasks, attaining a BERTScore F1 of 68.5—outperforming all baselines in coherence and relevance. This work is the first to systematically enhance the practicality and robustness of large language models for medical text generation in low-resource linguistic settings.
📝 Abstract
Efficient hospital management systems (HMS) are critical worldwide to address challenges such as overcrowding, limited resources, and poor availability of urgent health care. Existing methods often lack the ability to provide accurate, real-time medical advice, particularly for irregular inputs and underrepresented languages. To overcome these limitations, this study proposes an approach that fine-tunes large language models (LLMs) for Arabic medical text generation. The system is designed to assist patients by providing accurate medical advice, diagnoses, drug recommendations, and treatment plans based on user input. The research methodology required the collection of a unique dataset from social media platforms, capturing real-world medical conversations between patients and doctors. The dataset, which includes patient complaints together with medical advice, was properly cleaned and preprocessed to account for multiple Arabic dialects. Fine-tuning state-of-the-art generative models, such as Mistral-7B-Instruct-v0.2, LLaMA-2-7B, and GPT-2 Medium, optimized the system's ability to generate reliable medical text. Results from evaluations indicate that the fine-tuned Mistral-7B model outperformed the other models, achieving average BERT (Bidirectional Encoder Representations from Transformers) Score values in precision, recall, and F1-scores of 68.5%, 69.08%, and 68.5%, respectively. Comparative benchmarking and qualitative assessments validate the system's ability to produce coherent and relevant medical replies to informal input. This study highlights the potential of generative artificial intelligence (AI) in advancing HMS, offering a scalable and adaptable solution for global healthcare challenges, especially in linguistically and culturally diverse environments.