Severity-Aware Curriculum Learning with Multi-Model Response Selection for Medical Text Generation

📅 2026-06-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty current large language models have in generating consistent, contextually appropriate responses to medical inquiries of varying severity. The authors propose a framework that integrates severity-aware curriculum learning with multi-model response selection. Each model follows a three-stage progressive training regimen, covering mild, then moderate, then critical cases, to improve its adaptability to clinical complexity. During inference, five independently trained models each generate a candidate response, and the system selects the most relevant candidate as the final output. On the MAQA dataset, the framework reports a BERTScore of 86.71% in the baseline setting and 90.30% after fine-tuning, outperforming the baseline and fine-tuned models it is compared against.
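The three-stage ordering described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `severity` field, the stage labels as strings, the `train_step` callback, and the choice to train each stage only on its own cases (rather than cumulatively) are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

# Assumed stage order, taken from the summary: mild -> moderate -> critical.
STAGES: Tuple[str, ...] = ("mild", "moderate", "critical")

Example = Dict[str, str]


def curriculum_stages(
    examples: Iterable[Example], stages: Tuple[str, ...] = STAGES
) -> Iterator[Tuple[str, List[Example]]]:
    """Group examples by their (hypothetical) 'severity' label and yield
    one (stage, examples) pair per stage, in curriculum order."""
    buckets: Dict[str, List[Example]] = defaultdict(list)
    for ex in examples:
        buckets[ex["severity"]].append(ex)
    for stage in stages:
        yield stage, buckets.get(stage, [])


def run_curriculum(
    model: Any,
    examples: Iterable[Example],
    train_step: Callable[[Any, List[Example]], Any],
) -> Any:
    """Train sequentially, one stage at a time. `train_step` is a
    placeholder for whatever fine-tuning routine is actually used."""
    for _stage, stage_examples in curriculum_stages(examples):
        model = train_step(model, stage_examples)
    return model


if __name__ == "__main__":
    data = [
        {"question": "crushing chest pain", "severity": "critical"},
        {"question": "mild headache", "severity": "mild"},
        {"question": "persistent fever", "severity": "moderate"},
    ]
    # Toy "model": a list recording the order in which questions were seen.
    seen = run_curriculum(
        [], data, lambda m, batch: m + [ex["question"] for ex in batch]
    )
    print(seen)
```

In the framework described here, the same schedule would be applied independently to each of the five models.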

📝 Abstract

Telehealth systems have become increasingly important for delivering accessible and timely medical information. Existing large language models often struggle to provide consistent and contextually appropriate medical responses across varying levels of case severity. This limitation highlights the need for models that can adapt to the progressive complexity of medical queries. To address this challenge, we introduce a severity-aware multi-model framework that integrates a curriculum training strategy with relevance-based response selection. The proposed framework employs a three-stage curriculum learning strategy, in which each model is trained sequentially on mild, moderate, and critical cases to progressively acquire domain knowledge. The approach uses five large language models, each independently trained under the same curriculum scheme. During inference, all models generate candidate responses, and the most appropriate response is selected as the final output. The framework is trained and evaluated on the MAQA dataset, which provides annotated medical question-answer pairs. Experimental results, evaluated using BERTScore, show that the proposed method outperforms both baseline and fine-tuned models, attaining 86.71% in the baseline setting and 90.30% after fine-tuning. These results highlight the effectiveness of combining curriculum learning with multi-model response selection in improving response quality and relevance in medical text generation.
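The inference-time selection step can be illustrated with a short sketch. The abstract does not say how relevance is scored, so the token-overlap (Jaccard) function below is only a stand-in chosen to keep the example self-contained; an embedding-based similarity could be swapped in through the `relevance` parameter. The names `token_overlap` and `select_response` are hypothetical.

```python
from typing import Callable, Sequence


def token_overlap(query: str, response: str) -> float:
    """Toy relevance score: Jaccard overlap of lowercase whitespace tokens.
    A stand-in for whatever relevance measure the framework actually uses."""
    q = set(query.lower().split())
    r = set(response.lower().split())
    if not q or not r:
        return 0.0
    return len(q & r) / len(q | r)


def select_response(
    query: str,
    candidates: Sequence[str],
    relevance: Callable[[str, str], float] = token_overlap,
) -> str:
    """Return the candidate with the highest relevance to the query.
    Ties go to the earliest candidate, since max() keeps the first maximum."""
    if not candidates:
        raise ValueError("at least one candidate response is required")
    return max(candidates, key=lambda c: relevance(query, c))


if __name__ == "__main__":
    query = "chest pain and shortness of breath"
    # One candidate per model; the framework described above uses five models.
    candidates = [
        "drink water",
        "chest pain with shortness of breath needs urgent care",
        "rest",
    ]
    print(select_response(query, candidates))
```

Passing the scorer as a parameter keeps the selection logic independent of how relevance is measured, which the source leaves unspecified.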

Problem

Research questions and friction points this paper is trying to address.

medical text generation

case severity

response consistency

contextual appropriateness

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

severity-aware

curriculum learning

multi-model response selection