🤖 AI Summary
This work addresses the time-consuming manual composition and inconsistent interpretation of chest X-ray reports by proposing RL-ACRGNet, an encoder-decoder model trained within an off-policy reinforcement learning framework. The architecture pairs a pretrained DenseNet visual encoder with a multilevel LSTM language decoder, and uses a dual-network structure with a metric-driven reward mechanism to refine visual-semantic embeddings. The approach targets the fine-grained accuracy and clinical coherence of generated reports. On the IU-Xray dataset, it reports improvements over state-of-the-art baselines in BLEU-4 (0.47%), METEOR (0.17%), and ROUGE-L (0.518), and evaluations on the large-scale MIMIC-CXR dataset indicate that the model generalizes well.
📝 Abstract
Medical imaging interpretation is a foundational pillar of modern clinical diagnostics, yet the manual generation of radiology reports remains a time-consuming process prone to interpretation inconsistencies. Within the field of medical AI, automating these descriptions through deep learning promises to streamline clinical workflows and standardise diagnostic output. However, accurate disease detection and precise report generation remain significant challenges due to limitations in capturing fine-grained visual features and ensuring clinical coherence. To address these issues, we propose RL-ACRGNet, an improved encoder-decoder model that integrates a pre-trained DenseNet encoder with a multilevel LSTM decoder within an off-policy reinforcement learning framework. The model uses a dual-network approach to refine visual-semantic embeddings through a metric-based reward mechanism. We demonstrate that RL-ACRGNet consistently outperforms state-of-the-art baselines on the IU-Xray dataset, achieving quantitative improvements in BLEU-4 (0.47%), METEOR (0.17%) and ROUGE-L (0.518). Furthermore, comprehensive evaluations on the large-scale MIMIC-CXR dataset confirm the robust generalisation of the model and its ability to generate high-quality, clinically relevant reports.
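
To make the idea of a metric-based reward more concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the reward is a simplified ROUGE-L F1 score (longest common subsequence over whitespace tokens) and that a sampled report is scored relative to a baseline report. The abstract does not specify the reward metric, the role of the two networks, or the off-policy update, so the function names `rouge_l_f1` and `metric_reward` and the baseline comparison are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """Simplified ROUGE-L: F1 of LCS-based precision and recall.

    Assumption: lowercase whitespace tokenisation and equal weighting of
    precision and recall; published ROUGE-L variants may differ.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def metric_reward(sampled, baseline, reference):
    """Hypothetical reward: how much a sampled report beats a baseline report.

    A positive value means the sampled report is closer to the reference
    than the baseline report under the chosen metric.
    """
    return rouge_l_f1(sampled, reference) - rouge_l_f1(baseline, reference)


if __name__ == "__main__":
    reference = "the lungs are clear"
    print(metric_reward("the lungs are clear", "the heart is normal", reference))
```

In a reinforcement learning setup of this kind, a scalar such as `metric_reward` would typically weight the decoder's update for the sampled report; how RL-ACRGNet combines it with its dual-network structure is not described in the abstract.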