🤖 AI Summary
To address degraded ASR performance in Greek medical dictation—caused by domain-specific terminology, high phonetic variability, and nonstandard orthography—this paper proposes an acoustic-language co-optimized end-to-end speech recognition framework. Methodologically: (1) a Greek medical-domain acoustic model is built and fine-tuned on annotated medical speech data; (2) a lightweight text correction module is integrated with the ASR decoder and jointly optimized to explicitly model lexical variants of medical terms and clinical expression conventions. Experiments on real-world clinical dictation data show a 32.7% relative reduction in word error rate (WER) over generic models, with 94.1% accuracy for medical term recognition. This work constitutes the first systematic effort toward domain-adaptive modeling and joint correction optimization for Greek medical ASR, significantly improving transcription accuracy and clinical utility. It establishes a reusable technical pathway for low-resource medical language technologies.
📝 Abstract
Medical dictation systems are essential tools in modern healthcare, enabling accurate and efficient conversion of speech into written medical documentation. The main objective of this paper is to create a domain-specific system for Greek medical speech transcriptions. The ultimate goal is to assist healthcare professionals by reducing the overload of manual documentation and improving workflow efficiency. Towards this goal, we develop a system that combines automatic speech recognition techniques with text correction model, allowing better handling of domain-specific terminology and linguistic variations in Greek. Our approach leverages both acoustic and textual modeling to create more realistic and reliable transcriptions. We focused on adapting existing language and speech technologies to the Greek medical context, addressing challenges such as complex medical terminology and linguistic inconsistencies. Through domain-specific fine-tuning, our system achieves more accurate and coherent transcriptions, contributing to the development of practical language technologies for the Greek healthcare sector.