Automated evaluation of children's speech fluency for low-resource languages

📅 2025-05-26
🤖 AI Summary
This study addresses the challenge of automated fluency assessment for children’s speech in low-resource languages (Tamil and Malay). We propose an end-to-end lightweight framework: (1) robust speech-to-text transcription using fine-tuned multilingual ASR models (mBART/Whisper); (2) extraction of objective metrics, including phoneme error rate, speaking rate, and pause ratio; and (3) lightly fine-tuned GPT-based classifiers (GPT-3.5/4) that achieve high accuracy with minimal labeled data. To our knowledge, this is the first work to combine large language models with interpretable objective metrics for low-resource pediatric speech assessment. Experiments on Tamil and Malay child speech corpora yield a weighted F1-score of 86.5%, significantly outperforming ChatGPT-4o (+12.3%) and conventional machine learning baselines (+9.7%). The approach alleviates the dual bottlenecks of scarce annotations and limited model generalizability in low-resource settings.

📝 Abstract
Assessment of children's speaking fluency in education is well researched for majority languages, but remains highly challenging for low-resource languages. This paper proposes a system to automatically assess fluency by combining a fine-tuned multilingual ASR model, an objective metrics extraction stage, and a generative pre-trained transformer (GPT) network. The objective metrics include phonetic and word error rates, speech rate, and speech-pause duration ratio. To score fluency, these metrics are interpreted by a GPT-based classifier guided by a small set of human-evaluated ground truth examples. We evaluate the proposed system on a dataset of children's speech in two low-resource languages, Tamil and Malay, and compare its classification performance against Random Forest and XGBoost, as well as against ChatGPT-4o predicting fluency directly from speech input. Results demonstrate that the proposed approach achieves significantly higher accuracy than multimodal GPT or other methods.
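The objective metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of words per second for speech rate, and the definition of the speech-pause ratio as speech time divided by pause time are all assumptions made here for illustration. The same error-rate function gives a word error rate when passed word lists and a phonetic error rate when passed phoneme lists.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalised by reference length (WER or PER)."""
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def speech_rate(num_words: int, duration_s: float) -> float:
    """Words per second over the whole recording (assumed unit)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return num_words / duration_s


def speech_pause_ratio(speech_segments: List[Tuple[float, float]],
                       total_duration_s: float) -> float:
    """Speech time divided by pause time (assumed definition).

    speech_segments holds (start, end) times in seconds, for example
    from a voice activity detector or ASR word timestamps.
    """
    speech = sum(end - start for start, end in speech_segments)
    pause = total_duration_s - speech
    return float("inf") if pause <= 0 else speech / pause


reference = "the cat sat on the mat".split()
hypothesis = "the cat sit on mat".split()
wer = error_rate(reference, hypothesis)  # one substitution, one deletion
ratio = speech_pause_ratio([(0.0, 1.0), (1.5, 3.0)], total_duration_s=4.0)
```

In this sketch the reference would be the prompt text the child was asked to read or say, and the hypothesis would be the ASR transcript.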
Problem

Research questions and friction points this paper is trying to address.

Automated fluency assessment for low-resource children's speech
Combining ASR, objective metrics, and GPT for multilingual evaluation
Improving accuracy over traditional methods in Tamil and Malay
Innovation

Methods, ideas, or system contributions that make the work stand out.

Fine-tuned multilingual ASR model
Objective metrics extraction stage
GPT-based classifier for fluency
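
One way to picture a classifier "guided by a small set of human-evaluated ground truth examples" is a few-shot text prompt built from the objective metrics. The sketch below only assembles such a prompt and makes no model call. The `Metrics` fields, the label set (`low`, `medium`, `high`), and the prompt wording are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Metrics:
    """Objective metrics for one utterance (field names are hypothetical)."""
    per: float                 # phonetic error rate
    wer: float                 # word error rate
    speech_rate: float         # words per second
    speech_pause_ratio: float  # speech time / pause time


def format_metrics(m: Metrics) -> str:
    return (f"PER={m.per:.2f}, WER={m.wer:.2f}, "
            f"speech_rate={m.speech_rate:.2f} words/s, "
            f"speech_pause_ratio={m.speech_pause_ratio:.2f}")


def build_prompt(examples: List[Tuple[Metrics, str]],
                 query: Metrics,
                 labels: Sequence[str] = ("low", "medium", "high")) -> str:
    """Assemble a few-shot prompt from human-rated examples plus one query."""
    for _, label in examples:
        if label not in labels:
            raise ValueError(f"unknown label: {label}")
    lines = ["Rate the child's speaking fluency as one of: "
             + ", ".join(labels) + ".", ""]
    for metrics, label in examples:
        lines.append("Metrics: " + format_metrics(metrics))
        lines.append("Fluency: " + label)
        lines.append("")
    # The query ends with an open "Fluency:" slot for the model to complete.
    lines.append("Metrics: " + format_metrics(query))
    lines.append("Fluency:")
    return "\n".join(lines)


rated = [
    (Metrics(0.05, 0.08, 2.40, 3.00), "high"),
    (Metrics(0.40, 0.55, 0.90, 0.70), "low"),
]
prompt = build_prompt(rated, Metrics(0.20, 0.30, 1.50, 1.40))
```

The resulting string would then be sent to whichever GPT model is used; that step is omitted because the paper's exact model configuration is not given here.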
Bowen Zhang
ICT Cluster, Singapore Institute of Technology, Singapore; College of Computing & Data Science, Nanyang Technological University, Singapore
Nur Afiqah Abdul Latiff
ICT Cluster, Singapore Institute of Technology, Singapore
Justin Kan
ICT Cluster, Singapore Institute of Technology, Singapore
Rong Tong
Virginia Tech
Cancer, Biomaterials, Drug Delivery, Polymer, Nanomedicine
D. Soh
ICT Cluster, Singapore Institute of Technology, Singapore
Xiaoxiao Miao
Duke Kunshan University
Speech Privacy, Speaker and Language Identification, Speech Synthesis
Ian Mcloughlin
ICT Cluster, Singapore Institute of Technology, Singapore