🤖 AI Summary
This study addresses the challenge of automated fluency assessment for children's speech in low-resource languages (Tamil and Malay). We propose a lightweight end-to-end framework: (1) robust speech-to-text transcription using a fine-tuned multilingual ASR model; (2) extraction of objective acoustic metrics, including phonetic and word error rates, speaking rate, and speech-pause duration ratio; and (3) a GPT-based classifier, guided by a small set of human-evaluated examples, that achieves high accuracy with minimal labeled data. To our knowledge, this is the first work to combine large language models with interpretable acoustic metrics for low-resource pediatric speech assessment. Experiments on Tamil and Malay child-speech corpora yield a weighted F1-score of 86.5%, significantly outperforming ChatGPT-4o (+12.3%) and conventional machine-learning baselines (+9.7%). The approach alleviates the dual bottlenecks of scarce annotations and limited model generalizability in low-resource settings.
📝 Abstract
Assessment of children's speaking fluency in education is well researched for majority languages, but remains highly challenging for low-resource languages. This paper proposes a system that automatically assesses fluency by combining a fine-tuned multilingual ASR model, an objective-metrics extraction stage, and a generative pre-trained transformer (GPT) network. The objective metrics include phonetic and word error rates, speech rate, and speech-pause duration ratio. These are interpreted by a GPT-based classifier, guided by a small set of human-evaluated ground-truth examples, to score fluency. We evaluate the proposed system on a dataset of children's speech in two low-resource languages, Tamil and Malay, and compare the classification performance against Random Forest and XGBoost, as well as against ChatGPT-4o predicting fluency directly from speech input. Results demonstrate that the proposed approach achieves significantly higher accuracy than the multimodal GPT and the other baselines.
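As an illustration only (not the authors' code), the two timing-based metrics named above, speech rate and speech-pause duration ratio, could be derived from word-level ASR timestamps. The `WordSegment` structure and the pause-to-speech definition of the ratio are assumptions for this sketch; the paper may define the ratio differently.

```python
from dataclasses import dataclass

@dataclass
class WordSegment:
    """One recognized word with ASR-provided start/end times in seconds."""
    word: str
    start: float
    end: float

def fluency_metrics(segments: list[WordSegment]) -> dict[str, float]:
    """Compute speaking rate (words per minute) and speech-pause duration ratio.

    The ratio is taken here as total pause time divided by total speech time,
    an assumed definition for illustration.
    """
    if not segments:
        return {"speaking_rate_wpm": 0.0, "pause_ratio": 0.0}
    total = segments[-1].end - segments[0].start        # elapsed utterance time
    speech = sum(s.end - s.start for s in segments)     # time spent speaking
    pause = max(total - speech, 0.0)                    # remainder is pause
    return {
        "speaking_rate_wpm": 60.0 * len(segments) / total if total > 0 else 0.0,
        "pause_ratio": pause / speech if speech > 0 else 0.0,
    }

segs = [WordSegment("hello", 0.0, 0.4), WordSegment("world", 0.9, 1.3)]
metrics = fluency_metrics(segs)  # 2 words over 1.3 s, 0.5 s of pause
```

In a pipeline like the one described, such a dictionary of interpretable numbers, together with error rates, would form the prompt context handed to the GPT-based classifier.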