MultiAiTutor: Child-Friendly Educational Multilingual Speech Generation Tutor with LLMs

📅 2025-08-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Low-quality speech synthesis and poor cultural adaptation hinder multilingual children's education in low-resource language settings. Method: We propose a child-centered multilingual text-to-speech (TTS) framework that integrates large language models (LLMs) with multilingual TTS and supports under-resourced languages, including Singaporean-accented Mandarin, Malay, and Tamil. To enhance cultural relevance, a culturally aware image-description task guides content generation; the framework also incorporates age-appropriate linguistic modeling and a two-part evaluation protocol that combines objective metrics (e.g., WER) with subjective ratings (e.g., MOS) and child-in-the-loop feedback. Contribution/Results: Experiments show improvements over baseline methods in speech naturalness, cultural appropriateness, and child comprehension, suggesting the framework can support learning engagement and second-language acquisition in educational settings.
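
For readers unfamiliar with the WER metric named in the summary, the following is a minimal sketch of how a word or character error rate is commonly computed from an edit distance. It is not the paper's evaluation code: the function names `edit_distance` and `error_rate` are hypothetical, and scoring Mandarin at the character level is an assumption made here because Mandarin text is not space-delimited.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Edit distance divided by reference length.

    unit="word" splits on whitespace (WER); unit="char" compares
    non-space characters (CER). Both are illustrative choices.
    """
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    elif unit == "char":
        ref = [c for c in reference if not c.isspace()]
        hyp = [c for c in hypothesis if not c.isspace()]
    else:
        raise ValueError("unit must be 'word' or 'char'")
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Malay example: one of four reference words is missing.
    print(error_rate("saya suka makan nasi", "saya suka nasi"))
    # Mandarin example, scored per character.
    print(error_rate("我喜欢吃饭", "我喜欢吃面", unit="char"))
```

In a TTS evaluation, the hypothesis would typically come from running a speech recognizer on the synthesized audio, so a lower error rate indicates more intelligible speech.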

📝 Abstract

Generative speech models have shown significant potential for personalizing teacher-student interactions, with valuable real-world applications in children's language learning. However, high-quality, child-friendly speech generation remains challenging, particularly for low-resource languages across diverse cultural contexts. In this paper, we propose MultiAiTutor, an educational multilingual generative AI tutor with child-friendly designs that leverages an LLM architecture for speech generation tailored to educational purposes. MultiAiTutor integrates age-appropriate multilingual speech generation and facilitates young children's language learning through culturally relevant image-description tasks in three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil. Results from both objective metrics and subjective evaluations show that MultiAiTutor outperforms baseline methods.

Problem

Research questions and friction points this paper is trying to address.

Child-friendly multilingual speech generation for education

High-quality speech synthesis for low-resource languages

Culturally relevant language learning for young children

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multilingual child-friendly speech generation using LLMs

Age-appropriate educational content in low-resource languages

Culturally relevant image-description tasks for learning
