RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines

📅 2025-10-01
🤖 AI Summary
Problem: The education domain suffers from a severe scarcity of large-scale, publicly available classroom speech data. Existing datasets are small and not openly released, and there are no dedicated classroom noise or room impulse response (RIR) corpora, which hinders robust speech model training and effective data augmentation. Method: We propose the first game-engine-based classroom acoustic scene simulation framework, leveraging Unity's real-time rendering to synthesize high-fidelity, configurable RIRs and background noise. Integrating children's speech, instructional video audio, and public corpora, we construct RealClass, a synthetic classroom speech dataset enabling end-to-end controllable speech synthesis and diverse noisy sample generation. Contribution/Results: Experiments demonstrate that RealClass closely matches real classroom speech in both acoustic characteristics and ASR performance. When used for pretraining or fine-tuning, it significantly improves generalization of educational speech models under both clean and noisy conditions.

📝 Abstract
The scarcity of large-scale classroom speech data has hindered the development of AI-driven speech models for education. Classroom datasets remain limited and not publicly available, and the absence of dedicated classroom noise or Room Impulse Response (RIR) corpora prevents the use of standard data augmentation techniques. In this paper, we introduce a scalable methodology for synthesizing classroom noise and RIRs using game engines, a versatile framework that can extend to other domains beyond the classroom. Building on this methodology, we present RealClass, a dataset that combines a synthesized classroom noise corpus with a classroom speech dataset compiled from publicly available corpora. The speech data pairs a children's speech corpus with instructional speech extracted from YouTube videos to approximate real classroom interactions in clean conditions. Experiments on clean and noisy speech show that RealClass closely approximates real classroom speech, making it a valuable asset in the absence of abundant real classroom speech.
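
The "standard data augmentation techniques" mentioned in the abstract usually mean convolving clean speech with an RIR and then adding background noise at a chosen signal-to-noise ratio. The following is a minimal sketch of that generic recipe in Python with NumPy, not the authors' pipeline. The function names (`apply_rir`, `mix_at_snr`, `measured_snr_db`) are hypothetical, and the sine wave, decaying-noise RIR, and white noise are synthetic stand-ins for real speech, a simulated classroom RIR, and classroom noise.

```python
import numpy as np


def apply_rir(speech: np.ndarray, rir: np.ndarray) -> np.ndarray:
    """Convolve dry speech with a room impulse response, trimmed to the input length."""
    wet = np.convolve(speech, rir, mode="full")
    return wet[: len(speech)]


def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Add noise scaled so the speech-to-noise power ratio equals snr_db.

    Assumes both signals have non-zero power.
    """
    # Loop the noise if it is shorter than the speech, then crop to match.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    scale = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return speech + scale * noise


def measured_snr_db(speech: np.ndarray, noisy: np.ndarray) -> float:
    """Recover the SNR in dB from a signal and its noisy version."""
    residual = noisy - speech
    return float(10 * np.log10(np.mean(speech ** 2) / np.mean(residual ** 2)))


# Synthetic stand-ins (assumptions for illustration only).
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 220.0 * t)          # placeholder for dry speech
decay = np.exp(-np.arange(2_000) / 400.0)
toy_rir = rng.standard_normal(2_000) * decay         # placeholder for a classroom RIR
toy_noise = rng.standard_normal(sample_rate // 2)    # placeholder noise, shorter than the speech

reverberant = apply_rir(clean, toy_rir)
noisy = mix_at_snr(reverberant, toy_noise, snr_db=10.0)
```

In practice the placeholders would be replaced by recorded speech, an RIR, and a noise clip at a shared sample rate. The SNR here is computed against the reverberant signal, which is one common convention; other setups define it against the dry speech.
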
Problem

Research questions and friction points this paper is trying to address.

Addresses classroom speech data scarcity for AI education models
Synthesizes classroom noise and acoustic responses using game engines
Creates realistic classroom speech dataset from public resources
Innovation

Methods, ideas, or system contributions that make the work stand out.

Synthesizing classroom noise using game engines
Creating Room Impulse Responses with game engines
Combining children's speech with instructional YouTube audio
Ahmed Adel Attia
University of Maryland
Jing Liu
University of Maryland, College of Education
Carol Espy-Wilson
University of Maryland, Department of Electrical and Computer Engineering