An Effective Automated Speaking Assessment Approach to Mitigating Data Scarcity and Imbalanced Distribution

📅 2024-04-11
🏛️ NAACL-HLT
📈 Citations: 2
Influential: 0
🤖 AI Summary
Automated speaking assessment (ASA) faces three data-related challenges: scarcity of labeled data, an imbalanced distribution of learners' CEFR proficiency levels, and non-uniform score intervals between adjacent levels. To address these, the paper proposes a self-supervised embedding modeling framework that combines metric-based classification with loss reweighting. First, it introduces metric-based classification to ASA (reportedly the first application of this strategy in the task) to mitigate class imbalance among CEFR levels. Second, it designs a loss reweighting mechanism calibrated to the non-uniform score intervals between adjacent CEFR levels. Third, it leverages SSL models such as wav2vec 2.0 for speech representation extraction, jointly shaping the embedding-space geometry and the decision boundaries. Evaluated on the ICNALE benchmark, the method improves CEFR level prediction accuracy by more than 10%, outperforming strong state-of-the-art baselines by a sizable margin.
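The metric-based classification idea can be illustrated with a minimal nearest-prototype sketch over SSL embeddings. This is not the paper's exact formulation: the function name `nearest_prototype_predict`, the 2-D toy embeddings, and the choice of Euclidean distance are assumptions for illustration (real wav2vec 2.0 embeddings are hundreds of dimensions, and the paper may use a learned metric).

```python
import numpy as np

def nearest_prototype_predict(train_emb, train_labels, test_emb):
    """Classify each test embedding by its nearest class prototype,
    where a prototype is the mean training embedding of a CEFR level."""
    levels = sorted(set(train_labels))
    labels = np.array(train_labels)
    protos = np.stack([train_emb[labels == lv].mean(axis=0) for lv in levels])
    # Euclidean distance from every test point to every prototype;
    # predict the level whose prototype is closest.
    dists = np.linalg.norm(test_emb[:, None, :] - protos[None, :, :], axis=-1)
    return [levels[i] for i in dists.argmin(axis=1)]

# Toy 2-D "embeddings": two A2 speakers near the origin, two B1 far away.
train_emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
train_labels = ["A2", "A2", "B1", "B1"]
test_emb = np.array([[0.05, 0.0], [5.0, 5.05]])
print(nearest_prototype_predict(train_emb, train_labels, test_emb))
# → ['A2', 'B1']
```

Because prototypes are per-class averages, a rare level contributes one prototype just like a frequent level does, which is one intuition for why metric-based classification helps under class imbalance.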

📝 Abstract
Automated speaking assessment (ASA) typically involves automatic speech recognition (ASR) and hand-crafted feature extraction from the ASR transcript of a learner's speech. Recently, self-supervised learning (SSL) has shown stellar performance compared to traditional methods. However, SSL-based ASA systems are faced with at least three data-related challenges: limited annotated data, uneven distribution of learner proficiency levels and non-uniform score intervals between different CEFR proficiency levels. To address these challenges, we explore the use of two novel modeling strategies: metric-based classification and loss reweighting, leveraging distinct SSL-based embedding features. Extensive experimental results on the ICNALE benchmark dataset suggest that our approach can outperform existing strong baselines by a sizable margin, achieving a significant improvement of more than 10% in CEFR prediction accuracy.
Problem

Research questions and friction points this paper is trying to address.

Mitigating data scarcity in automated speaking assessment
Addressing imbalanced learner proficiency level distribution
Improving CEFR prediction accuracy using novel modeling strategies
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leveraging self-supervised learning embeddings
Applying metric-based classification strategy
Utilizing loss reweighting for imbalanced data
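The loss-reweighting idea above can be sketched by combining inverse class frequency (for imbalance) with the score gap to neighbouring levels (for non-uniform intervals). The paper's exact weighting scheme is not reproduced here; `reweighted_ce_weights`, the averaging of adjacent gaps, and the CEFR anchor scores in the example are illustrative assumptions.

```python
def reweighted_ce_weights(labels, level_scores):
    """Per-class loss weights: inverse class frequency scaled by the
    average score gap to adjacent CEFR levels (normalized to mean 1).
    `level_scores` maps each level to a numeric anchor score."""
    levels = sorted(level_scores, key=level_scores.get)
    counts = {lv: labels.count(lv) for lv in levels}
    # Inverse-frequency term: rarer levels get larger weights.
    inv_freq = {lv: len(labels) / (len(levels) * counts[lv]) for lv in levels}
    # Interval term: average score gap to the neighbouring level(s).
    gaps = {}
    for i, lv in enumerate(levels):
        neigh = []
        if i > 0:
            neigh.append(level_scores[lv] - level_scores[levels[i - 1]])
        if i < len(levels) - 1:
            neigh.append(level_scores[levels[i + 1]] - level_scores[lv])
        gaps[lv] = sum(neigh) / len(neigh)
    mean_gap = sum(gaps.values()) / len(gaps)
    return {lv: inv_freq[lv] * gaps[lv] / mean_gap for lv in levels}

# Skewed toy data: 6 A2, 3 B1, 1 B2, with wider intervals at the top.
labels = ["A2"] * 6 + ["B1"] * 3 + ["B2"]
weights = reweighted_ce_weights(labels, {"A2": 30, "B1": 50, "B2": 80})
print(weights)
```

Such a weight dictionary could then be passed, for example, to a weighted cross-entropy loss so that rare levels with wide score intervals are penalized more heavily when misclassified.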
Tien-Hong Lo
Department of Computer Science and Information Engineering, National Taiwan Normal University; Research Center for Psychological and Educational Testing, National Taiwan Normal University
Fu-An Chao
Research Center for Psychological and Educational Testing, National Taiwan Normal University
Tzu-I Wu
Department of Computer Science and Information Engineering, National Taiwan Normal University
Yao-Ting Sung
Department of Educational Psychology and Counseling, National Taiwan Normal University
Berlin Chen
Department of Computer Science and Information Engineering, National Taiwan Normal University