Feature-based analysis of oral narratives from Afrikaans and isiXhosa children

📅 2025-07-17
🤖 AI Summary
This study identifies linguistic features in the oral narratives of four- and five-year-old Afrikaans- and isiXhosa-speaking children that distinguish children whom experts identified as requiring intervention. Method: The authors extract features from recorded stories, including lexical diversity, mean utterance length, articulation rate, and part-of-speech distributions (particularly verbs and auxiliaries), and apply simple machine learning methods. Results: Lexical diversity and utterance length emerge as indicators of typical development, whereas articulation rate proves less informative. Part-of-speech patterns vary across the two languages, but the use of specific verbs and auxiliaries associated with goal-directed storytelling is correlated with a reduced likelihood of requiring intervention. The analysis reveals both language-specific and shared predictors of narrative proficiency, with implications for early assessment in multilingual contexts.

📝 Abstract
Oral narrative skills are strong predictors of later literacy development. This study examines the features of oral narratives from children who were identified by experts as requiring intervention. Using simple machine learning methods, we analyse recorded stories from four- and five-year-old Afrikaans- and isiXhosa-speaking children. Consistent with prior research, we identify lexical diversity (unique words) and length-based features (mean utterance length) as indicators of typical development, but features like articulation rate prove less informative. Despite cross-linguistic variation in part-of-speech patterns, the use of specific verbs and auxiliaries associated with goal-directed storytelling is correlated with a reduced likelihood of requiring intervention. Our analysis of two linguistically distinct languages reveals both language-specific and shared predictors of narrative proficiency, with implications for early assessment in multilingual contexts.
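The two features the abstract highlights, lexical diversity (unique words) and mean utterance length, can be illustrated with a minimal Python sketch. It assumes each narrative is already transcribed and split into utterances. The function name `narrative_features`, the whitespace tokenisation, the added type-token ratio, and the toy Afrikaans utterances are illustrative assumptions, not the authors' pipeline.

```python
from typing import Dict, List


def narrative_features(utterances: List[str]) -> Dict[str, float]:
    """Compute simple lexical-diversity and length features for one narrative.

    Assumption: each utterance is an already transcribed string whose
    words are separated by whitespace.
    """
    # Lowercase and split on whitespace. A real pipeline would likely need
    # language-aware tokenisation (isiXhosa is agglutinative, for instance).
    tokens_per_utt = [u.lower().split() for u in utterances if u.strip()]
    all_tokens = [tok for utt in tokens_per_utt for tok in utt]

    n_utts = len(tokens_per_utt)
    n_tokens = len(all_tokens)
    n_types = len(set(all_tokens))

    return {
        # Lexical diversity as a raw count of distinct word forms.
        "unique_words": float(n_types),
        # Length-normalised variant (hypothetical addition for illustration).
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        # Mean utterance length measured in words.
        "mean_utterance_length": n_tokens / n_utts if n_utts else 0.0,
    }


# Toy example: three short Afrikaans utterances (invented, not corpus data).
story = ["die hond hardloop", "die kat slaap", "die hond blaf vir die kat"]
features = narrative_features(story)
```

Feature dictionaries of this kind, one per child, could then be fed to any simple classifier together with the expert intervention labels. Counting words rather than morphemes is a simplification that may affect the two languages differently.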
Problem

Research questions and friction points this paper is trying to address.

Analyzing oral narrative features in Afrikaans and isiXhosa children
Identifying language-specific and shared predictors of narrative proficiency
Assessing early literacy development using machine learning methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simple machine learning for oral narrative analysis
Lexical diversity and utterance length as indicators
Cross-linguistic verb and auxiliary usage correlation
Emma Sharratt
Stellenbosch University
Natural language processing, Machine learning
Annelien Smith
Speech, Language and Hearing Therapy, Stellenbosch University, South Africa
Retief Louw
Electrical and Electronic Engineering, Stellenbosch University, South Africa
Daleen Klop
Speech, Language and Hearing Therapy, Stellenbosch University, South Africa
Febe de Wet
Electrical and Electronic Engineering, Stellenbosch University, South Africa
Herman Kamper
Stellenbosch University
Speech Recognition, Machine Learning