🤖 AI Summary
This work addresses the critical scarcity of speech data for three typologically similar yet severely low-resource North American Indigenous languages: Ojibwe, Mi'kmaq, and Maliseet. Method: We propose the first lightweight, attention-free multilingual flow-matching text-to-speech (TTS) system for these languages. Our approach introduces joint multilingual training, providing the first empirical validation of this strategy for Indigenous-language TTS, and leverages parameter sharing within a flow-matching architecture to improve memory efficiency and cross-lingual generalization. Contribution/Results: (1) The multilingual model consistently outperforms monolingual baselines in naturalness and intelligibility, meeting the requirements of language-revitalization applications; (2) We develop a community-centered human evaluation framework that identifies and mitigates cultural biases inherent in conventional automatic and subjective metrics. Together, these contributions establish a reproducible technical pipeline and an ethically grounded evaluation framework for TTS in low-resource endangered languages.
📝 Abstract
We present lightweight flow-matching multilingual text-to-speech (TTS) systems for Ojibwe, Mi'kmaq, and Maliseet, three Indigenous languages of North America. Our results show that training a multilingual TTS model on these three typologically similar languages improves performance over monolingual models, especially when data are scarce. Attention-free architectures are highly competitive with self-attention architectures while offering higher memory efficiency. Our research not only advances technical development for the revitalization of low-resource languages but also highlights the cultural gap in existing human evaluation protocols, calling for a more community-centered approach to human evaluation.