Music for All: Exploring Multicultural Representations in Music Generation Models (Camera Ready)

πŸ“… 2025-02-11
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ€– AI Summary
This work identifies and addresses cultural representation imbalance in music generation models: non-Western genres constitute only 5.7% of the total duration in prevalent training datasets, leading to significant cross-cultural performance disparities. We present the first systematic quantification of this cultural bias and propose a parameter-efficient fine-tuning (PEFT)-based baseline for fair cross-cultural transfer. Within the MusicGen and Mustango frameworks, we conduct few-shot adaptation experiments on Hindustani Classical and Turkish Makam music using a newly curated multicultural dataset. Results demonstrate that PEFT substantially improves generative fidelity for non-Western genres, validating its efficacy in mitigating cultural bias, while also exposing fundamental challenges in few-shot cross-genre transfer. This study establishes the first empirical benchmark and methodological framework for multicultural AI music modeling.

πŸ“ Abstract
The advent of Music-Language Models has greatly enhanced the automatic music generation capability of AI systems, but they are also limited in their coverage of the musical genres and cultures of the world. We present a study of the datasets and research papers for music generation and quantify the bias and under-representation of genres. We find that only 5.7% of the total hours of existing music datasets come from non-Western genres, which naturally leads to disparate performance of the models across genres. We then investigate the efficacy of Parameter-Efficient Fine-Tuning (PEFT) techniques in mitigating this bias. Our experiments with two popular models -- MusicGen and Mustango, for two underrepresented non-Western music traditions -- Hindustani Classical and Turkish Makam music, highlight the promises as well as the non-triviality of cross-genre adaptation of music through small datasets, implying the need for more equitable baseline music-language models that are designed for cross-cultural transfer learning.
Problem

Research questions and friction points this paper is trying to address.

Addressing bias in music datasets
Improving cross-genre music generation
Enhancing multicultural representation in AI models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter-Efficient Fine-Tuning techniques
Cross-genre adaptation through small datasets
Equitable baseline music-language models
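The PEFT approach highlighted above freezes a pretrained model's weights and trains only small low-rank adapters, which is what makes few-shot adaptation to underrepresented genres tractable. The sketch below illustrates the core arithmetic of a LoRA-style adapter on a single projection layer; the layer size and rank are hypothetical, chosen only to show the parameter savings, and this is not the paper's actual training code.

```python
import numpy as np

def lora_delta(A: np.ndarray, B: np.ndarray, alpha: float, r: int) -> np.ndarray:
    """LoRA-style low-rank weight update: Delta W = (alpha / r) * B @ A."""
    return (alpha / r) * (B @ A)

# Hypothetical attention projection of a music-language model decoder.
d_model, r, alpha = 1024, 8, 16.0
rng = np.random.default_rng(0)

W = rng.standard_normal((d_model, d_model))   # frozen pretrained weight
A = rng.standard_normal((r, d_model)) * 0.01  # trainable down-projection
B = np.zeros((d_model, r))                    # trainable up-projection (zero init,
                                              # so the adapted model starts identical
                                              # to the pretrained one)

W_adapted = W + lora_delta(A, B, alpha, r)

full_params = W.size
peft_params = A.size + B.size
print(f"trainable fraction: {peft_params / full_params:.3%}")  # → trainable fraction: 1.562%
```

Only `A` and `B` would receive gradients during fine-tuning on the few-shot Hindustani Classical or Turkish Makam data, so per-genre adapters stay tiny relative to the base model.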
Atharva Mehta
Mohamed bin Zayed University of Artificial Intelligence

Shivam Chauhan
Mohamed bin Zayed University of Artificial Intelligence

Amirbek Djanibekov
PhD Student, MBZUAI
Natural Language Processing, Speech Processing

Atharva Kulkarni
Mohamed bin Zayed University of Artificial Intelligence

Gus G. Xia
Mohamed bin Zayed University of Artificial Intelligence

Monojit Choudhury
Professor of Natural Language Processing, MBZUAI
Natural Language Processing, Large Language Models, Ethics of AI, Computational Social Science