🤖 AI Summary
Evaluating singing quality is hampered by two problems: subjective listening tests are costly and time-consuming, and existing objective metrics capture only limited perceptual aspects. To address this, the authors introduce SingMOS-Pro, a multi-dimensional dataset for automatic singing quality assessment. It comprises 7,981 singing clips generated by 41 models across 12 datasets, with Mean Opinion Score (MOS) annotations covering lyrics, melody, and overall quality. Each clip is rated by at least five professional annotators. The authors also explore how to use MOS data annotated under different standards and benchmark several widely used evaluation methods from related tasks on SingMOS-Pro, providing baselines and practical references for future research on singing quality assessment.
📝 Abstract
Singing voice generation is progressing rapidly, yet evaluating singing quality remains a critical challenge. Human subjective assessment, typically in the form of listening tests, is costly and time-consuming, while existing objective metrics capture only limited perceptual aspects. In this work, we introduce SingMOS-Pro, a dataset for automatic singing quality assessment. Building on our preview version SingMOS, which provides only overall ratings, SingMOS-Pro extends the annotations of the newly added portion to cover lyrics, melody, and overall quality, offering broader coverage and greater diversity. The dataset contains 7,981 singing clips generated by 41 models across 12 datasets, spanning from early systems to recent advances. Each clip receives at least five ratings from professional annotators, ensuring reliability and consistency. Furthermore, we explore how to effectively utilize MOS data annotated under different standards and benchmark several widely used evaluation methods from related tasks on SingMOS-Pro, establishing strong baselines and practical references for future research. The dataset can be accessed at https://huggingface.co/datasets/TangRain/SingMOS-Pro.
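As a rough illustration of how per-rater scores are typically aggregated into MOS values, the following minimal sketch averages the ratings of each clip and then averages clip-level scores per generating system. The data layout, the function names `clip_mos` and `system_mos`, and the toy ratings are all assumptions made for illustration; they are not taken from SingMOS-Pro or its release format. Only the minimum of five ratings per clip comes from the abstract.

```python
from statistics import mean

# The abstract states that each clip receives at least five ratings.
MIN_RATINGS = 5


def clip_mos(ratings):
    """Return the mean opinion score of one clip from its individual ratings."""
    if len(ratings) < MIN_RATINGS:
        raise ValueError(
            f"expected at least {MIN_RATINGS} ratings, got {len(ratings)}"
        )
    return mean(ratings)


def system_mos(clips):
    """Average the clip-level MOS per generating system.

    `clips` is a list of (system_name, ratings) pairs. This layout is
    hypothetical and is not the dataset's actual schema.
    """
    by_system = {}
    for system, ratings in clips:
        by_system.setdefault(system, []).append(clip_mos(ratings))
    return {system: mean(scores) for system, scores in by_system.items()}


# Toy ratings invented for illustration only (not real SingMOS-Pro data).
toy_clips = [
    ("system_a", [4, 4, 5, 3, 4]),
    ("system_a", [3, 4, 4, 4, 5]),
    ("system_b", [2, 3, 2, 3, 2, 3]),
]

print(system_mos(toy_clips))
```

The same aggregation could be applied separately to each annotated dimension (lyrics, melody, overall quality), assuming the ratings for each dimension are stored per rater.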