🤖 AI Summary
This work addresses the automatic segmentation and tune-family classification of folk songs in symbolic form. We propose a multi-scale analysis method based on the continuous wavelet transform (CWT) with the Haar wavelet. Melodies are modeled as discrete pitch–time signals; Haar wavelet filtering at selected scales emphasizes structure at particular time-scales, and local maxima of the wavelet coefficients mark segment boundaries. Tune-family identification is performed with a k-nearest-neighbours classifier using either Euclidean or city-block (Manhattan) distance. To our knowledge, this is the first application of Haar wavelet filtering to symbolic music segmentation and classification. In cross-validated evaluation, wavelet-based segmentation and filtering achieve higher classification accuracy than a Gestalt-based segmentation baseline and metrics applied directly to the pitch signal, provided the time-scale and other parameters are optimized.
📝 Abstract
The aim of this study is to evaluate a machine-learning method in which symbolic representations of folk songs are segmented and classified into tune families with Haar-wavelet filtering. The method is compared with a previously proposed Gestalt-based method. Melodies are represented as discrete symbolic pitch-time signals. We apply the continuous wavelet transform (CWT) with the Haar wavelet at specific scales, obtaining filtered versions of melodies that emphasize their information at particular time-scales. We use the filtered signal for representation and segmentation, taking the local maxima of the wavelet coefficients to indicate local boundaries. Segments are classified by means of k-nearest neighbours based on standard vector metrics (Euclidean, city-block). We compare the results to a Gestalt-based segmentation method and to metrics applied directly to the pitch signal. We found that wavelet-based segmentation and wavelet filtering of the pitch signal lead to better classification accuracy in cross-validated evaluation when the time-scale and other parameters are optimized.
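The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the melody has already been sampled into a NumPy array of MIDI pitch numbers at a fixed rate, that the scale is given as an even number of samples, and that segments are zero-padded or truncated to a common length before distances are computed. The function names, the padding scheme, and the example data are all hypothetical, and the abstract does not say how segment-level results are combined into a melody-level decision.

```python
import numpy as np

def haar_kernel(scale):
    """Discretised Haar wavelet spanning `scale` samples (assumed even)."""
    half = scale // 2
    return np.concatenate([np.ones(half), -np.ones(half)]) / np.sqrt(scale)

def haar_filter(pitch, scale):
    """Filter a sampled pitch signal with the Haar kernel at one scale."""
    return np.convolve(pitch, haar_kernel(scale), mode="same")

def local_maxima(coeffs):
    """Indices where a coefficient exceeds its left neighbour and is
    at least as large as its right neighbour."""
    i = np.arange(1, len(coeffs) - 1)
    return i[(coeffs[i] > coeffs[i - 1]) & (coeffs[i] >= coeffs[i + 1])]

def segment(signal, boundaries):
    """Split a signal into consecutive segments at the boundary indices."""
    return np.split(signal, boundaries)

def to_fixed_length(seg, n):
    """Zero-pad or truncate a segment to n samples (an assumption of this sketch)."""
    out = np.zeros(n)
    m = min(n, len(seg))
    out[:m] = seg[:m]
    return out

def knn_predict(train_X, train_y, query, k=1, metric="euclidean"):
    """Majority label among the k nearest training vectors."""
    diff = np.asarray(train_X, dtype=float) - np.asarray(query, dtype=float)
    if metric == "euclidean":
        dist = np.sqrt((diff ** 2).sum(axis=1))
    elif metric == "cityblock":
        dist = np.abs(diff).sum(axis=1)
    else:
        raise ValueError(f"unknown metric: {metric}")
    nearest = np.argsort(dist)[:k]
    labels, counts = np.unique(np.asarray(train_y)[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Hypothetical example: a short melody sampled at a fixed rate.
pitch = np.array([60, 60, 62, 62, 64, 64, 62, 62, 60, 60, 67, 67, 65, 65, 64, 64],
                 dtype=float)
coeffs = haar_filter(pitch, scale=4)
boundaries = local_maxima(coeffs)
segments = [to_fixed_length(s, 8) for s in segment(coeffs, boundaries)]
```

Each vector in `segments` could then be passed to `knn_predict` together with labeled segment vectors from other melodies. The scale, the segment length, `k`, and the metric are the kind of parameters the abstract reports optimizing in cross-validation.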