AI Summary
To address the challenge of predicting music popularity in the streaming era, this paper proposes an end-to-end deep learning framework that fuses audio and multi-source platform features. Methodologically, it introduces the first joint modeling of STFT spectrograms, which capture acoustic characteristics, and Spotify's structured features, including metadata and user interaction signals, via a convolutional neural network for cross-modal feature fusion and discriminative pattern learning. The model achieves robust generalization across genres and time periods, overcoming the limitations of conventional unimodal approaches. Evaluated on a large-scale, multi-genre dataset, it attains an F1 score of 97%, significantly outperforming state-of-the-art baselines. This work offers both theoretical innovation, through principled multimodal integration, and practical applicability, delivering an interpretable, production-ready solution for intelligent music recommendation and artist-and-repertoire (A&R) decision support.
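As a rough illustration of the two input types mentioned in the summary, the sketch below computes an STFT magnitude spectrogram for a toy signal and concatenates a pooled summary of it with a structured feature vector. It uses only the Python standard library and leaves out the CNN entirely. The frame length, hop size, mean pooling, and the two structured feature values are assumptions made for this example and do not come from the paper.

```python
import cmath
import math


def stft_magnitude(signal, frame_len=8, hop=4):
    """Return a magnitude spectrogram: one list of frame_len // 2 + 1 bins per frame."""
    # Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        # Naive DFT over the non-negative frequencies (an FFT would be used in practice).
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def fuse(spectrogram, structured):
    """Concatenate the mean magnitude per frequency bin with the structured features."""
    n_frames = len(spectrogram)
    n_bins = len(spectrogram[0])
    pooled = [sum(frame[k] for frame in spectrogram) / n_frames for k in range(n_bins)]
    return pooled + list(structured)


# Toy signal: a sine with 2 cycles per 8-sample frame, so energy concentrates in bin 2.
signal = [math.sin(2 * math.pi * 2 * n / 8) for n in range(32)]
spec = stft_magnitude(signal)

# Hypothetical structured features (for example, scaled metadata values); not from the paper.
fused = fuse(spec, [0.6, 0.1])
```

In a model of the kind the summary describes, the spectrogram would more plausibly pass through convolutional layers before being combined with the structured features; the mean pooling here only stands in for that step.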
Abstract
In the digital streaming landscape, it is becoming increasingly challenging for artists and industry experts to predict the success of music tracks. This study introduces a pioneering methodology that uses Convolutional Neural Networks (CNNs) and Spotify data analysis to forecast the popularity of music tracks. Our approach takes advantage of Spotify's wide range of features, including acoustic attributes based on the spectrogram of the audio waveform, metadata, and user engagement metrics, to capture the complex patterns and relationships that influence a track's popularity. Using a large dataset covering various genres and demographics, our CNN-based model proves effective at predicting the popularity of music tracks. Additionally, we conducted extensive experiments to assess the robustness and adaptability of our model across different musical styles and time periods, achieving a 97% F1 score. Our study not only offers valuable insights into the dynamic landscape of digital music consumption but also provides the music industry with advanced predictive tools for forecasting the success of music tracks.
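For readers unfamiliar with the reported metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a binary labeling. The abstract does not say how popularity labels are defined or whether the task is binary, so the "popular = 1" convention and the toy labels are assumptions for illustration only.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined; report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical): 1 = popular, 0 = not popular.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)  # 4 TP, 1 FP, 1 FN -> precision = recall = 0.8
```

Because F1 ignores true negatives, a high score can coexist with many misclassified negative examples, so it is best read alongside the class balance of the dataset.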