"Beyond the past": Leveraging Audio and Human Memory for Sequential Music Recommendation

📅 2025-07-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Sequential music recommenders inspired by human memory, such as those built on ACT-R, struggle to recommend tracks a user has never played. This paper proposes a framework that combines audio semantics with cognitive memory mechanisms. It feeds audio features extracted by deep learning into an ACT-R-style architecture to estimate the activation strength of unplayed tracks, removing the reliance on historical user interactions alone. It then jointly models sequential user–item interactions and audio representations for next-track prediction. Experiments on a real-world dataset report improvements of +12.3% NDCG@10 for cold-start track recommendation and +18.7% intra-list diversity (ILD). The source code and dataset are publicly available.

📝 Abstract
On music streaming services, listening sessions are often composed of a balance of familiar and new tracks. Recently, sequential recommender systems have adopted cognitive-informed approaches, such as Adaptive Control of Thought-Rational (ACT-R), to successfully improve the prediction of the most relevant tracks for the next user session. However, one limitation of using a model inspired by human memory (or the past) is that it struggles to recommend new tracks that users have not previously listened to. To bridge this gap, here we propose a model that leverages audio information to predict in advance the ACT-R-like activation of new tracks and incorporates it into the recommendation scoring process. We demonstrate the empirical effectiveness of the proposed model using proprietary data, which we publicly release along with the model's source code to foster future research in this field.
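
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, whose details are not given on this page. It assumes the standard ACT-R base-level activation, ln(sum over past plays of age^(-d)), with the conventional decay d = 0.5, and it substitutes a hypothetical linear regressor (`predicted_activation`, with made-up `weights` and `bias`) for whatever audio model the paper actually uses. All function and parameter names are illustrative.

```python
import math


def base_level_activation(play_times, now, decay=0.5):
    """Standard ACT-R base-level activation: ln(sum_j (now - t_j) ** -decay).

    Returns None when the track has no past plays, since memory alone
    provides no signal for an unheard track.
    """
    ages = [now - t for t in play_times if now > t]
    if not ages:
        return None
    return math.log(sum(age ** -decay for age in ages))


def predicted_activation(audio_embedding, weights, bias=0.0):
    """Hypothetical stand-in: linear map from audio features to an activation."""
    return bias + sum(w * x for w, x in zip(weights, audio_embedding))


def score_track(play_times, audio_embedding, now, weights, bias=0.0):
    """Use the memory-based activation when available, else the audio estimate."""
    activation = base_level_activation(play_times, now)
    if activation is None:
        # Assumption: unheard tracks fall back to the audio-based prediction.
        activation = predicted_activation(audio_embedding, weights, bias)
    return activation
```

In this sketch a recently played track receives a higher activation than one played long ago, and a track with no listening history is still scored through its audio features instead of being excluded.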
Problem

Research questions and friction points this paper is trying to address.

Improving sequential music recommendation with audio and memory
Addressing the difficulty of recommending new, unheard tracks
Combining audio data and ACT-R for better predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages audio for new track prediction
Integrates ACT-R-like activation scoring
Combines human memory and audio data