AI Summary
Existing multi-track music generation models often lack overall coherence because they neglect rhythmic stability and inter-track synchronization. To address this, this work proposes SyncTrack, a framework that combines track-shared modules, which establish a common rhythm, with track-specific modules, which capture timbre and pitch range. It integrates cross-track attention mechanisms and learnable instrument priors to unify rhythmic consistency with per-track expressiveness. Three new evaluation metrics, Inner-track Rhythmic Stability (IRS), Cross-track Beat Synchronization (CBS), and Cross-track Beat Dispersion (CBD), are introduced to quantitatively assess rhythmic consistency within and across tracks. Experimental results demonstrate that SyncTrack significantly enhances both rhythmic stability and inter-track synchronization, achieving superior generation quality compared to current state-of-the-art methods.
Abstract
Multi-track music generation has garnered significant research interest due to its precise mixing and remixing capabilities. However, existing models often overlook essential attributes such as rhythmic stability and synchronization, leading to a focus on differences between tracks rather than their inherent properties. In this paper, we introduce SyncTrack, a synchronous multi-track waveform music generation model designed to capture the unique characteristics of multi-track music. SyncTrack features a novel architecture that includes track-shared modules to establish a common rhythm across all tracks and track-specific modules to accommodate diverse timbres and pitch ranges. Each track-shared module employs two cross-track attention mechanisms to synchronize rhythmic information, while each track-specific module utilizes learnable instrument priors to better represent timbre and other unique features. Additionally, we extend the evaluation of multi-track music quality to cover rhythmic consistency through three novel metrics: Inner-track Rhythmic Stability (IRS), Cross-track Beat Synchronization (CBS), and Cross-track Beat Dispersion (CBD). Experiments demonstrate that SyncTrack significantly improves multi-track music quality by enhancing rhythmic consistency.
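To make the split between track-shared and track-specific processing concrete, the following is a minimal illustrative sketch, not the SyncTrack implementation. It assumes a generic scaled dot-product attention applied across tracks at each time step (the abstract does not specify the form of its two cross-track attention mechanisms) and models an instrument prior as a plain per-track vector added to the features. The function names `cross_track_attention` and `add_instrument_prior`, the identity query/key/value projections, and all tensor sizes are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_track_attention(x):
    """Mix information across tracks at each time step.

    x[track][time] is a feature vector (list of floats). For every time
    step, each track attends over all tracks at that same step.
    Assumption: identity query/key/value projections, single head.
    """
    n_tracks, n_steps, dim = len(x), len(x[0]), len(x[0][0])
    scale = 1.0 / math.sqrt(dim)
    out = [[None] * n_steps for _ in range(n_tracks)]
    for t in range(n_steps):
        frame = [x[k][t] for k in range(n_tracks)]  # one vector per track
        for i in range(n_tracks):
            scores = [
                scale * sum(a * b for a, b in zip(frame[i], frame[j]))
                for j in range(n_tracks)
            ]
            weights = softmax(scores)
            out[i][t] = [
                sum(weights[j] * frame[j][d] for j in range(n_tracks))
                for d in range(dim)
            ]
    return out


def add_instrument_prior(x, priors):
    """Add one vector per track to every time step of that track.

    Assumption: `priors[k]` stands in for a learnable instrument prior;
    here it is fixed rather than trained.
    """
    return [
        [[v + p for v, p in zip(vec, priors[k])] for vec in x[k]]
        for k in range(len(x))
    ]


# Hypothetical sizes: 4 tracks, 8 time steps, 16 feature dimensions.
random.seed(0)
n_tracks, n_steps, dim = 4, 8, 16
features = [
    [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_steps)]
    for _ in range(n_tracks)
]
priors = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tracks)]

shared = cross_track_attention(features)          # track-shared step
specific = add_instrument_prior(shared, priors)   # track-specific step
```

In this sketch the attention step is the only place where tracks exchange information, which is where a common rhythm could be enforced, while the prior is the only per-track parameter, which is where timbre differences could be encoded. A trained model would learn the projections and priors rather than fixing them as done here.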