🤖 AI Summary
Synthesizer preset conversion is difficult for three reasons: timbre and ADSR envelope are strongly coupled, existing methods offer no explicit control over envelope parameters, and public datasets lack annotated envelope diversity. To address these, the authors propose SynthCloner, a factorized encoder-decoder (codec) architecture that disentangles timbre, ADSR envelope, and musical content. The model is trained jointly with a spectrogram reconstruction objective, contrastive learning, and an ADSR-aware loss, so that the envelope is modeled explicitly and each attribute can be edited independently. They also introduce SynthCAT, a dedicated dataset covering 250 timbres, 120 ADSR envelopes, and 100 MIDI sequences. In experiments, SynthCloner outperforms baselines on both objective metrics and subjective listening tests, with gains in conversion fidelity and fine-grained controllability. Code, model checkpoint, and audio examples are publicly released.
📝 Abstract
Electronic synthesizer sounds are controlled by presets, parameter settings that yield complex timbral characteristics and ADSR envelopes, making preset conversion particularly challenging. Recent approaches to timbre transfer often rely on spectral objectives or implicit style matching, offering limited control over envelope shaping. Moreover, public synthesizer datasets rarely provide diverse coverage of timbres and ADSR envelopes. To address these gaps, we present SynthCloner, a factorized codec model that disentangles audio into three attributes: ADSR envelope, timbre, and content. This separation enables expressive synthesizer preset conversion with independent control over each attribute. Additionally, we introduce SynthCAT, a new synthesizer dataset with a task-specific rendering pipeline covering 250 timbres, 120 ADSR envelopes, and 100 MIDI sequences. Experiments show that SynthCloner outperforms baselines on both objective and subjective metrics while enabling independent attribute control. The code, model checkpoint, and audio examples are available at https://buffett0323.github.io/synthcloner/.
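
For readers unfamiliar with the term, an ADSR envelope is a gain curve with four stages: attack, decay, sustain, and release. The sketch below is a minimal piecewise-linear illustration in Python, not SynthCloner's model or the SynthCAT rendering pipeline. The function name `adsr_envelope`, its parameters, and the linear segment shapes are all assumptions made for illustration.

```python
def adsr_envelope(attack, decay, sustain, release, note_len, sr=100):
    """Piecewise-linear ADSR gain curve (illustrative assumption, not the paper's method).

    attack, decay, release, note_len: durations in seconds.
    sustain: gain level in [0, 1] held until note-off.
    note_len: time the key is held (note-on to note-off).
    sr: envelope samples per second.
    """
    n_hold = round(note_len * sr)   # samples while the key is held
    n_rel = round(release * sr)     # samples after note-off
    env = []
    for i in range(n_hold):
        t = i / sr
        if t < attack:
            env.append(t / attack)                                    # attack: 0 -> 1
        elif t < attack + decay:
            env.append(1.0 - (1.0 - sustain) * (t - attack) / decay)  # decay: 1 -> sustain
        else:
            env.append(sustain)                                       # sustain: hold level
    level = env[-1] if env else 0.0  # gain at the moment of note-off
    for i in range(n_rel):
        env.append(level * (1.0 - (i + 1) / n_rel))                   # release: level -> 0
    return env


# Two hypothetical envelope settings for the same held note.
pluck = adsr_envelope(0.0, 0.1, 0.0, 0.05, note_len=0.5)  # instant attack, no sustain
pad = adsr_envelope(0.3, 0.2, 0.8, 0.5, note_len=0.5)     # slow attack, long release
```

Multiplying the same oscillator output by `pluck` or `pad` changes only the loudness contour of the note, which gives some intuition for why the envelope can be treated as an attribute separate from timbre and content.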