A Diffusion-Based Generative Equalizer for Music Restoration

📅 2024-03-27
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
This work addresses the low quality and limited bandwidth typical of historical music recordings, such as early vocal and piano recordings. We formulate a task termed *generative equalization*, which broadens blind bandwidth extension: the magnitude response of the unknown degradation filter is estimated at the same time as the missing spectral content is reconstructed. The proposed method, BABE-2, improves on the earlier BABE (Blind Audio Bandwidth Extension) algorithm. It is built around an optimization algorithm that uses priors from diffusion models trained or fine-tuned on a curated set of high-quality music tracks. In an objective evaluation on historical piano recordings, BABE-2 improves on the previous version. The method is also applied to recordings by Enrico Caruso and Nellie Melba, where it yields similarly strong results.

📝 Abstract
This paper presents a novel approach to audio restoration, focusing on the enhancement of low-quality music recordings, and in particular historical ones. Building upon a previous algorithm called BABE, or Blind Audio Bandwidth Extension, we introduce BABE-2, which presents a series of improvements. This research broadens the concept of bandwidth extension to *generative equalization*, a novel task that, to the best of our knowledge, has not been explicitly addressed in previous studies. BABE-2 is built around an optimization algorithm utilizing priors from diffusion models, which are trained or fine-tuned using a curated set of high-quality music tracks. The algorithm simultaneously performs two critical tasks: estimation of the filter degradation magnitude response and hallucination of the restored audio. The proposed method is objectively evaluated on historical piano recordings, showing an enhancement over the prior version. The method yields similarly impressive results in rejuvenating the works of renowned vocalists Enrico Caruso and Nellie Melba. This research represents an advancement in the practical restoration of historical music.
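The filter-estimation half of the joint problem described in the abstract can be illustrated with a toy sketch. This is not the BABE-2 implementation: it assumes a zero-phase degradation filter whose gain is constant within a few equal-width frequency bands, and it uses the known clean signal as a stand-in for the restored audio that a diffusion prior would supply. The function names `band_edges`, `degrade`, and `estimate_gains_db` are hypothetical and chosen only for this example.

```python
import numpy as np

def band_edges(n_bins, n_bands):
    # Equal-width bands over the rFFT bins (a simplifying assumption).
    return np.linspace(0, n_bins, n_bands + 1).astype(int)

def degrade(x, gains_db):
    # Apply a zero-phase filter with one gain (in dB) per band.
    X = np.fft.rfft(x)
    edges = band_edges(len(X), len(gains_db))
    H = np.ones(len(X))
    for k, g in enumerate(gains_db):
        H[edges[k]:edges[k + 1]] = 10.0 ** (g / 20.0)
    return np.fft.irfft(X * H, n=len(x))

def estimate_gains_db(y, x_hat, n_bands):
    # Estimate per-band gains by matching the band energies of the
    # observation y to those of the current clean-signal estimate x_hat.
    Y = np.fft.rfft(y)
    X = np.fft.rfft(x_hat)
    edges = band_edges(len(Y), n_bands)
    gains = np.empty(n_bands)
    for k in range(n_bands):
        band = slice(edges[k], edges[k + 1])
        num = np.sum(np.abs(Y[band]) ** 2)
        den = np.sum(np.abs(X[band]) ** 2) + 1e-12
        gains[k] = 10.0 * np.log10(num / den + 1e-12)
    return gains

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)                      # toy "clean" signal
true_gains_db = np.array([0.0, -3.0, -12.0, -30.0])  # assumed degradation
y = degrade(x, true_gains_db)                        # observed recording
est = estimate_gains_db(y, x, n_bands=4)
print(np.round(est, 2))
```

In a blind setting the clean signal is unknown, so an estimate like `x_hat` would have to come from the generative model and be refined together with the filter parameters. The sketch only shows why a good estimate of the restored audio makes the filter response identifiable.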
Problem

Research questions and friction points this paper is trying to address.

Enhancement of low-quality historical music recordings
Introduction of generative equalization for audio restoration
Estimation of filter degradation and audio hallucination
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes diffusion models for audio restoration
Introduces generative equalization concept
Enhances historical music recordings effectively
Eloi Moliner
Acoustics Lab, Department of Information and Communications Engineering, Aalto University, Espoo, Finland
Maija Turunen
Sibelius Academy, University of the Arts Helsinki, Helsinki, Finland
Filip Elvander
Assistant Professor, Department of Information and Communications Engineering, Aalto University
statistical signal processing, spectral estimation, optimal transport, sparse modeling
V. Välimäki
Acoustics Lab, Department of Information and Communications Engineering, Aalto University, Espoo, Finland