Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network

📅 2025-06-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Problem: Deep neural network (DNN) audio models are typically trained at a single sampling frequency (SF). Handling an unseen SF then requires resampling the input, which can degrade performance, an issue largely overlooked because most studies evaluate only at the trained SF. Method: We refer to robustness to SF changes as the SF-independent (SFI) property and propose three metrics based on Local Equivariance Error (LEE) to quantify it. LEE is extended by using signal resampling as the input transformation, and the metrics target the network components that predict time-frequency masks. Contribution/Results: Experiments on music source separation show that the proposed metrics correlate strongly with performance degradation at untrained SFs (Pearson's *r* > 0.92). The metrics offer an interpretable, reproducible way to assess how well audio DNNs tolerate SF changes and to guide the design of SF-robust models.

📝 Abstract
Audio signal processing methods based on deep neural networks (DNNs) are typically trained at a single sampling frequency (SF) and therefore require signal resampling to handle untrained SFs. However, recent studies have shown that signal resampling can degrade performance at untrained SFs. This problem has been overlooked because most studies evaluate performance only at the trained SF. In this paper, we assess the robustness of DNNs to SF changes, which we refer to as the SF-independent (SFI) property, and propose three metrics that quantify this property on the basis of local equivariance error (LEE). LEE measures the robustness of DNNs to input transformations. By using signal resampling as the input transformation, we extend LEE to measure the robustness of audio source separation methods to signal resampling. The proposed metrics are constructed to quantify the SFI property in the specific network components responsible for predicting time-frequency masks. Experiments on music source separation demonstrated a strong correlation between the proposed metrics and performance degradation at untrained SFs.
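
The idea of treating resampling as the input transformation can be illustrated with a minimal sketch. It is not the paper's metric: it computes a plain finite equivariance error (the gap between "resample, then model" and "model, then resample") rather than the local version, and it does not target time-frequency mask predictors. The FFT-based resampler and the two toy models (`gain_model`, `fixed_tap_fir`) are hypothetical stand-ins chosen only to show what an SF-independent and an SF-dependent component look like under such a measure.

```python
import numpy as np


def fft_resample(x, num):
    """Resample a real 1-D signal to `num` samples by truncating or
    zero-padding its spectrum (band-limited resampling)."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    out = np.zeros(num // 2 + 1, dtype=complex)
    k = min(len(spectrum), len(out))
    out[:k] = spectrum[:k]
    # Rescale so the amplitude is preserved when the length changes.
    return np.fft.irfft(out, n=num) * (num / n)


def equivariance_error(model, x, num):
    """Relative gap between 'resample, then model' and 'model, then resample'."""
    resample_then_model = model(fft_resample(x, num))
    model_then_resample = fft_resample(model(x), num)
    gap = np.linalg.norm(resample_then_model - model_then_resample)
    return gap / np.linalg.norm(model_then_resample)


def gain_model(x):
    """Hypothetical SF-independent model: a pointwise gain."""
    return 0.5 * x


def fixed_tap_fir(x):
    """Hypothetical SF-dependent model: a 9-tap moving average whose taps are
    fixed in samples, so its cutoff in Hz moves with the sampling frequency."""
    taps = np.ones(9) / 9.0
    return np.convolve(x, taps, mode="same")


rng = np.random.default_rng(0)
x = rng.standard_normal(16000)  # 1 s of white noise at 16 kHz

# Resample to 8000 samples, i.e. 8 kHz for the same 1 s duration.
print("gain model:", equivariance_error(gain_model, x, 8000))
print("fixed-tap FIR:", equivariance_error(fixed_tap_fir, x, 8000))
```

Under these assumptions the pointwise gain commutes with resampling, so its error should be essentially zero, while the fixed-tap filter should show a clearly larger error because its response in Hz changes with the SF.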

Problem

Research questions and friction points this paper is trying to address.

Evaluating neural network robustness to sampling frequency changes
Quantifying SF-independent property using local equivariance error
Assessing performance degradation in untrained sampling frequencies

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes local equivariance error-based metrics
Measures robustness to sampling frequency changes
Quantifies SF-independent property in networks