ECHO: Frequency-aware Hierarchical Encoding for Variable-length Signal

📅 2025-08-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing sub-band encoders are constrained to fixed input lengths and lack frequency-position awareness, which limits their ability to model variable-length industrial signals (e.g., acoustic or vibration data). To address this, the paper proposes ECHO, a time-frequency foundation model that supports arbitrary-length inputs: (1) a band-splitting architecture enables adaptive sub-band decomposition; (2) relative frequency position encoding explicitly captures spectral structure; and (3) a hierarchical band encoding and feature aggregation mechanism eliminates the need for segmentation or padding. Evaluated on the SIREN benchmark, the model achieves state-of-the-art performance on industrial anomaly detection and fault classification tasks, with improvements in cross-sampling-rate generalization and cross-domain robustness.

📝 Abstract

Pre-trained foundation models have demonstrated remarkable success in vision and language, yet their potential for general machine signal modeling, which covers acoustic, vibration, and other industrial sensor data, remains under-explored. Existing approaches using sub-band-based encoders have achieved competitive results but are limited by fixed input lengths and the absence of explicit frequency positional encoding. In this work, we propose a novel foundation model that integrates an advanced band-split architecture with relative frequency positional embeddings, enabling precise spectral localization across arbitrary sampling configurations. The model supports inputs of arbitrary length without padding or segmentation, producing a concise embedding that retains both temporal and spectral fidelity. We evaluate our method on SIREN (https://github.com/yucongzh/SIREN), a newly introduced large-scale benchmark for machine signal encoding that unifies multiple datasets, including all DCASE Task 2 challenges (2020-2025) and widely used industrial signal corpora. Experimental results demonstrate consistent state-of-the-art performance in anomaly detection and fault identification, confirming the effectiveness and generalization capability of the proposed model. ECHO is open-sourced at https://github.com/yucongzh/ECHO.
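To make the idea of relative frequency positions concrete, here is a minimal sketch, not the ECHO implementation. It assumes fixed-width sub-bands and a fixed reference frequency for normalisation; `BAND_WIDTH_HZ`, `REF_MAX_HZ`, and all function names are hypothetical choices made for illustration, and the sinusoidal encoding is a generic stand-in for whatever embedding the model actually uses. Under these assumptions, a band centred on a given physical frequency receives the same position at any sampling rate, while the number of bands grows with the Nyquist frequency.

```python
import math

BAND_WIDTH_HZ = 1000.0   # hypothetical fixed sub-band width
REF_MAX_HZ = 24000.0     # hypothetical reference frequency used for normalisation

def band_centers_hz(sample_rate):
    """Centre frequencies of fixed-width sub-bands covering 0 Hz up to Nyquist."""
    nyquist = sample_rate / 2.0
    n_bands = int(nyquist // BAND_WIDTH_HZ)
    return [(i + 0.5) * BAND_WIDTH_HZ for i in range(n_bands)]

def relative_positions(sample_rate):
    """Band centres divided by a fixed reference, so that a given physical
    frequency maps to the same position regardless of the sampling rate."""
    return [c / REF_MAX_HZ for c in band_centers_hz(sample_rate)]

def sinusoidal_embedding(position, dim=8, scale=100.0):
    """Generic sinusoidal encoding of a scalar position (assumed, not from the paper)."""
    emb = []
    for k in range(dim // 2):
        freq = scale / (10000.0 ** (2 * k / dim))
        emb.append(math.sin(position * freq))
        emb.append(math.cos(position * freq))
    return emb

# Two recordings of the same machine at different sampling rates.
pos_16k = relative_positions(16000)   # 8 bands up to 8 kHz
pos_48k = relative_positions(48000)   # 24 bands up to 24 kHz
emb_16k = [sinusoidal_embedding(p) for p in pos_16k]
emb_48k = [sinusoidal_embedding(p) for p in pos_48k]
```

In this sketch the 16 kHz recording yields a prefix of the positions produced for the 48 kHz recording, which is one simple way an encoder could share spectral structure across sampling configurations.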

Problem

Research questions and friction points this paper is trying to address.

Enabling variable-length signal processing without padding or segmentation

Addressing the absence of explicit frequency positional encoding in sub-band encoders

Improving anomaly detection and fault identification in industrial signals

Innovation

Methods, ideas, or system contributions that make the work stand out.

Band-split architecture with relative frequency embeddings

Arbitrary input lengths without padding or segmentation

Concise embedding retaining temporal and spectral fidelity
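As a rough illustration of how a clip of any length can map to a fixed-size embedding without padding or clip-level segmentation, the sketch below computes toy per-frame features and averages them over time. This is only an assumed stand-in: the source does not specify ECHO's aggregation, and `frame_features`, `mean_pool`, and `embed` are hypothetical names with toy features (frame mean and energy) in place of a learned encoder.

```python
def frame_features(signal, frame_len=4, hop=2):
    """Toy per-frame features [mean, energy] over sliding frames.
    The number of frames depends on the signal length; nothing is padded."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        mean = sum(frame) / frame_len
        energy = sum(x * x for x in frame) / frame_len
        feats.append([mean, energy])
    return feats

def mean_pool(feats):
    """Average frame features over time, giving a length-independent vector."""
    n = len(feats)
    dim = len(feats[0])
    return [sum(f[d] for f in feats) / n for d in range(dim)]

def embed(signal):
    """Map a signal of any length (at least one frame) to a fixed-size vector."""
    return mean_pool(frame_features(signal))

short_clip = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0]
long_clip = short_clip * 5          # five times longer, same content repeated
emb_short = embed(short_clip)
emb_long = embed(long_clip)
```

Both clips produce a two-dimensional vector even though they yield different numbers of frames. A learned model would replace the toy features and the plain average with trained components.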
