🤖 AI Summary
Current bioacoustic model evaluation relies on downstream classification tasks, which restricts assessment to species present in each model's training data and prevents cross-model, cross-taxa comparison of how well feature extractors generalize.
Method: We propose the first unsupervised benchmark framework targeting embedding space structure, systematically evaluating the acoustic feature extractors of 15 deep learning models spanning diverse architectures, data sources, and training paradigms. Using t-SNE/UMAP visualization, K-means/DBSCAN clustering, and kNN classification, we quantitatively assess the semantic separability and geometric properties of embeddings while decoupling the influence of classifiers and class priors.
Contribution/Results: We find that training paradigms profoundly shape embedding space geometry; several models exhibit robust cross-species clustering, enabling zero-shot recognition and transfer learning. This work establishes a reproducible, distribution-agnostic, and taxonomy-agnostic quantitative evaluation paradigm for bioacoustic feature extractors.
📝 Abstract
In computational bioacoustics, deep learning models are composed of feature extractors and classifiers. The feature extractor generates vector representations of input sound segments, called embeddings, which can be fed to a classifier. While benchmarking classification scores provides insight into specific performance statistics, it is limited to species included in the models' training data and makes it impossible to compare models trained on very different taxonomic groups. This paper addresses this gap by analyzing the embeddings generated by the feature extractors of 15 bioacoustic models spanning a wide range of setups (model architectures, training data, training paradigms). We evaluate and compare how models structure their embedding spaces through clustering and kNN classification, which allows us to focus the comparison on feature extractors independently of their classifiers. We believe this approach lets us evaluate the adaptability and generalization potential of models beyond the classes they were trained on.
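The classifier-free evaluation described above can be sketched as follows: take embeddings produced by a feature extractor, cluster them, score the clustering against species labels, and probe local separability with a kNN classifier. This is a minimal sketch, not the paper's exact protocol; the synthetic Gaussian embeddings, the species/dimension counts, and the metric choices (adjusted Rand index, kNN accuracy) are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_species, per_species, dim = 5, 40, 128

# Stand-in for feature-extractor output: one synthetic Gaussian
# cluster per species (a real run would use model embeddings).
embeddings = np.vstack([
    rng.normal(loc=3.0 * i, scale=1.0, size=(per_species, dim))
    for i in range(n_species)
])
labels = np.repeat(np.arange(n_species), per_species)

# Clustering probes the global geometry of the embedding space
# without any trained classifier in the loop.
clusters = KMeans(n_clusters=n_species, n_init=10,
                  random_state=0).fit_predict(embeddings)
ari = adjusted_rand_score(labels, clusters)

# kNN probes local semantic separability: do nearest neighbors
# share a species label?
X_tr, X_te, y_tr, y_te = train_test_split(
    embeddings, labels, random_state=0, stratify=labels)
knn_acc = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr).score(X_te, y_te)

print(f"ARI: {ari:.2f}, kNN accuracy: {knn_acc:.2f}")
```

Because both scores depend only on the embeddings and the labels, the same procedure can be applied to any feature extractor, including on species absent from its training data.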