Transfer Learning for Passive Sonar Classification using Pre-trained Audio and ImageNet Models

2024-09-20
arXiv.org
Citations: 0
Influential: 0
AI Summary
This work addresses two key challenges in underwater acoustic target recognition (UATR): severe scarcity of labeled data and limited effectiveness of cross-modal transfer learning. We systematically compare the transferability of ImageNet-pretrained vision models (e.g., ResNet, EfficientNet) with that of pre-trained audio neural networks (PANNs) for passive sonar classification when labeled data are scarce. Spectrograms are generated from raw sonar signals as input representations, and all models undergo identical data augmentation and fine-tuning protocols. Our results show that ImageNet-pretrained models achieve marginally higher classification accuracy than PANNs and are more robust to sonar data recorded at low sampling rates, which points to the sampling rate as an important factor in how well pre-training carries over to fine-tuning. The study supports vision-based cross-modal transfer as a resource-efficient option for low-data UATR.
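The spectrogram step mentioned above can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the function names (`log_spectrogram`, `to_three_channels`), the window and hop sizes, the synthetic 1 kHz test tone, and the choice to replicate the spectrogram across three channels for an RGB-style ImageNet input are all assumptions made for the example.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window of length n.
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def log_spectrogram(signal, n_fft=64, hop=32, eps=1e-10):
    """Return a list of frames; each frame holds n_fft // 2 + 1 log-magnitudes in dB."""
    window = hann(n_fft)
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(n_fft)]
        frame = []
        for k in range(n_fft // 2 + 1):
            # Naive DFT for bin k (acceptable for a sketch; use an FFT library in practice).
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                      for n in range(n_fft))
            frame.append(20 * math.log10(abs(acc) + eps))
        frames.append(frame)
    return frames

def to_three_channels(spec):
    # Assumption: copy the single-channel spectrogram three times to mimic RGB input.
    return [[row[:] for row in spec] for _ in range(3)]

# Hypothetical stand-in for a sonar recording: a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone_hz = 1000
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(512)]

spec = log_spectrogram(signal)
image = to_three_channels(spec)

# The strongest bin in the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / 64
```

In a real setup the three-channel array would be resized and normalized to match the pre-trained backbone's expected input before fine-tuning; those steps are omitted here.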

Abstract
Transfer learning is commonly employed to leverage large, pre-trained models and perform fine-tuning for downstream tasks. The most prevalent pre-trained models are initially trained using ImageNet. However, their ability to generalize can vary across different data modalities. This study compares pre-trained Audio Neural Networks (PANNs) and ImageNet pre-trained models within the context of underwater acoustic target recognition (UATR). It was observed that the ImageNet pre-trained models slightly outperform pre-trained audio models in passive sonar classification. We also analyzed the impact of audio sampling rates for model pre-training and fine-tuning. This study contributes to transfer learning applications of UATR, illustrating the potential of pre-trained models to address limitations caused by scarce labeled data in the UATR domain.
Problem

Research questions and friction points this paper is trying to address.

Compares pre-trained models for underwater acoustic target recognition.
Evaluates performance of ImageNet vs. audio models in sonar classification.
Explores transfer learning to overcome scarce labeled data in UATR.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes pre-trained models for underwater acoustic classification.
Compares ImageNet and PANNs for transfer learning efficiency.
Explores impact of audio sampling rates on model performance.
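
To make the sampling-rate point concrete, the sketch below shows how the same STFT settings yield a different spectrogram when only the sampling rate changes. The helper `stft_geometry`, the 5-second clip length, the 16 kHz and 32 kHz rates, and the window and hop sizes are illustrative assumptions, not values taken from the paper.

```python
def stft_geometry(duration_s, sample_rate, n_fft=1024, hop=512):
    """Describe the spectrogram a fixed STFT produces at a given sampling rate."""
    n_samples = int(duration_s * sample_rate)
    return {
        # Highest frequency the recording can represent.
        "nyquist_hz": sample_rate / 2,
        # Frequency spacing between adjacent STFT bins.
        "bin_width_hz": sample_rate / n_fft,
        # Time span covered by one analysis window.
        "window_ms": 1000 * n_fft / sample_rate,
        # Number of full windows that fit without padding.
        "n_frames": 1 + (n_samples - n_fft) // hop,
    }

# Hypothetical rates: one for the fine-tuning data, one for the pre-training data.
low = stft_geometry(5.0, 16000)
high = stft_geometry(5.0, 32000)
```

With these assumed settings, halving the sampling rate halves the Nyquist frequency and the bin width while doubling the window duration, so a model pre-trained at one rate sees differently scaled time-frequency patterns when fine-tuned at another.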
Authors

Amirmohammad Mohammadi
Texas A&M University
Machine Learning, Deep Learning, Time-Series, Computer Vision, Parameter Efficient Transfer Learning

Tejashri Kelhe
Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX, USA

Davelle Carreiro
Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA

A. V. Dine
Massachusetts Institute of Technology Lincoln Laboratory, Lexington, MA, USA

Joshua Peeples
Assistant Professor, Texas A&M University
Machine Learning, Computer Vision, Image Processing, Texture Analysis