Towards Visual Re-Identification of Fish using Fine-Grained Classification for Electronic Monitoring in Fisheries

📅 2025-12-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of manual re-identification of visually similar fish individuals in large-scale electronic monitoring videos for fisheries, this paper proposes an automated fine-grained fish Re-ID framework. We introduce a novel hard triplet mining strategy and a dataset-adaptive normalization image transformation pipeline, revealing through systematic analysis that viewpoint variation poses greater difficulty than occlusion. Our method employs the Swin-T Vision Transformer as the backbone for deep metric learning, integrating hard triplet loss with a customized data augmentation pipeline. Evaluated on the AutoFish dataset, it achieves 90.43% Rank-1 accuracy and 41.65% mAP@k—substantially outperforming ResNet-50—and demonstrates the superiority of Transformer architectures for fine-grained fish re-identification. This work provides a scalable, robust technical foundation for intelligent marine resource monitoring and management.
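The hard triplet mining mentioned above is not detailed on this page; a common form is batch-hard mining, where each anchor is paired with its farthest same-identity sample and its closest other-identity sample. The sketch below (NumPy, with an assumed `margin` of 0.3 and function name of my own choosing) illustrates that idea, not the paper's exact implementation:

```python
import numpy as np

def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet mining: for each anchor, take the hardest
    positive (farthest same-ID sample) and the hardest negative
    (closest other-ID sample), then apply a margin hinge loss."""
    # Pairwise Euclidean distances between all embeddings in the batch.
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)

    same = labels[:, None] == labels[None, :]
    n = len(labels)
    pos_mask = same & ~np.eye(n, dtype=bool)   # same identity, not self
    neg_mask = ~same                           # different identity

    hardest_pos = np.where(pos_mask, dist, -np.inf).max(axis=1)
    hardest_neg = np.where(neg_mask, dist, np.inf).min(axis=1)

    # Hinge: penalize anchors whose hardest positive is not at least
    # `margin` closer than their hardest negative; average over anchors.
    return np.maximum(hardest_pos - hardest_neg + margin, 0.0).mean()
```

When identities are well separated in embedding space the loss is zero; mined hard cases (e.g. two visually similar fish of different identities sitting close together) produce a positive gradient signal.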

📝 Abstract
Accurate fisheries data are crucial for effective and sustainable marine resource management. With the recent adoption of Electronic Monitoring (EM) systems, more video data is now being collected than can be feasibly reviewed manually. This paper addresses this challenge by developing an optimized deep learning pipeline for automated fish re-identification (Re-ID) using the novel AutoFish dataset, which simulates conveyor-belt EM systems with six similar-looking fish species. We demonstrate that key Re-ID metrics (R1 and mAP@k) are substantially improved by using hard triplet mining in conjunction with a custom image transformation pipeline that includes dataset-specific normalization. By employing these strategies, we demonstrate that the Vision Transformer-based Swin-T architecture consistently outperforms the Convolutional Neural Network-based ResNet-50, achieving peak performance of 41.65% mAP@k and 90.43% Rank-1 accuracy. An in-depth analysis reveals that the primary challenge is distinguishing visually similar individuals of the same species (intra-species errors), where viewpoint inconsistency proves significantly more detrimental than partial occlusion. The source code and documentation are available at: https://github.com/msamdk/Fish_Re_Identification.git
Problem

Research questions and friction points this paper is trying to address.

Automated fish re-identification from electronic monitoring video data
Distinguishing visually similar fish species for sustainable fisheries management
Improving re-identification accuracy using deep learning and dataset-specific transformations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hard triplet mining for improved re-identification metrics
Custom image transformation with dataset-specific normalization
Swin-T Vision Transformer outperforms ResNet-50 CNN
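The dataset-specific normalization listed above replaces generic ImageNet channel statistics with statistics computed from the training images themselves. A minimal sketch of that idea (function names and the `eps` constant are my own, not from the paper):

```python
import numpy as np

def dataset_channel_stats(images):
    """Per-channel mean/std over the whole training set
    (images scaled to [0, 1], shape (N, H, W, 3))."""
    mean = images.mean(axis=(0, 1, 2))
    std = images.std(axis=(0, 1, 2))
    return mean, std

def normalize(image, mean, std, eps=1e-8):
    # Zero-mean, unit-variance per channel using the dataset's own
    # statistics instead of ImageNet defaults.
    return (image - mean) / (std + eps)
```

For imagery dominated by a single background (e.g. a conveyor belt), dataset-derived statistics center the inputs far better than ImageNet's, which is the plausible motivation for this step.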
Samitha Nuwan Thilakarathna
DTU Aqua - National Institute of Aquatic Resources, Technical University of Denmark
Ercan Avsar
DTU Aqua - National Institute of Aquatic Resources, Technical University of Denmark
Martine Mathias Nielsen
DTU Aqua - National Institute of Aquatic Resources, Technical University of Denmark
Malte Pedersen
Postdoc, Aalborg University/Pioneer Centre for AI
computer vision · marine vision · machine learning