POLO - Point-based, multi-class animal detection

📅 2024-10-15
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
In drone-based wildlife monitoring, multi-class animal detection typically relies on bounding-box annotations that are tedious, expensive, and not always unambiguous to create. To reduce this annotation load, we propose POLO, a multi-class detection model that can be trained entirely on point labels. POLO builds on YOLOv8 with modifications to the prediction process, the training losses, and the post-processing. Evaluated on drone recordings of waterfowl containing up to several thousand birds per image, POLO counts animals more accurately than a regular YOLOv8 at the same annotation cost.

📝 Abstract
Automated wildlife surveys based on drone imagery and object detection technology are a powerful and increasingly popular tool in conservation biology. Most detectors require training images with annotated bounding boxes, which are tedious, expensive, and not always unambiguous to create. To reduce the annotation load associated with this practice, we develop POLO, a multi-class object detection model that can be trained entirely on point labels. POLO is based on simple, yet effective modifications to the YOLOv8 architecture, including alterations to the prediction process, training losses, and post-processing. We test POLO on drone recordings of waterfowl containing up to multiple thousands of individual birds in one image and compare it to a regular YOLOv8. Our experiments show that at the same annotation cost, POLO achieves improved accuracy in counting animals in aerial imagery.
Problem

Research questions and friction points this paper is trying to address.

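Bounding-box annotations for wildlife detectors are tedious, expensive, and not always unambiguous to create
Most object detectors cannot be trained on point labels alone
Counting animals accurately in aerial images that contain up to thousands of individuals, at a fixed annotation cost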
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses point labels for multi-class detection
Modifies the YOLOv8 prediction process, training losses, and post-processing to work with point labels
Improves accuracy in aerial animal counting
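
The abstract does not say how POLO's post-processing works, so the following is only a generic sketch and not the authors' method. It illustrates one reason post-processing has to change when a detector outputs points: two points have no overlap area, so box-based IoU is undefined and duplicate suppression needs another criterion, here a plain pixel distance. The names `point_nms` and `count_per_class`, the `(x, y, score, class_id)` tuple layout, and the `radius` threshold are all hypothetical.

```python
from math import hypot
from typing import Dict, List, Tuple

# Hypothetical record for one point detection: (x, y, score, class_id).
Detection = Tuple[float, float, float, int]


def point_nms(dets: List[Detection], radius: float) -> List[Detection]:
    """Greedy duplicate suppression for point detections.

    Visits detections from highest to lowest score and keeps a point only
    if no already-kept point of the same class lies closer than `radius`
    pixels. This is an illustrative stand-in, not POLO's procedure.
    """
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d[2], reverse=True):
        x, y, _, cls = det
        if all(k[3] != cls or hypot(x - k[0], y - k[1]) >= radius for k in kept):
            kept.append(det)
    return kept


def count_per_class(dets: List[Detection]) -> Dict[int, int]:
    """Count the surviving detections per class id."""
    counts: Dict[int, int] = {}
    for *_, cls in dets:
        counts[cls] = counts.get(cls, 0) + 1
    return counts


detections = [
    (10.0, 10.0, 0.9, 0),
    (12.0, 11.0, 0.8, 0),  # near-duplicate of the first point, same class
    (12.0, 11.0, 0.7, 1),  # same location, different class
    (50.0, 50.0, 0.6, 0),
]
kept = point_nms(detections, radius=5.0)
print(count_per_class(kept))
```

Suppression is done per class in this sketch so that two animals of different species standing close together are both counted; whether that is the right choice depends on the data.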
Giacomo May
EPFL, Switzerland
Emanuele Dalsasso
Post-doctoral researcher, EPFL
Remote Sensing, Deep Learning, Machine Learning, SAR
B. Kellenberger
University College London, U.K.
D. Tuia
EPFL, Switzerland