POLO - Point-based, multi-class animal detection

📅 2024-10-15

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

In drone-based wildlife monitoring, multi-class animal detection typically relies on bounding-box annotations, which are tedious and expensive to create and not always unambiguous. To reduce this annotation load, the authors propose POLO, a multi-class detection model that can be trained entirely on point labels. POLO builds on YOLOv8 and modifies its prediction process, training losses, and post-processing so that detection and counting work from single-point annotations. Evaluated on drone recordings of waterfowl, with up to several thousand birds in a single image, POLO counts animals more accurately than a regular YOLOv8 at the same annotation cost.

📝 Abstract

Automated wildlife surveys based on drone imagery and object detection technology are a powerful and increasingly popular tool in conservation biology. Most detectors require training images with annotated bounding boxes, which are tedious, expensive, and not always unambiguous to create. To reduce the annotation load associated with this practice, we develop POLO, a multi-class object detection model that can be trained entirely on point labels. POLO is based on simple, yet effective modifications to the YOLOv8 architecture, including alterations to the prediction process, training losses, and post-processing. We test POLO on drone recordings of waterfowl containing up to multiple thousands of individual birds in one image and compare it to a regular YOLOv8. Our experiments show that at the same annotation cost, POLO achieves improved accuracy in counting animals in aerial imagery.
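To make the idea of post-processing for point predictions concrete, here is a minimal sketch of a greedy, distance-based duplicate suppression step. It is not POLO's actual post-processing, which the abstract does not specify. The function name `point_nms`, the tuple layout, and the `radius` threshold are all assumptions made for illustration.

```python
from math import hypot


def point_nms(detections, radius):
    """Greedy duplicate suppression for point detections (illustrative only).

    detections: list of (x, y, score, class_id) tuples.
    radius: hypothetical suppression distance in pixels; a lower-scoring point
            of the same class closer than this to a kept point is dropped.
    """
    kept = []
    # Visit detections from highest to lowest confidence.
    for x, y, score, cls in sorted(detections, key=lambda d: d[2], reverse=True):
        duplicate = any(
            cls == k_cls and hypot(x - kx, y - ky) < radius
            for kx, ky, _, k_cls in kept
        )
        if not duplicate:
            kept.append((x, y, score, cls))
    return kept


detections = [
    (100.0, 100.0, 0.90, 0),
    (103.0, 101.0, 0.60, 0),  # close to the first point, same class
    (104.0, 100.0, 0.80, 1),  # close to the first point, different class
    (300.0, 250.0, 0.70, 0),
]
kept = point_nms(detections, radius=10.0)
print(len(kept))
```

With boxes, overlap is usually measured by IoU; points have no area, so this sketch falls back on plain Euclidean distance between predicted centers within each class.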

Problem

Research questions and friction points this paper is trying to address.

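Bounding-box annotation for wildlife detectors is tedious, expensive, and not always unambiguous

Most detectors require bounding boxes and cannot be trained on point labels alone

Animals need to be counted accurately in aerial images that can contain thousands of individuals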

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses point labels for multi-class detection

Modifies the YOLOv8 prediction process, training losses, and post-processing to work with point labels

Counts animals in aerial imagery more accurately than a regular YOLOv8 at the same annotation cost
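As a rough illustration of what counting accuracy can mean for a multi-class detector, the sketch below computes a per-class absolute count error for a single image and averages it over classes. The paper's actual evaluation protocol is not described here, so the metric choice, the function name `count_errors`, and the species labels are assumptions.

```python
from collections import Counter


def count_errors(predicted_classes, true_classes):
    """Absolute difference between predicted and true counts, per class."""
    pred = Counter(predicted_classes)
    true = Counter(true_classes)
    # Counter returns 0 for classes missing on either side.
    return {c: abs(pred[c] - true[c]) for c in sorted(set(pred) | set(true))}


# Hypothetical class labels for one image: one entry per detected / annotated animal.
pred_labels = ["mallard", "mallard", "teal", "teal", "teal"]
true_labels = ["mallard", "mallard", "mallard", "teal", "teal"]

errors = count_errors(pred_labels, true_labels)
mean_abs_error = sum(errors.values()) / len(errors)
print(errors, mean_abs_error)
```

A count-based metric like this ignores where the points fall, so it is usually reported alongside a localization-aware measure.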
