🤖 AI Summary
Traditional triplet loss uses only class labels, so it cannot exploit auxiliary annotations such as bounding box coordinates in multi-task object detection. To address this limitation, we propose Multi-Annotation Triplet Loss (MATL), the first triplet-based loss that explicitly embeds bounding box regression error into the triplet distance metric. MATL models class discriminability and localization consistency within a single framework, enabling joint optimization of classification and localization without additional network branches or auxiliary loss terms. Experiments on an aerial wildlife image dataset show that MATL outperforms standard triplet loss, improving classification accuracy by 3.2% and localization precision (mAP@0.5) by 4.7%. These results support modeling heterogeneous annotations (class labels and bounding boxes) together in one loss formulation.
📝 Abstract
Triplet loss traditionally relies only on class labels, so it leaves information unused in multi-task scenarios where several types of annotations are available. This paper introduces a Multi-Annotation Triplet Loss (MATL) framework that extends triplet loss by incorporating additional annotations, such as bounding box information, alongside class labels in the loss formulation. By using these complementary annotations, MATL improves multi-task learning for tasks requiring both classification and localization. Experiments on an aerial wildlife imagery dataset demonstrate that MATL outperforms conventional triplet loss in both classification and localization. These findings highlight the benefit of using all available annotations for triplet loss in multi-task learning frameworks.
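The abstract does not state the MATL formula, so the following is only a minimal sketch of the general idea, not the paper's formulation. It assumes one plausible reading: a standard class-label triplet term plus a second triplet term whose positive and negative are chosen by bounding box overlap (IoU) with the anchor, combined as a weighted sum. The function names, the `box_weight` parameter, the IoU-based selection rule, and the weighted sum are all hypothetical choices made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def iou(box_a, box_b):
    """Intersection over union for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def multi_annotation_triplet_loss(embeddings, labels, boxes, anchor,
                                  margin=1.0, box_weight=0.5):
    """Hypothetical two-term loss for one anchor index (an assumption,
    not the paper's definition).

    Class term: positive shares the anchor's label, negative does not.
    Box term: positive has the highest IoU with the anchor's box,
    negative has the lowest.
    """
    others = [i for i in range(len(embeddings)) if i != anchor]

    # Class-label triplet: first same-label and first different-label sample.
    class_pos = next(i for i in others if labels[i] == labels[anchor])
    class_neg = next(i for i in others if labels[i] != labels[anchor])
    class_term = triplet_loss(embeddings[anchor], embeddings[class_pos],
                              embeddings[class_neg], margin)

    # Bounding-box triplet: most and least overlapping boxes.
    box_pos = max(others, key=lambda i: iou(boxes[anchor], boxes[i]))
    box_neg = min(others, key=lambda i: iou(boxes[anchor], boxes[i]))
    box_term = triplet_loss(embeddings[anchor], embeddings[box_pos],
                            embeddings[box_neg], margin)

    return class_term + box_weight * box_term


# Toy batch: sample 1 shares the anchor's class, sample 2 shares its box.
embeddings = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]]
labels = ["deer", "deer", "elk"]
boxes = [(0, 0, 10, 10), (20, 20, 30, 30), (0, 0, 10, 10)]
loss = multi_annotation_triplet_loss(embeddings, labels, boxes, anchor=0)
```

In this toy batch the class term is already satisfied, but the box term is not: the sample whose box matches the anchor's sits far away in embedding space. The combined loss is therefore nonzero, which shows how a second annotation can supply a training signal that class labels alone would miss.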