🤖 AI Summary
Evaluations of object detection models lack intuitive, fine-grained ways to compare two detectors, which makes it hard to uncover their shared and distinct failures on ground-truth labels. To address this, this work proposes Differences in Detection (DnD), which partitions the ground-truth labels into structured sets using the same matching algorithm as the standard metrics. By splitting the labels into those recognized by both models, those recognized by only one model, and those missed by both, DnD enables direct pairwise comparison; combined with TIDE error types, the differences can be analyzed in a standard confusion matrix. Moving beyond independent summary statistics such as mAP, the method separates shared from individual errors and can guide explainability methods such as ODAM toward metric-relevant examples.
📝 Abstract
We propose Differences in Detection (DnD), an intuitive method to compare two object detection models. It is based on the same matching algorithm as the standard metrics of mean Average Precision ($mAP$) and TIDE error analysis, and complements them with the ability to compare two models directly. More specifically, we calculate the intersection of ground truth labels that are recognized by both models, followed by the corresponding difference sets and the complement set of ground truth labels that are missed by both models. The resulting comparison is more direct and intuitive than a comparison of independent summary statistics. It reveals individual and shared mistakes and becomes particularly interesting when combined with error types. In this case, the differences in detection errors can be analyzed naturally in a standard confusion matrix. While valuable in itself, we believe that one of the best applications of DnD is to guide explainability methods such as ODAM towards metric-relevant examples, grounded in structured subsets. The code for our method is available here: https://github.com/JohannesTheo/differences-in-detection
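To make the set partition described in the abstract concrete, here is a minimal Python sketch. It is not taken from the linked repository. It assumes the matching step has already run and produced, for each model, the set of ground-truth IDs it recognized. The function names (`partition_ground_truth`, `outcome_confusion`), the ID strings, and the per-label outcome tags (`"TP"`, `"Cls"`, `"Loc"`, `"Miss"`) are hypothetical and chosen only for illustration; how the authors assign error types to labels may differ.

```python
from collections import Counter


def partition_ground_truth(all_gt, detected_a, detected_b):
    """Split ground-truth IDs into four disjoint subsets for models A and B."""
    all_gt = set(all_gt)
    a = set(detected_a) & all_gt  # labels recognized by model A
    b = set(detected_b) & all_gt  # labels recognized by model B
    return {
        "both": a & b,                # intersection: recognized by both
        "only_a": a - b,              # difference: A recognizes, B does not
        "only_b": b - a,              # difference: B recognizes, A does not
        "neither": all_gt - (a | b),  # complement: missed by both
    }


def outcome_confusion(all_gt, outcome_a, outcome_b, default="Miss"):
    """Count (outcome under A, outcome under B) pairs over all labels.

    Labels without an entry in a model's outcome dict are treated as
    `default` (an assumption of this sketch).
    """
    return Counter(
        (outcome_a.get(g, default), outcome_b.get(g, default)) for g in all_gt
    )


# Toy data: five ground-truth labels and the matching results of two models.
gt_ids = ["g1", "g2", "g3", "g4", "g5"]
parts = partition_ground_truth(gt_ids, {"g1", "g2", "g3"}, {"g2", "g3", "g4"})

# Hypothetical per-label outcomes (a true positive or an error tag).
outcomes_a = {"g1": "TP", "g2": "TP", "g3": "TP", "g4": "Loc"}
outcomes_b = {"g1": "Cls", "g2": "TP", "g3": "TP", "g4": "TP"}
matrix = outcome_confusion(gt_ids, outcomes_a, outcomes_b)

for name, ids in parts.items():
    print(name, sorted(ids))
for (row, col), count in sorted(matrix.items()):
    print(f"A={row:4s} B={col:4s} count={count}")
```

In this toy data, the off-diagonal cells of `matrix` correspond to labels on which the two models disagree, which is the kind of subset one might hand to an explainability method for closer inspection.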