Maximum Matching Accuracy: An Instance Segmentation Evaluation Metric Utilizing Globally Optimal Matching

📅 2026-06-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing instance segmentation evaluation metrics such as AP@50, PQ, SEG, and AJI are limited in biological imaging by their reliance on hard IoU thresholds, per-object normalization, and greedy or one-to-many matching, which lead to discontinuous scores, low sensitivity, and unreliable model rankings. This work proposes Maximum Matching Accuracy (MMA), a threshold-free, continuous metric that establishes a globally optimal one-to-one correspondence between predicted and ground-truth instances via the Hungarian algorithm and aggregates total overlap using per-pixel normalization. Across synthetic failure cases, progressive corruption tests, and a model ranking comparison, MMA produces scores that are more stable, more sensitive, and more interpretable than those of the existing metrics.
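To make the contrast between greedy and globally optimal matching concrete, here is a minimal sketch. The overlap matrix and the helper names `greedy_total` and `optimal_total` are hypothetical and are not taken from the paper; SciPy's `linear_sum_assignment` stands in for the Hungarian algorithm named above.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical overlap matrix (illustrative values only):
# rows = ground-truth objects, columns = predictions, entries = shared pixels.
overlap = np.array([[10, 9],
                    [9, 0]])

def greedy_total(overlap):
    """Visit rows in order; give each its best column that is still unused."""
    used, total = set(), 0
    for row in overlap:
        free = [j for j in range(len(row)) if j not in used]
        if not free:
            break
        best = max(free, key=lambda j: row[j])
        used.add(best)
        total += int(row[best])
    return total

def optimal_total(overlap):
    """Maximize total overlap over all one-to-one assignments."""
    rows, cols = linear_sum_assignment(overlap, maximize=True)
    return int(overlap[rows, cols].sum())

greedy = greedy_total(overlap)                  # 10 + 0 = 10
greedy_reversed = greedy_total(overlap[::-1])   # 9 + 9 = 18, same objects in a different order
optimal = optimal_total(overlap)                # 9 + 9 = 18

print(greedy, greedy_reversed, optimal)
```

In this toy case the greedy result depends on the order in which objects are visited, while the optimal assignment does not.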
📝 Abstract
Reliable evaluation of instance segmentation models requires metrics that accurately and consistently reflect segmentation quality. However, the metrics most widely used in biological imaging carry fundamental mathematical weaknesses: hard Intersection-over-Union (IoU) thresholds that produce discontinuous, low-sensitivity scoring; per-object normalization that distorts scores under object size variation; and greedy or one-to-many matching procedures that yield non-optimal, order-dependent correspondences. Together, these properties produce unintuitive and unreliable model rankings under common failure modes such as split cells, merged cells, and imprecise cell boundaries. We propose Maximum Matching Accuracy (MMA), a threshold-free, continuous metric that finds a globally optimal one-to-one matching between predicted and ground-truth objects and aggregates total overlap using per-pixel normalization. We evaluate MMA against AP@50, PQ, SEG, and AJI across three experiments: synthetic failure cases, progressive corruption tests, and a model ranking comparison. MMA produces scores that are more stable, more sensitive, and more interpretable than existing alternatives, providing a principled foundation for fair instance segmentation benchmarking in biological cell imaging.
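The abstract does not give the exact formula, so the following is only a rough sketch of what a metric of this kind could look like. It assumes integer label images with 0 as background, and it assumes that "per-pixel normalization" means dividing the total matched overlap by the number of pixels that are foreground in either image. That normalization, the empty-image conventions, and the function name `mma_sketch` are assumptions and may differ from the paper's definition.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def mma_sketch(gt, pred):
    """Illustrative threshold-free score; NOT the paper's confirmed formula.

    gt, pred: integer label images of equal shape, 0 = background.
    """
    gt_ids = np.unique(gt[gt > 0])
    pred_ids = np.unique(pred[pred > 0])

    # Assumed normalizer: pixels that are foreground in either image.
    union = np.count_nonzero((gt > 0) | (pred > 0))
    if union == 0:
        return 1.0  # assumption: two empty images agree perfectly
    if len(gt_ids) == 0 or len(pred_ids) == 0:
        return 0.0

    # Pixel overlap for every ground-truth / predicted object pair.
    overlap = np.zeros((len(gt_ids), len(pred_ids)), dtype=np.int64)
    for i, g in enumerate(gt_ids):
        g_mask = gt == g
        for j, p in enumerate(pred_ids):
            overlap[i, j] = np.count_nonzero(g_mask & (pred == p))

    # Globally optimal one-to-one matching that maximizes total overlap.
    rows, cols = linear_sum_assignment(overlap, maximize=True)
    return float(overlap[rows, cols].sum()) / union

# Toy 1 x 8 images: two ground-truth cells of four pixels each.
gt = np.array([[1, 1, 1, 1, 2, 2, 2, 2]])
merged = np.ones((1, 8), dtype=int)     # both cells predicted as one object

score_perfect = mma_sketch(gt, gt)      # 8 matched / 8 foreground = 1.0
score_merged = mma_sketch(gt, merged)   # only one cell can match: 4 / 8 = 0.5
score_split = mma_sketch(merged, gt)    # one cell split in two: 4 / 8 = 0.5

print(score_perfect, score_merged, score_split)
```

Under these assumptions the score has no IoU threshold, so small boundary errors change it gradually, and the one-to-one constraint means a merged or split cell forfeits the overlap of the pieces left unmatched.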
Problem

Research questions and friction points this paper is trying to address.

instance segmentation
evaluation metric
biological imaging
model ranking
segmentation quality
Innovation

Methods, ideas, or system contributions that make the work stand out.

Maximum Matching Accuracy
instance segmentation
globally optimal matching
threshold-free metric
pixel-wise normalization