🤖 AI Summary
DETR achieves end-to-end object detection but suffers from redundant competition among the learnable queries in its decoder. To address this, we propose an adaptive pairwise query routing mechanism with dual asymmetric routing paths, suppression and delegation, guided by inter-query similarity, confidence scores, and geometric relationships. We further enhance self-attention modeling with a learnable low-rank attention bias and adopt a dual-branch training strategy to optimize routing decisions. Crucially, the method incurs zero inference overhead: no additional computation is required during deployment. On COCO, our approach improves the ResNet-50-based DINO baseline by +1.7% mAP; on Cityscapes, it achieves 57.6% mAP with a Swin-L backbone, surpassing the prior state of the art. Our core contribution is the first structured query-routing framework explicitly designed to mitigate query competition, achieving simultaneous gains in efficiency and accuracy.
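To make the pairwise routing decision concrete, here is a minimal, hypothetical PyTorch sketch of how competing and complementary query pairs might be labeled from the three signals named above (feature similarity, confidence, and geometric overlap). The function name `route_queries` and the thresholds `iou_thresh` and `sim_thresh` are illustrative assumptions, not the authors' API.

```python
# Hypothetical sketch of the pairwise routing decision; names and
# thresholds are assumptions for illustration, not the paper's code.
import torch
from torchvision.ops import box_iou

def route_queries(feats, boxes, scores, iou_thresh=0.5, sim_thresh=0.8):
    """Label each ordered query pair: suppressed (-1), delegated (+1), neutral (0).

    feats:  (N, D) decoder query embeddings
    boxes:  (N, 4) predicted boxes in (x1, y1, x2, y2) format
    scores: (N,)   per-query confidence
    """
    # Inter-query feature similarity, shape (N, N).
    sim = torch.cosine_similarity(feats.unsqueeze(1), feats.unsqueeze(0), dim=-1)
    # Geometric relationship: pairwise IoU of predicted boxes, shape (N, N).
    iou = box_iou(boxes, boxes)
    competing = (iou > iou_thresh) & (sim > sim_thresh)
    # Asymmetry via confidence: only the lower-scoring member of a competing
    # pair is suppressed, so routes[i, j] may differ from routes[j, i].
    lower_conf = scores.unsqueeze(1) < scores.unsqueeze(0)
    off_diag = ~torch.eye(feats.size(0), dtype=torch.bool, device=feats.device)
    routes = torch.zeros_like(sim)
    routes[competing & lower_conf] = -1.0   # suppression path
    routes[(~competing) & off_diag] = 1.0   # delegation path
    return routes
```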
📝 Abstract
Detection Transformer (DETR) offers an end-to-end solution for object detection by eliminating hand-crafted components such as non-maximum suppression. However, DETR suffers from inefficient query competition, where multiple queries converge to similar positions and produce redundant computation. We present Route-DETR, which addresses this issue through adaptive pairwise routing in the decoder's self-attention layers. Our key insight is to distinguish competing queries (targeting the same object) from complementary queries (targeting different objects) using inter-query similarity, confidence scores, and geometric relationships. We introduce dual routing mechanisms: suppressor routes, which modulate attention between competing queries to reduce duplication, and delegator routes, which encourage exploration of different regions. Both are implemented as learnable low-rank attention biases that enable asymmetric query interactions. A dual-branch training strategy applies the routing biases only during training while preserving standard attention at inference, ensuring no additional computational cost. Experiments on COCO and Cityscapes demonstrate consistent improvements across multiple DETR baselines, including a +1.7% mAP gain over DINO with a ResNet-50 backbone and 57.6% mAP with Swin-L, surpassing prior state-of-the-art models.
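The low-rank bias and the train-only routing branch can be sketched briefly. Below is a minimal PyTorch illustration, assuming an asymmetric bias B = (QU)(KV)ᵀ added to the self-attention logits only while training, with the plain attention path used at inference; the class name, the rank, and the `self.training` toggle are assumptions for illustration, not the authors' implementation.

```python
# A minimal sketch (not the authors' code) of a learnable low-rank
# attention bias used only in the training branch of decoder self-attention.
import torch
import torch.nn as nn

class RoutedSelfAttention(nn.Module):
    def __init__(self, dim, num_heads=8, rank=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Low-rank factors U and V give a bias B = (qU)(kV)^T; since U != V,
        # B[i, j] != B[j, i] in general, enabling one-directional routing.
        self.U = nn.Linear(dim, rank, bias=False)
        self.V = nn.Linear(dim, rank, bias=False)

    def forward(self, queries):  # queries: (B, N, D)
        if self.training:
            # Training branch: inject the routing bias into attention logits.
            bias = self.U(queries) @ self.V(queries).transpose(-2, -1)  # (B, N, N)
            mask = bias.repeat_interleave(self.attn.num_heads, dim=0)   # (B*h, N, N)
            out, _ = self.attn(queries, queries, queries, attn_mask=mask)
            return out
        # Inference branch: standard self-attention, zero extra cost.
        out, _ = self.attn(queries, queries, queries)
        return out
```

Because the bias lives only in the training branch, the deployed model is architecturally identical to the baseline decoder, which is how the zero-inference-overhead claim is realized.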