G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation

📅 2026-06-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses six-degree-of-freedom (6-DoF) relative pose estimation between two image groups, with applications in cross-sequence relocalization and multi-camera rig odometry. Keeping a pretrained multi-view backbone entirely frozen, the authors add three lightweight trainable modules: a perceiver resampler, a cross-group bridge with merged self-attention, and a multi-frame pose head. The modules are supervised only by relative poses and exploit the intra-group geometry that the backbone already fuses into its visual features, so cross-group reasoning is learned without fine-tuning the backbone. On four datasets spanning indoor and outdoor simulation, real-world cross-season capture, and zero-shot sim-to-real transfer, the method reports state-of-the-art accuracy on both tasks with about 32M trainable parameters, under 6% of the full model.

📝 Abstract

Recovering the relative 6-DoF pose between two image groups underlies cross-sequence relocalization and multi-camera rig odometry. Each group carries known intra-group geometry from visual odometry or rig calibration, and pretrained multi-view backbones already fuse such geometry into visual features. Yet current models treat all views as an unstructured set, leaving cross-group reasoning as the missing piece. We introduce G2G, which keeps the foundation model entirely frozen and adds three lightweight trainable modules to bridge the two groups: a perceiver resampler, a cross-group bridge with merged self-attention, and a multi-frame pose head. The trainable footprint totals about 32M parameters, under 6% of the full model, and is supervised only by relative poses. Across four datasets that span indoor and outdoor simulation, real-world cross-season capture, and zero-shot sim-to-real transfer, G2G attains state-of-the-art accuracy on both tasks, while every baseline is retrained with its full original supervision. Code is available at https://github.com/WeiYuFei0217/G2G.
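To make the role of intra-group geometry concrete, the following is a minimal sketch, not the authors' implementation, of how known intra-group poses plus a single inter-group relative pose determine the relative pose between any cross-group frame pair. It assumes 4x4 homogeneous rigid transforms in which `T_x_y` maps points expressed in frame `y` into frame `x`, and it treats the first frame of each group as that group's anchor. The function names (`mat_mul`, `rigid_inverse`, `yaw_pose`, `cross_group_pose`) and the anchor convention are illustrative assumptions and do not come from the paper or its repository.

```python
import math


def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R | p] as [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def yaw_pose(yaw_deg, x, y, z):
    """Hypothetical helper: pose with a rotation about z and a translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [[c, -s, 0.0, float(x)],
            [s, c, 0.0, float(y)],
            [0.0, 0.0, 1.0, float(z)],
            [0.0, 0.0, 0.0, 1.0]]


def cross_group_pose(T_a0_ai, T_a0_b0, T_b0_bj):
    """Pose of frame b_j expressed in frame a_i.

    T_a0_ai: known intra-group pose of a_i in group A's anchor frame a_0.
    T_a0_b0: inter-group pose of anchor b_0 in anchor a_0 (the unknown
             that a cross-group estimator would predict).
    T_b0_bj: known intra-group pose of b_j in group B's anchor frame b_0.
    Chain: T_ai_bj = inv(T_a0_ai) @ T_a0_b0 @ T_b0_bj.
    """
    return mat_mul(mat_mul(rigid_inverse(T_a0_ai), T_a0_b0), T_b0_bj)


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    T_a0_a1 = yaw_pose(10.0, 0.5, 0.0, 0.0)   # e.g. from visual odometry
    T_b0_b2 = yaw_pose(-5.0, 0.0, 0.3, 0.0)   # e.g. from rig calibration
    T_a0_b0 = yaw_pose(45.0, 2.0, 1.0, 0.0)   # inter-group relative pose
    for row in cross_group_pose(T_a0_a1, T_a0_b0, T_b0_b2):
        print([round(v, 4) for v in row])
```

Under these assumptions, one inter-group transform fixes all cross-group pairwise poses once the intra-group poses are known. This illustrates why the two groups can be treated as structured units and not as an unstructured set of views.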

Problem

Research questions and friction points this paper is trying to address.

pose estimation

intra-group geometry

inter-group reasoning

6-DoF pose

multi-view

Innovation

Methods, ideas, or system contributions that make the work stand out.

intra-group geometry

cross-group pose estimation

frozen foundation model