Generative 6D Pose Estimation via Conditional Flow Matching

📅 2026-02-23



🤖 AI Summary

This work addresses 6D object pose estimation under pose ambiguity and outlier sensitivity, which arise particularly for symmetric objects and for objects lacking distinctive local features. The proposed method, Flose, formulates pose estimation as a conditional flow matching problem in ℝ³ and infers the pose through a generative denoising process conditioned on local features. Unlike prior conditional flow matching approaches, which denoise using geometric guidance alone, Flose also conditions on appearance-based semantic features to reduce the ambiguities caused by object symmetry. RANSAC-based registration is added to handle outliers. Evaluated on five datasets from the BOP benchmark, Flose improves Average Recall by 4.5 points on average over prior methods.
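To make the flow matching idea concrete, here is a minimal NumPy sketch of conditional flow matching on 3D points. It is an illustration under stated assumptions, not Flose's implementation: the straight-line probability path, the function names (`cfm_training_pair`, `euler_sample`, `oracle_velocity`), and the use of the target points as the conditioning signal are all hypothetical. In a real system a trained network conditioned on local features would replace `oracle_velocity`.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Straight-line path between noise x0 and data x1 (an assumed path choice).

    Returns the interpolated point at time t and the target velocity that a
    network would be trained to regress.
    """
    xt = (1.0 - t) * x0 + t * x1
    ut = x1 - x0
    return xt, ut

def euler_sample(velocity_fn, x0, cond, num_steps=10):
    """Integrate dx/dt = v(x, t, cond) from t=0 to t=1 with forward Euler."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        x = x + dt * velocity_fn(x, t, cond)
    return x

def oracle_velocity(x, t, cond):
    """Hypothetical stand-in for a learned velocity network.

    Here `cond` is simply the target points, so the field points straight at
    them. A learned model would instead consume per-point features.
    """
    return (cond - x) / (1.0 - t)

rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(128, 3))  # toy 3D points in R^3
noise = rng.standard_normal((128, 3))           # starting samples

xt, ut = cfm_training_pair(noise, target, t=0.5)
denoised = euler_sample(oracle_velocity, noise, target, num_steps=10)
```

With the oracle field, the Euler integration carries every noisy point onto its target, which is the behaviour a well-trained velocity network is meant to approximate.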

📝 Abstract

Existing methods for instance-level 6D pose estimation typically rely on neural networks that either directly regress the pose in $\mathrm{SE}(3)$ or estimate it indirectly via local feature matching. The former struggle with object symmetries, while the latter fail in the absence of distinctive local features. To overcome these limitations, we propose a novel formulation of 6D pose estimation as a conditional flow matching problem in $\mathbb{R}^3$. We introduce Flose, a generative method that infers object poses via a denoising process conditioned on local features. While prior approaches based on conditional flow matching perform denoising solely based on geometric guidance, Flose integrates appearance-based semantic features to mitigate ambiguities caused by object symmetries. We further incorporate RANSAC-based registration to handle outliers. We validate Flose on five datasets from the established BOP benchmark. Flose outperforms prior methods with an average improvement of +4.5 Average Recall. Project website: https://tev-fbk.github.io/Flose/
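The RANSAC-based registration step can be illustrated with a generic sketch that recovers a rigid pose from 3D-3D correspondences contaminated by outliers. This is a textbook RANSAC loop around a Kabsch (SVD) fit, not the paper's exact procedure: the function names, the iteration count, the inlier threshold, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~= src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def ransac_rigid(src, dst, iters=200, thresh=0.02, seed=0):
    """Fit on random 3-point samples, keep the largest consensus set, refit."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), size=3, replace=False)
        R, t = kabsch(src[idx], dst[idx])
        residuals = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = residuals < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC found no usable consensus set")
    R, t = kabsch(src[best_inliers], dst[best_inliers])
    return R, t, best_inliers

# Synthetic example: 100 model points, 30 of which get corrupted matches.
rng = np.random.default_rng(1)
model = rng.uniform(-0.1, 0.1, size=(100, 3))
R_gt = np.array([[0.0, -1.0, 0.0],
                 [1.0,  0.0, 0.0],
                 [0.0,  0.0, 1.0]])            # 90 degrees about the z axis
t_gt = np.array([0.05, -0.02, 0.5])
scene = model @ R_gt.T + t_gt
scene[:30] += rng.uniform(0.2, 0.4, size=(30, 3))  # outlier correspondences

R_est, t_est, inliers = ransac_rigid(model, scene)
```

The final refit on the consensus set is what makes the estimate insensitive to the corrupted correspondences; a plain least-squares fit over all 100 pairs would be pulled away from the true pose.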

Problem

Research questions and friction points this paper is trying to address.

6D pose estimation

object symmetries

local features

pose ambiguity

instance-level

Innovation

Methods, ideas, or system contributions that make the work stand out.

conditional flow matching

6D pose estimation

generative modeling