🤖 AI Summary
Existing 3D/4D generation and reconstruction methods perform physics-based alignment independently at each stage, so geometric misalignment accumulates across stages. This work proposes a zero-shot geometric guidance framework that explicitly integrates pose-free scene reconstruction into the generative process, enabling joint optimization of generation and reconstruction. The key contribution is the first camera-pose-agnostic dual-geometry reward function, embedded within diffusion or autoregressive generative frameworks to provide gradient-based geometric supervision, enabling end-to-end, unsupervised, zero-shot geometric alignment. Across multiple benchmarks, the method yields significant improvements in depth, surface-normal, and motion consistency: 3D structural error is reduced by 27%, and 4D temporal geometric coherence is substantially enhanced.
📝 Abstract
Recent progress in 3D/4D scene generation emphasizes the importance of physical alignment throughout video generation and scene reconstruction. However, existing methods improve alignment separately at each stage, making it difficult to correct subtle misalignments introduced in the other stage. Here, we present SteerX, a zero-shot inference-time steering method that incorporates scene reconstruction into the generation process, tilting the data distribution toward better geometric alignment. To this end, we introduce two geometric reward functions for 3D/4D scene generation that use pose-free feed-forward scene reconstruction models. Through extensive experiments, we demonstrate the effectiveness of SteerX in improving 3D/4D scene generation.
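The abstract describes reward-guided inference-time steering without detailing the mechanism. Below is a minimal, hypothetical sketch of one common form of such steering, best-of-N candidate selection under a geometric reward. The generator, reference reconstruction, and reward function here are toy stand-ins, not the actual SteerX components:

```python
# Hypothetical sketch of inference-time steering via best-of-N selection.
# All components below are toy stand-ins, not the SteerX implementation.
import random

random.seed(0)

def generate_candidates(n, size=16):
    # Stand-in for a generative model: each candidate is a flat "depth map".
    return [[random.gauss(0.0, 1.0) for _ in range(size)] for _ in range(n)]

def geometric_reward(candidate, reference):
    # Toy geometric reward: negative mean absolute depth error against a
    # reference reconstruction (a pose-free feed-forward model would
    # supply this signal in practice).
    return -sum(abs(c - r) for c, r in zip(candidate, reference)) / len(candidate)

def steer(n_candidates, reference):
    # Sample several candidates and keep the highest-reward one, tilting
    # the output distribution toward better geometric alignment.
    candidates = generate_candidates(n_candidates)
    rewards = [geometric_reward(c, reference) for c in candidates]
    best = max(range(n_candidates), key=lambda i: rewards[i])
    return candidates[best], rewards[best]

reference = [0.0] * 16       # toy "ground-truth" reconstruction
best, reward = steer(8, reference)
```

In practice, steering methods often rescore partial samples at intermediate denoising steps rather than only at the end, but the principle is the same: the reward reweights which samples survive, without retraining the generator.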