AnchorWorld: Embodied Egocentric World Simulation with View-based Evolution Customization

📅 2026-06-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited flexibility and controllability in interactive world modeling by proposing a framework centered on 3D human motion as the primary interaction modality. It enhances egocentric spatial perception through supervision from an external viewpoint and introduces a unified coordinate system that jointly leverages anchor views and textual descriptions to drive dynamic, customizable evolution of local scenes. The method significantly outperforms state-of-the-art approaches in both spatiotemporal geometric consistency and adherence to text-guided scene evolution, thereby improving the completeness and controllability of interactive modeling.
📝 Abstract
Despite being a pivotal frontier, interactive world modeling remains underexplored in terms of the versatile controllability required by practical scenarios. To bridge this gap, we present AnchorWorld, a framework that advances egocentric simulation through enhanced interaction integrity and a flexible mechanism for world customization. First, we utilize 3D human motion as the primary interaction modality. To compensate for body parts that are out of view or truncated in egocentric views, we introduce an auxiliary training supervision that incorporates exogenous viewpoints decoupled from the agent's first-person sensorium. This supervision allows the model to observe the agent's full-body positioning relative to the environment, facilitating a more robust spatial grounding of human-world interactions. Furthermore, we propose a simple yet effective mechanism for customizing self-evolving worlds. This is achieved by defining anchor views within a unified world coordinate system, coupled with textual descriptions that dictate the dynamic evolution of local scenes. Experimental results show that AnchorWorld significantly outperforms state-of-the-art baselines, while ablation studies validate the effectiveness of our key designs. Notably, our customization scheme exhibits promising spatio-temporal geometric consistency and adheres strictly to the prescribed evolutionary dynamics.
Problem

Research questions and friction points this paper is trying to address.

interactive world modeling
egocentric simulation
world customization
spatial grounding
dynamic evolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

egocentric simulation
view-based customization
anchor views
3D human motion
world modeling