Ego-Pi: VLA Fine-Tuning for Ego-Centric Human and Robot Data

📅 2026-06-06
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the scarcity of large-scale robot data, which limits manipulation capabilities, by transferring knowledge from human demonstrations to humanoid robots. It fine-tunes the vision-language-action (VLA) model π₀.₅ on egocentric human data, which is easier to collect at scale, and examines key design choices for learning across human and humanoid embodiments with dexterous five-finger hands. The results indicate that human data lets the robot learn new task semantics and compose existing skills into novel behaviors without corresponding robot data for those tasks.
📝 Abstract
Robotics faces a fundamental challenge of data scarcity. Unlike language or vision research, there is no internet-scale dataset for robotic manipulation. A promising path forward is to leverage egocentric human data, which can be collected more easily, with greater breadth, and at a larger scale. Towards this end, we investigate key design choices for learning across human and humanoid embodiments equipped with dexterous five-finger hands, using the $\pi_{0.5}$ model as a foundation. Our results show that human data enables robots to learn new task semantics and compose existing skills into novel behaviors without corresponding robot data. The paper website is here: https://egopipaper.github.io/
Problem

Research questions and friction points this paper is trying to address.

data scarcity
robotic manipulation
egocentric human data
embodiment
task semantics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Ego-centric data
VLA fine-tuning
cross-embodiment learning
robotic manipulation
human-to-robot transfer