Learning Physical Interaction Skills from Human Demonstrations

📅 2025-07-27
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the challenge of enabling morphologically dissimilar agents to learn full-body physical interaction skills—such as dancing, handshaking, and martial arts—from human demonstrations. We propose the Embedded Interaction Graph (EIG), a compact, transferable spatiotemporal representation of interaction dynamics, which enables cross-morphology, target-free, and alignment-free physical interaction imitation for the first time. Our method integrates graph neural networks with physics-based simulation, using the EIG as a unified imitation objective for policy learning. Experiments across diverse robotic platforms demonstrate successful reproduction of complex interactive tasks—including dancing, rock-paper-scissors, and handshaking—while preserving both semantic motion fidelity and physical feasibility. The approach significantly extends the applicability boundary of imitation learning to heterogeneous embodied agents, overcoming longstanding limitations in morphological generalization, manual task specification, and inter-agent kinematic alignment.
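The summary does not spell out how the Embedded Interaction Graph is constructed, so the following is only a rough illustrative sketch of the underlying idea: a per-frame graph whose nodes are key points on both interacting characters and whose cross-agent edges carry relative offsets, together with a similarity score that could serve as an imitation objective. The function names (build_interaction_graph, interaction_similarity) and the Gaussian-kernel scoring are assumptions for illustration, not the paper's actual formulation.

```python
import numpy as np

def build_interaction_graph(agent_a_points, agent_b_points):
    """Build a toy per-frame interaction graph (illustrative assumption).

    agent_a_points: (N, 3) key-point positions for agent A (e.g. hands, feet, torso).
    agent_b_points: (M, 3) key-point positions for agent B.
    Returns node positions, cross-agent edge indices, and edge features
    (relative offsets) that capture the spatial relation between the agents.
    """
    nodes = np.concatenate([agent_a_points, agent_b_points], axis=0)
    n_a = agent_a_points.shape[0]
    edges, edge_feats = [], []
    for i, p in enumerate(agent_a_points):
        for j, q in enumerate(agent_b_points):
            edges.append((i, n_a + j))      # edge from A's point i to B's point j
            edge_feats.append(q - p)        # relative offset as the edge feature
    return nodes, np.array(edges), np.array(edge_feats)

def interaction_similarity(demo_edge_feats, agent_edge_feats, sigma=0.5):
    """Gaussian-kernel similarity between demonstrated and reproduced edge
    features; could serve as a per-frame imitation reward term."""
    diff = np.linalg.norm(demo_edge_feats - agent_edge_feats, axis=-1)
    return float(np.exp(-(diff ** 2).mean() / (2 * sigma ** 2)))
```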

📝 Abstract
Learning physical interaction skills, such as dancing, handshaking, or sparring, remains a fundamental challenge for agents operating in human environments, particularly when the agent's morphology differs significantly from that of the demonstrator. Existing approaches often rely on handcrafted objectives or morphological similarity, limiting their capacity for generalization. Here, we introduce a framework that enables agents with diverse embodiments to learn whole-body interaction behaviors directly from human demonstrations. The framework extracts a compact, transferable representation of interaction dynamics, called the Embedded Interaction Graph (EIG), which captures key spatiotemporal relationships between the interacting agents. This graph is then used as an imitation objective to train control policies in physics-based simulations, allowing the agent to generate motions that are both semantically meaningful and physically feasible. We demonstrate BuddyImitation on multiple agents, such as humans, quadrupedal robots with manipulators, or mobile manipulators, and in various interaction scenarios, including sparring, handshaking, rock-paper-scissors, and dancing. Our results demonstrate a promising path toward coordinated behaviors across morphologically distinct characters via cross-embodiment interaction learning.
Problem

Research questions and friction points this paper is trying to address.

Learning physical interaction skills for diverse agent morphologies
Overcoming limitations of handcrafted objectives in imitation learning
Enabling cross-embodiment interaction via transferable dynamics representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Framework for diverse agents learning from humans
Embedded Interaction Graph captures interaction dynamics
Physics-based simulations train control policies (see the sketch below)
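To make the last two points concrete, here is a minimal reward-shaping sketch under assumed interfaces: a gym-style physics environment with reset()/step(), a hypothetical env.current_interaction_graph() helper, and a callable policy. None of these names come from the paper, and the downstream RL update is omitted.

```python
import numpy as np

def graph_imitation_reward(demo_edges, agent_edges, sigma=0.5):
    """Per-frame imitation reward: how closely the agent's current
    interaction-graph edge features match the demonstration's."""
    diff = np.linalg.norm(demo_edges - agent_edges, axis=-1)
    return float(np.exp(-(diff ** 2).mean() / (2 * sigma ** 2)))

def rollout_with_graph_reward(env, policy, demo_graphs):
    """Collect one rollout in a physics simulator, rewarding the agent for
    reproducing the demonstrated interaction graph frame by frame.

    Assumed (hypothetical) interfaces:
      env.reset() -> obs
      env.step(action) -> (obs, reward, done, info)
      env.current_interaction_graph() -> edge-feature array for the current pose
      policy(obs) -> action
    The RL update itself (e.g. a policy-gradient step) is left out of this sketch.
    """
    transitions = []
    obs = env.reset()
    for demo_edges in demo_graphs:          # iterate over demonstration frames
        action = policy(obs)
        obs, _, done, _ = env.step(action)
        agent_edges = env.current_interaction_graph()
        reward = graph_imitation_reward(demo_edges, agent_edges)
        transitions.append((obs, action, reward))
        if done:
            break
    return transitions
```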
Tianyu Li
Georgia Institute of Technology, Atlanta, 30308, GA, USA.
Hengbo Ma
Georgia Institute of Technology, Atlanta, 30308, GA, USA.
Sehoon Ha
Georgia Institute of Technology
robotics, computer graphics, machine learning
Kwonjoon Lee
Honda Research Institute USA
Machine Learning, Computer Vision, Vision & Language