LLM Trainer: Automated Robotic Data Generation via Demonstration Augmentation using LLMs

📅 2025-09-24
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of scarce human demonstration data in robotic imitation learning by proposing a fully automated framework that synthesizes large-scale, high-quality trajectory datasets from a single demonstration. Methodologically, it integrates the world knowledge and forward task-planning capabilities of large language models with keyframe extraction, object-pose relational modeling, online key-pose retargeting, and feedback control to establish a reusable offline annotation mechanism. Furthermore, it introduces a Thompson sampling-driven iterative optimization strategy to enhance trajectory generation success rates and cross-task generalization. Experiments demonstrate significant performance gains over expert-designed baselines across multiple manipulation tasks; hardware deployment is successfully validated on a Franka Emika Panda robot. The approach establishes a scalable, reproducible paradigm for few-shot robotic learning.

๐Ÿ“ Abstract
We present LLM Trainer, a fully automated pipeline that leverages the world knowledge of Large Language Models (LLMs) to transform a small number of human demonstrations (as few as one) into a large robot dataset for imitation learning. Our approach decomposes demonstration generation into two steps: (1) offline demonstration annotation that extracts keyframes, salient objects, and pose-object relations; and (2) online keypose retargeting that adapts those keyframes to a new scene, given an initial observation. Using these modified keypoints, our system warps the original demonstration to generate a new trajectory, which is then executed, and the resulting demo, if successful, is saved. Because the annotation is reusable across scenes, we use Thompson sampling to optimize the annotation, significantly improving generation success rate. We evaluate our method on a range of tasks, and find that our data annotation method consistently outperforms expert-engineered baselines. We further show an ensemble policy that combines the optimized LLM feed-forward plan with a learned feedback imitation learning controller. Finally, we demonstrate hardware feasibility on a Franka Emika Panda robot. For additional materials and demonstration videos, please see the project website: https://sites.google.com/andrew.cmu.edu/llm-trainer
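The online step described in the abstract (adapting annotated keyframes to a new scene and warping the original demonstration through them) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `warp_trajectory` and the linear blending of per-keyframe offsets are assumptions, standing in for whatever retargeting and warping the authors actually use.

```python
import numpy as np

def warp_trajectory(demo, key_idx, new_keyposes):
    """Warp a demonstration so its keyframes land on retargeted keyposes.

    demo:         (T, D) array of end-effector poses from the original demo
    key_idx:      sorted indices of the keyframes within the demo
    new_keyposes: (K, D) retargeted poses for those keyframes in the new scene

    Each keyframe's offset (new pose minus original pose) is blended
    linearly along the segments between consecutive keyframes.
    """
    demo = np.asarray(demo, dtype=float)
    offsets = np.asarray(new_keyposes, dtype=float) - demo[key_idx]
    warped = demo.copy()
    # Before the first keyframe, apply its offset uniformly.
    warped[: key_idx[0] + 1] += offsets[0]
    for k in range(len(key_idx) - 1):
        a, b = key_idx[k], key_idx[k + 1]
        alpha = np.linspace(0.0, 1.0, b - a + 1)[1:, None]  # blend weights
        warped[a + 1 : b + 1] += (1 - alpha) * offsets[k] + alpha * offsets[k + 1]
    # After the last keyframe, carry its offset forward.
    warped[key_idx[-1] + 1 :] += offsets[-1]
    return warped
```

The warped trajectory is then executed on the robot, and only rollouts that succeed are saved as new demonstrations, which keeps the synthesized dataset high-quality.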
Problem

Research questions and friction points this paper is trying to address.

Automating robotic dataset generation from minimal human demonstrations
Transforming few demonstrations into large datasets via LLM augmentation
Improving imitation learning success rates through optimized annotation reuse
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated pipeline transforms human demonstrations into robot datasets
Decomposes generation into offline annotation and online retargeting
Uses Thompson sampling to optimize reusable annotation across scenes
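Because an annotation is reused across many generation attempts, each attempt's success or failure is a noisy signal of that annotation's quality, which makes annotation selection a multi-armed bandit problem. A standard Beta-Bernoulli Thompson sampling loop for that setting is sketched below; the class name `AnnotationBandit` and the Beta(1, 1) prior are illustrative assumptions, not details from the paper.

```python
import random

class AnnotationBandit:
    """Thompson sampling over candidate annotations (Beta-Bernoulli bandit).

    Each candidate annotation keeps a Beta(wins + 1, losses + 1) posterior
    over its trajectory-generation success rate. To pick an annotation, we
    sample once from every posterior and use the candidate with the
    highest draw; observed outcomes update the corresponding posterior.
    """

    def __init__(self, n_candidates):
        self.wins = [0] * n_candidates
        self.losses = [0] * n_candidates

    def select(self):
        # One posterior sample per candidate; exploration comes for free
        # because uncertain candidates occasionally draw high values.
        draws = [random.betavariate(self.wins[i] + 1, self.losses[i] + 1)
                 for i in range(len(self.wins))]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, i, success):
        if success:
            self.wins[i] += 1
        else:
            self.losses[i] += 1
```

Over repeated generation attempts the sampler concentrates on the annotation whose posterior success rate is highest, which matches the paper's reported effect of optimizing the reusable annotation to improve generation success rate.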