🤖 AI Summary
This work addresses the challenge of autonomous robotic scooping of multiphase mixtures (liquids, granular materials, and solids). We propose the first diffusion-model-based Sim2Real generative policy framework for this task. Implemented in the NVIDIA Omniverse-driven OmniGibson simulator, our approach integrates privileged-state-guided demonstration generation, observation-driven diffusion policy learning, and domain randomization to enable zero-shot sim-to-real transfer. A key contribution is applying diffusion models to high-dimensional, nonlinear tool–object interaction modeling, enabling generalization across diverse object categories, container geometries, and quantities. Evaluated on 465 real-world trials involving objects of "Level 1" and "Level 2" difficulty, our method significantly outperforms all baselines and ablation variants, demonstrating strong generalization and practical deployability.
📝 Abstract
Scooping items with tools such as spoons and ladles is common in daily life, with applications ranging from assistive feeding to retrieving items from environmental disaster sites. However, developing a general and autonomous robotic scooping policy is challenging since it requires reasoning about complex tool-object interactions. Furthermore, scooping often involves manipulating deformable objects, such as granular media or liquids, which is difficult due to their infinite-dimensional configuration spaces and complex dynamics. We propose a method, SCOOP'D, which uses simulation in OmniGibson (built on NVIDIA Omniverse) to collect scooping demonstrations using algorithmic procedures that rely on privileged state information. We then train generative policies via diffusion to imitate these demonstrations from observational input. We directly apply the learned policy in diverse real-world scenarios, testing its performance across various item quantities, item characteristics, and container types. In zero-shot deployment, our method demonstrates promising results across 465 trials in diverse scenarios, including objects of different difficulty levels that we categorize as "Level 1" and "Level 2." SCOOP'D outperforms all baselines and ablations, suggesting that it is a promising approach to acquiring robotic scooping skills. The project page is at https://scoopdiff.github.io/.
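To make the "generative policies via diffusion" idea concrete, the sketch below shows a toy DDPM-style reverse-diffusion loop that turns Gaussian noise into a short action chunk, conditioned on an observation. This is a minimal illustration under stated assumptions, not the paper's implementation: the linear `predict_noise` function is a stand-in for the learned noise-prediction network, and the action dimension, horizon, and noise schedule are all hypothetical choices.

```python
import numpy as np

ACTION_DIM = 7   # e.g., 6-DoF end-effector pose + gripper (assumed)
HORIZON = 8      # length of the predicted action chunk (assumed)
T = 50           # number of diffusion steps (assumed)

rng = np.random.default_rng(0)

# Linear noise schedule and derived quantities, as in standard DDPM.
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def predict_noise(actions, obs, t):
    """Toy stand-in for the learned noise model eps_theta(x_t, obs, t).
    A real diffusion policy would condition a neural network on camera
    and proprioceptive observations; here we use a fixed linear map."""
    return 0.1 * actions + 0.01 * obs.sum()

def sample_actions(obs):
    """Reverse diffusion: start from pure noise and iteratively denoise
    into an action sequence for the robot to execute."""
    x = rng.standard_normal((HORIZON, ACTION_DIM))
    for t in reversed(range(T)):
        eps = predict_noise(x, obs, t)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / np.sqrt(alphas[t])
        # Add noise at every step except the final one.
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

obs = np.zeros(16)            # placeholder observation vector (assumed)
actions = sample_actions(obs) # denoised (HORIZON, ACTION_DIM) action chunk
```

In a deployed policy, this sampling loop would run repeatedly in a receding-horizon fashion, executing the first few actions of each chunk before re-observing and re-sampling.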