🤖 AI Summary
This work addresses the challenge of poor generalization in robotic imitation learning when deploying policies across heterogeneous tasks, where models tend to average over multimodal demonstrations. To overcome this, the authors propose LAR-MoE, a two-stage framework that first constructs a joint latent space between observations and future actions via teacher-student co-training, then uses this latent representation to guide unsupervised expert routing in a Mixture-of-Experts (MoE) architecture for skill decomposition and efficient policy learning. The key innovation is integrating latent-space alignment into the MoE routing mechanism, which enables expert specialization without supervision, preventing routing collapse while maintaining parameter efficiency. Experiments demonstrate that LAR-MoE achieves a 95.2% average success rate on the LIBERO benchmark with only 150 million parameters, and that it enables zero-shot transfer to ex vivo porcine tissue in a surgical bowel-grasping task, matching the performance of supervised MoE approaches.
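The first stage described above (a joint latent space between observations and future actions, learned by teacher-student co-training) can be sketched in a few lines. This is a hypothetical NumPy illustration, not the paper's implementation: the linear encoders `W_student`/`W_teacher`, the shapes, and the plain MSE agreement loss are all assumptions made for clarity, and the real method presumably adds regularization to avoid the trivial all-zero latent.

```python
import numpy as np

# Hypothetical stage-1 sketch: a student encoder maps observations and a
# teacher encoder maps future actions into one shared latent space, and
# both are trained to agree there (names/shapes are illustrative only).
rng = np.random.default_rng(0)
d_obs, d_act, d_latent = 8, 3, 4
W_student = rng.normal(size=(d_obs, d_latent)) * 0.5   # observation encoder
W_teacher = rng.normal(size=(d_act, d_latent)) * 0.5   # future-action encoder

def cotrain_step(obs, fut_act, lr=0.05):
    """One joint gradient step on the latent-agreement (MSE) loss.

    Only the alignment objective is shown; collapse-prevention terms
    from the actual method are omitted in this sketch.
    """
    global W_student, W_teacher
    z_s = obs @ W_student        # student latent from observations
    z_t = fut_act @ W_teacher    # teacher latent from future actions
    err = z_s - z_t              # disagreement in the joint latent space
    loss = float((err ** 2).mean())
    scale = 2.0 / err.size       # gradient of the mean-squared error
    W_student -= lr * scale * (obs.T @ err)
    W_teacher += lr * scale * (fut_act.T @ err)
    return loss

# Toy data: paired observations and future actions from demonstrations.
obs = rng.normal(size=(16, d_obs))
fut_act = rng.normal(size=(16, d_act))
losses = [cotrain_step(obs, fut_act) for _ in range(100)]
```

Running the loop drives the two encoders toward a shared representation; the agreement loss shrinks as the latents align.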
📝 Abstract
Imitation learning enables robots to acquire manipulation skills from demonstrations, yet deploying a policy across tasks with heterogeneous dynamics remains challenging, as models tend to average over distinct behavioral modes present in the demonstrations. Mixture-of-Experts (MoE) architectures address this by activating specialized subnetworks, but require meaningful skill decompositions for expert routing. We introduce Latent-Aligned Routing for Mixture of Experts (LAR-MoE), a two-stage framework that decouples unsupervised skill discovery from policy learning. In pre-training, we learn a joint latent representation between observations and future actions through student-teacher co-training. In a post-training stage, the expert routing is regularized to follow the structure of the learned latent space, preventing expert collapse while maintaining parameter efficiency. We evaluate LAR-MoE in simulation and on hardware. On the LIBERO benchmark, our method achieves a 95.2% average success rate with 150M parameters. On a surgical bowel grasping and retraction task, LAR-MoE matches a supervised MoE baseline without requiring any phase annotations, and transfers zero-shot to ex vivo porcine tissue. Our findings suggest that latent-aligned routing provides a principled alternative to supervised skill decomposition, enabling structured expert specialization from unlabeled demonstrations.
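The post-training idea, routing regularized to follow the learned latent structure, can also be sketched concretely. The following NumPy snippet is an illustrative assumption, not the authors' code: the linear experts, the router, the latent cluster `centroids`, and the choice of a KL penalty between the routing distribution and a soft latent cluster assignment are all stand-ins for whichever networks and regularizer the paper actually uses.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical latent-aligned MoE sketch (shapes/weights are illustrative):
# each expert is a small linear map, the router mixes their outputs, and an
# auxiliary loss pulls the routing distribution toward soft cluster
# assignments in a pretrained latent space.
rng = np.random.default_rng(0)
n_experts, d_obs, d_latent, d_act = 4, 8, 6, 3
W_router = rng.normal(size=(d_obs, n_experts)) * 0.1
experts = rng.normal(size=(n_experts, d_obs, d_act)) * 0.1
centroids = rng.normal(size=(n_experts, d_latent))  # latent cluster centers

def route(obs):
    """Mixture weights over experts, computed from the observation."""
    return softmax(obs @ W_router)

def policy(obs):
    """Gate-weighted combination of the experts' action predictions."""
    gates = route(obs)                                  # (B, E)
    acts = np.einsum('bd,eda->bea', obs, experts)       # (B, E, A)
    return np.einsum('be,bea->ba', gates, acts)         # (B, A)

def alignment_loss(obs, z):
    """KL(latent soft-assignment || router): routing follows latent structure.

    z is the pretrained latent code for each observation; minimizing this
    term specializes experts per latent cluster without phase labels.
    """
    gates = route(obs)
    dist = ((z[:, None, :] - centroids[None]) ** 2).sum(-1)  # (B, E)
    target = softmax(-dist)  # soft assignment of z to cluster centroids
    kl = (target * (np.log(target + 1e-9) - np.log(gates + 1e-9))).sum(-1)
    return float(kl.mean())
```

In training, `alignment_loss` would be added to the imitation objective so that each expert is consistently selected for one region of the latent space, which is one plausible way such a scheme could prevent expert collapse.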