🤖 AI Summary
Joint optimization of robot morphology and control policies has long been constrained by fixed reward functions, which lead to convergence on suboptimal designs and limit the emergence of diverse locomotion behaviors adapted to each morphology. This paper introduces RoboMoRe, the first large language model (LLM)-driven morphology–reward co-optimization framework, which requires no task-specific prompts or predefined templates. It performs joint optimization in two stages: (i) LLM-based generation of diverse, high-quality morphology–reward pairs; and (ii) fine-tuning of the top candidates through alternating LLM-guided reward and morphology gradient updates. The method combines LLM reasoning, reward shaping, parametric morphological modeling, and gradient-based refinement. Evaluated on eight locomotion tasks, the framework outperforms human-designed solutions and competing methods, autonomously producing high-performance morphologies together with control policies suited to them.
📝 Abstract
Robot co-design, the joint optimization of morphology and control policy, remains a longstanding challenge in the robotics community, even though many promising robots have been developed. A key limitation of existing approaches is their tendency to converge to sub-optimal designs because they rely on fixed reward functions, which fail to explore the diverse motion modes suited to different morphologies. Here we propose RoboMoRe, a large language model (LLM)-driven framework that integrates morphology and reward shaping for co-optimization within the robot co-design loop. RoboMoRe performs a dual-stage optimization. In the coarse optimization stage, an LLM-based diversity reflection mechanism generates diverse, high-quality morphology-reward pairs and efficiently explores their distribution. In the fine optimization stage, top candidates are iteratively refined through alternating LLM-guided reward and morphology gradient updates. Through reward shaping, RoboMoRe optimizes both efficient robot morphologies and the motion behaviors suited to them. Results demonstrate that, without any task-specific prompting or predefined reward/morphology templates, RoboMoRe significantly outperforms human-engineered designs and competing methods across eight different tasks.
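The control flow of the dual-stage optimization can be illustrated with a minimal Python sketch. This is not the authors' implementation: the LLM proposal step, the diversity reflection mechanism, and policy training are all replaced by toy stand-ins, and every name here (`Candidate`, `propose_pair`, `evaluate`, `refine`, `co_design`) is hypothetical. Random sampling stands in for LLM generation, a simple quadratic score stands in for training a policy and measuring task fitness, and an accept-if-better perturbation stands in for the LLM-guided updates.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    morphology: List[float]       # hypothetical parametric morphology vector
    reward_weights: List[float]   # hypothetical reward-shaping weights
    fitness: float = float("-inf")


def propose_pair(rng: random.Random) -> Candidate:
    # Assumption: random sampling stands in for the LLM proposal step.
    return Candidate([rng.uniform(0, 1) for _ in range(3)],
                     [rng.uniform(0, 1) for _ in range(2)])


def evaluate(morphology: List[float], reward_weights: List[float]) -> float:
    # Assumption: a toy score stands in for training a policy under the
    # shaped reward and measuring task fitness in simulation.
    return (-sum((m - 0.7) ** 2 for m in morphology)
            - sum((r - 0.3) ** 2 for r in reward_weights))


def refine(values: List[float], score: Callable[[List[float]], float],
           rng: random.Random, step: float = 0.05) -> List[float]:
    # Assumption: accept-if-better perturbation stands in for an
    # LLM-guided update of one half of the pair.
    trial = [min(1.0, max(0.0, v + rng.uniform(-step, step))) for v in values]
    return trial if score(trial) > score(values) else values


def co_design(n_coarse: int = 20, top_k: int = 3, n_fine: int = 30,
              seed: int = 0) -> Tuple[float, Candidate]:
    rng = random.Random(seed)

    # Coarse stage: sample many morphology-reward pairs and keep the best.
    pool = [propose_pair(rng) for _ in range(n_coarse)]
    for c in pool:
        c.fitness = evaluate(c.morphology, c.reward_weights)
    elites = sorted(pool, key=lambda c: c.fitness, reverse=True)[:top_k]
    coarse_best = elites[0].fitness

    # Fine stage: alternate reward and morphology updates on each elite.
    for c in elites:
        for _ in range(n_fine):
            c.reward_weights = refine(
                c.reward_weights, lambda r: evaluate(c.morphology, r), rng)
            c.morphology = refine(
                c.morphology, lambda m: evaluate(m, c.reward_weights), rng)
        c.fitness = evaluate(c.morphology, c.reward_weights)

    return coarse_best, max(elites, key=lambda c: c.fitness)


if __name__ == "__main__":
    coarse_best, best = co_design()
    print(f"coarse best: {coarse_best:.4f}, after fine stage: {best.fitness:.4f}")
```

Because each refinement step in this sketch only accepts improvements, the fine stage can never score below the best coarse candidate. The real framework would swap each stand-in for an LLM call or a reinforcement learning run, and nothing in the sketch should be read as describing how RoboMoRe implements those pieces.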