FSL-LVLM: Friction-Aware Safety Locomotion using Large Vision Language Model in Wheeled Robots

Date: 2024-09-15
Venue: arXiv.org
Citations: 0
Influential: 0
AI Summary
Wheeled-legged robots suffer from instability on slippery terrain due to insufficient traction, necessitating real-time friction estimation for feedforward motion control. To address this, we propose the first friction-aware framework integrating Large Vision-Language Models (LVLMs) with Reinforcement Learning (RL). Specifically, we design a Friction-From-Vision module that leverages LVLMs' zero-shot and few-shot capabilities to directly estimate ground friction coefficients from single RGB images. Crucially, we explicitly embed the estimated friction coefficient into the policy networks of PPO and SAC, enabling terrain-adaptive control without modifying reward structures or dynamics models. Evaluated on a wheeled inverted pendulum platform, our method significantly improves task success rate and reduces trajectory tracking error by 32% compared to baseline RL policies. Moreover, the framework exhibits plug-and-play compatibility, seamlessly integrating with diverse on-policy and off-policy RL algorithms without architectural retraining.

๐Ÿ“ Abstract
Wheeled-legged robots offer significant mobility and versatility but face substantial challenges when operating on slippery terrains. Traditional model-based controllers for these robots assume no slipping. While reinforcement learning (RL) helps quadruped robots adapt to different surfaces, recovering from slips remains challenging, especially for systems with few contact points. Estimating the ground friction coefficient is another open challenge. In this paper, we propose a novel friction-aware safety locomotion framework that integrates Large Vision Language Models (LVLMs) with a RL policy. Our approach explicitly incorporates the estimated friction coefficient into the RL policy, enabling the robot to adapt its behavior in advance based on the surface type before reaching it. We introduce a Friction-From-Vision (FFV) module, which leverages LVLMs to estimate ground friction coefficients, eliminating the need for large datasets and extensive training. The framework was validated on a customized wheeled inverted pendulum, and experimental results demonstrate that our framework increases the success rate in completing driving tasks by adjusting speed according to terrain type, while achieving better tracking performance compared to baseline methods. Our framework can be simply integrated with any other RL policies.
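The idea of feeding the estimated friction coefficient to the policy can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `augment_observation` and `max_safe_speed` are hypothetical, and the speed cap is a hand-written point-mass Coulomb-friction bound added only to show why a lower friction estimate should lead to a lower speed. In the paper the speed adjustment is produced by the RL policy itself.

```python
import math
from typing import Sequence

G = 9.81  # gravitational acceleration in m/s^2


def augment_observation(obs: Sequence[float], mu_hat: float) -> list[float]:
    """Append the estimated friction coefficient to the policy observation.

    Hypothetical helper: the policy then sees mu_hat as one extra input feature.
    """
    if mu_hat < 0.0:
        raise ValueError("friction coefficient must be non-negative")
    return [*obs, mu_hat]


def max_safe_speed(mu_hat: float, stop_distance_m: float) -> float:
    """Highest speed from which a point mass can stop within stop_distance_m.

    Assumes braking deceleration is limited to mu_hat * G (Coulomb friction),
    so v^2 = 2 * mu_hat * G * d. This is an illustrative bound, not the
    learned behavior described in the abstract.
    """
    return math.sqrt(2.0 * mu_hat * G * stop_distance_m)


# Usage: the same robot state paired with two different friction estimates.
state = [0.1, -0.2]                      # placeholder state features
obs_slippery = augment_observation(state, 0.1)
obs_grippy = augment_observation(state, 0.8)
cap_slippery = max_safe_speed(0.1, 1.0)  # lower cap on the slippery surface
cap_grippy = max_safe_speed(0.8, 1.0)
```

Appending the estimate to the observation leaves the reward and dynamics untouched, which is consistent with the abstract's claim that the framework can be combined with other RL policies.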
Problem

Research questions and friction points this paper is trying to address.

Control wheeled-legged robots on slippery surfaces
Predict ground friction before contact for stability
Integrate vision-language models with reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

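LVLMs estimate ground friction from images, enabling proactive control
The RL policy adapts speed using the estimated friction coefficient
Retrieval-augmented generation (RAG) improves the accuracy of coefficient of friction (CoF) prediction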
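The retrieval step behind the RAG contribution listed above can be sketched as follows. Everything here is an assumption made for illustration: the `FrictionRecord` type, the word-overlap (Jaccard) ranking used in place of embedding-based retrieval, the prompt wording, and the reference CoF values, which are placeholders and not measurements from the paper. No LVLM is called; the sketch only shows how retrieved reference surfaces could be placed in the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrictionRecord:
    surface: str  # short description of a reference surface
    cof: float    # placeholder reference friction coefficient


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,;:").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(query: str, records: list[FrictionRecord], k: int = 2) -> list[FrictionRecord]:
    """Return the k records whose descriptions overlap most with the query.

    Uses Jaccard similarity on word sets as a stand-in for embedding search.
    """
    q = tokenize(query)

    def score(record: FrictionRecord) -> float:
        s = tokenize(record.surface)
        union = q | s
        return len(q & s) / len(union) if union else 0.0

    return sorted(records, key=score, reverse=True)[:k]


def build_prompt(description: str, retrieved: list[FrictionRecord]) -> str:
    """Compose a text prompt that lists retrieved references before the query."""
    context = "\n".join(f"- {r.surface}: CoF about {r.cof:.2f}" for r in retrieved)
    return (
        "Reference surfaces:\n"
        f"{context}\n"
        f"Observed surface: {description}\n"
        "Estimate the friction coefficient of the observed surface."
    )


# Usage with placeholder records (illustrative values, not measured data).
records = [
    FrictionRecord("wet tile floor", 0.30),
    FrictionRecord("dry asphalt road", 0.80),
    FrictionRecord("ice sheet", 0.10),
]
description = "glossy wet tile in a hallway"
top = retrieve(description, records, k=1)
prompt = build_prompt(description, top)
```

The resulting prompt would be sent to the LVLM together with the image, so the model can anchor its estimate to known reference values.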
Bo Peng
Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign
D. Baek
Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign
Qijie Wang
School of Software, Tsinghua University
Joao Ramos
Department of Electrical and Computer Engineering and Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign