AI Summary
This work proposes AcceRL, a framework that addresses the computational inefficiency and high data demands of reinforcement learning (RL) for large-scale vision-language-action (VLA) models. AcceRL is the first to introduce a pluggable, trainable world model into a distributed asynchronous RL setting. By physically decoupling training, inference, and environment interaction, the framework removes the synchronization bottleneck of conventional approaches, while the world model generates synthetic experiences that markedly improve sample efficiency. AcceRL achieves state-of-the-art performance on the LIBERO benchmark: at the algorithmic level it improves training stability and sample efficiency, and at the system level it delivers super-linear throughput scaling and high hardware utilization, demonstrating both methodological and engineering advances.
Abstract
Reinforcement learning (RL) for large-scale Vision-Language-Action (VLA) models faces significant challenges in computational efficiency and data acquisition. We propose AcceRL, a fully asynchronous, decoupled RL framework that eliminates synchronization barriers by physically isolating training, inference, and environment rollouts. Crucially, AcceRL is the first to integrate a plug-and-play, trainable world model into a distributed asynchronous RL pipeline to generate virtual experiences. Experiments on the LIBERO benchmark show that AcceRL achieves state-of-the-art (SOTA) performance. At the system level, it exhibits super-linear throughput scaling and highly efficient hardware utilization; at the algorithmic level, the world-model-augmented variant delivers strong sample efficiency and robust training stability on complex control tasks.
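The decoupled pipeline described above can be illustrated with a minimal single-machine sketch: independent workers for environment rollouts and world-model rollouts feed a shared queue that an asynchronous learner drains, so no worker ever waits on another. All names here (`env_worker`, `model_worker`, `WorldModel`) and the toy dynamics are illustrative assumptions, not AcceRL's actual API; the real system distributes these roles across processes and machines.

```python
# Hypothetical sketch of AcceRL-style decoupling: real and synthetic
# experience are produced asynchronously and mixed in one replay stream.
import queue
import random
import threading

class WorldModel:
    """Toy stand-in for a learned dynamics model: predicts next state/reward."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def rollout(self, state, action):
        # A learned transition function would go here.
        next_state = tuple(s + 0.1 * action for s in state)
        return next_state, self.rng.random()

def env_worker(out_q, n_steps):
    """Rollout role: interacts with the (toy) real environment."""
    state = (0.0, 0.0)
    for t in range(n_steps):
        action = 1 if t % 2 == 0 else -1
        next_state = tuple(s + action for s in state)
        out_q.put(("real", state, action, next_state, 0.0))
        state = next_state

def model_worker(world_model, out_q, n_steps):
    """World-model role: generates virtual experience, decoupled from the env."""
    state = (0.0, 0.0)
    for _ in range(n_steps):
        next_state, reward = world_model.rollout(state, action=1)
        out_q.put(("synthetic", state, 1, next_state, reward))
        state = next_state

def learner(in_q, total):
    """Training role: consumes whatever transitions arrive, real or synthetic."""
    buffer = []
    while len(buffer) < total:
        buffer.append(in_q.get())
    return buffer

def run(n_real=50, n_synth=150):
    q = queue.Queue()
    workers = [
        threading.Thread(target=env_worker, args=(q, n_real)),
        threading.Thread(target=model_worker, args=(WorldModel(), q, n_synth)),
    ]
    for w in workers:
        w.start()
    buffer = learner(q, n_real + n_synth)
    for w in workers:
        w.join()
    return buffer
```

The synthetic-to-real ratio (here 3:1) is the knob that would drive the sample-efficiency gains the abstract claims: most of the learner's data never touches the real environment.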