Discrete Codebook World Models for Continuous Control

📅 2025-03-01

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the underexplored use of discrete latent-space world models in state-based continuous control. We propose the Discrete Codebook World Model (DCWM), which uses vector quantization (VQ) to build a discrete, stochastic latent space in which each latent state is a code from a codebook, together with a discrete hidden Markov prior that captures temporal structure. Alongside DCWM, we introduce DC-MPC, a model-predictive control algorithm designed for discrete latent dynamics. Our experiments indicate that, in state-based continuous control, discrete latent states have benefits over continuous ones and that discrete codebook encodings are more effective than one-hot and label-based alternatives. DC-MPC performs competitively with state-of-the-art methods, including TD-MPC2 and DreamerV3, on standard continuous control benchmarks.

📝 Abstract

In reinforcement learning (RL), world models serve as internal simulators, enabling agents to predict environment dynamics and future outcomes in order to make informed decisions. While previous approaches leveraging discrete latent spaces, such as DreamerV3, have demonstrated strong performance in discrete action settings and visual control tasks, their comparative performance in state-based continuous control remains underexplored. In contrast, methods with continuous latent spaces, such as TD-MPC2, have shown notable success in state-based continuous control benchmarks. In this paper, we demonstrate that modeling discrete latent states has benefits over continuous latent states and that discrete codebook encodings are more effective representations for continuous control, compared to alternative encodings, such as one-hot and label-based encodings. Based on these insights, we introduce DCWM: Discrete Codebook World Model, a self-supervised world model with a discrete and stochastic latent space, where latent states are codes from a codebook. We combine DCWM with decision-time planning to get our model-based RL algorithm, named DC-MPC: Discrete Codebook Model Predictive Control, which performs competitively against recent state-of-the-art algorithms, including TD-MPC2 and DreamerV3, on continuous control benchmarks. See our project website www.aidanscannell.com/dcmpc.
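The three encodings contrasted in the abstract can be illustrated with a small sketch. The codebook below (every combination of three levels on two latent dimensions) and all function names are assumptions made for illustration; they are not taken from the paper, and the nearest-neighbour lookup is only one simple way to assign a code.

```python
from itertools import product

# Hypothetical toy codebook: all combinations of 3 levels on 2 dimensions (9 codes).
LEVELS = (-1.0, 0.0, 1.0)
CODEBOOK = [list(code) for code in product(LEVELS, repeat=2)]


def quantize(z):
    """Return the index of the codebook entry nearest to z (squared Euclidean distance)."""
    def sq_dist(code):
        return sum((a - b) ** 2 for a, b in zip(z, code))

    return min(range(len(CODEBOOK)), key=lambda i: sq_dist(CODEBOOK[i]))


def label_encoding(index):
    """Label encoding: the bare integer index as a single number."""
    return [float(index)]


def one_hot_encoding(index):
    """One-hot encoding: one slot per code, so the size grows with the codebook."""
    return [1.0 if i == index else 0.0 for i in range(len(CODEBOOK))]


def codebook_encoding(index):
    """Codebook encoding: the code vector itself, which keeps per-dimension ordering."""
    return CODEBOOK[index]


if __name__ == "__main__":
    idx = quantize([0.9, -0.2])
    print("index:", idx)
    print("label:", label_encoding(idx))
    print("one-hot:", one_hot_encoding(idx))
    print("codebook:", codebook_encoding(idx))
```

In this toy setup the codebook vector stays two-dimensional however many codes exist, while the one-hot vector has one entry per code and the label collapses everything to a single integer whose numeric distance need not reflect similarity between codes.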

Problem

Research questions and friction points this paper is trying to address.

Explores discrete latent states' benefits in continuous control.

Compares discrete codebook encodings with other encoding methods.

Introduces DCWM and DC-MPC for model-based RL in continuous control.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Codebook World Model (DCWM) introduced

DCWM combined with decision-time planning

DC-MPC performs competitively with TD-MPC2 and DreamerV3
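As a rough illustration of decision-time planning over a discrete latent state, the sketch below uses random shooting: sample action sequences, roll each through a model, and execute only the first action of the best sequence. Random shooting is a deliberately simple stand-in and is not claimed to be the planner used in DC-MPC; the toy dynamics, the reward, and every name and constant here are hypothetical.

```python
import random

NUM_CODES = 11  # hypothetical number of discrete latent codes
GOAL = 8        # hypothetical goal code


def toy_dynamics(code, action):
    """Hypothetical latent transition: an action in [-1, 1] shifts the code index."""
    step = round(2 * action)
    return max(0, min(NUM_CODES - 1, code + step))


def toy_reward(code):
    """Hypothetical reward: negative distance to the goal code."""
    return -abs(code - GOAL)


def plan(code, horizon=5, num_samples=256, rng=None):
    """Random-shooting planner: return the first action of the best sampled sequence."""
    if rng is None:
        rng = random.Random(0)
    best_return, best_first_action = float("-inf"), 0.0
    for _ in range(num_samples):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        current, total = code, 0.0
        for action in actions:
            current = toy_dynamics(current, action)
            total += toy_reward(current)
        if total > best_return:
            best_return, best_first_action = total, actions[0]
    return best_first_action


def run_episode(start=2, steps=10, seed=0):
    """Replan at every step and apply only the first planned action (the MPC loop)."""
    rng = random.Random(seed)
    code = start
    for _ in range(steps):
        code = toy_dynamics(code, plan(code, rng=rng))
    return code


if __name__ == "__main__":
    print("final code:", run_episode())
```

Replanning at every step is what makes this model-predictive control: the rest of each sampled sequence is discarded, so errors in the later part of a rollout matter less than they would in open-loop execution.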

🔎 Similar Papers

PWM: Policy Learning with Multi-Task World Models