Efficient Reinforcement Learning Through Adaptively Pretrained Visual Encoder

📅 2025-02-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
In visual reinforcement learning, existing visual encoders generalize poorly and transfer weakly across environments. To address this, we propose the Adaptively Pretrained visual Encoder (APE) framework: during pretraining, it introduces dynamic data augmentation and a joint contrastive-reconstruction objective, enabling the first adaptive selection of pretraining duration and strategy, departing from rigid, fixed paradigms. A lightweight fine-tuning mechanism is further designed to integrate seamlessly with leading RL algorithms such as DreamerV3 and DrQ-v2. APE significantly enhances cross-task generalization of visual representations, achieving state-of-the-art performance on the DeepMind Control Suite, Atari, and Memory Maze. Moreover, with purely visual input, its sample efficiency approaches that of state-based methods, substantially reducing the number of environment interactions required.
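The page does not reproduce the paper's actual objective, but the "joint contrastive-reconstruction objective" mentioned above can be illustrated with a minimal NumPy sketch. Everything here is hypothetical, not the authors' implementation: an InfoNCE term pulls together embeddings of two augmented views of the same observation, an MSE term scores pixel reconstruction, and `alpha` (an assumed weight) mixes the two.

```python
import numpy as np

def info_nce(z1, z2, temperature=0.1):
    """Contrastive (InfoNCE) loss between two batches of view embeddings.

    Row i of z1 and row i of z2 are embeddings of augmented views of the
    same observation (the positive pair); all other rows act as negatives.
    """
    # Normalize so the dot product is cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature             # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; minimize their negative log-prob.
    return -np.mean(np.diag(log_prob))

def joint_loss(z1, z2, recon, target, alpha=0.5):
    """Weighted sum of contrastive and pixel-reconstruction terms."""
    contrastive = info_nce(z1, z2)
    reconstruction = np.mean((recon - target) ** 2)
    return alpha * contrastive + (1 - alpha) * reconstruction

rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 32))
z2 = z1 + 0.01 * rng.normal(size=(8, 32))  # embedding of an augmented view
recon = rng.normal(size=(8, 64))
target = recon.copy()                      # perfect reconstruction: MSE = 0
loss = joint_loss(z1, z2, recon, target)
```

With aligned view pairs the contrastive term is near zero, so the joint loss is dominated by reconstruction quality; the actual balance used by APE is not stated on this page.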

📝 Abstract
While Reinforcement Learning (RL) agents can successfully learn to handle complex tasks, effectively generalizing acquired skills to unfamiliar settings remains a challenge. One reason is that the visual encoders used are task-dependent, preventing effective feature extraction in different settings. To address this issue, recent studies have tried to pretrain encoders with diverse visual inputs in order to improve their performance. However, they rely on existing pretrained encoders without further exploring the impact of the pretraining period. In this work, we propose APE: efficient reinforcement learning through Adaptively Pretrained visual Encoder -- a framework that utilizes an adaptive augmentation strategy during the pretraining phase and extracts generalizable features with only a few interactions within the task environments during the policy learning period. Experiments are conducted across various domains, including the DeepMind Control Suite, Atari Games, and Memory Maze benchmarks, to verify the effectiveness of our method. Results show that mainstream RL methods, such as DreamerV3 and DrQ-v2, achieve state-of-the-art performance when equipped with APE. In addition, APE significantly improves sampling efficiency using only visual inputs during learning, approaching the efficiency of state-based methods in several control tasks. These findings demonstrate the potential of adaptive encoder pretraining for enhancing the generalization ability and efficiency of visual RL algorithms.
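The abstract highlights the pretraining period as a key variable but does not spell out how its duration is chosen. One plausible mechanism, sketched here purely as an illustration (the function, thresholds, and loss curve below are hypothetical, not taken from the paper), is patience-based early stopping on a held-out pretraining loss:

```python
def adaptive_pretrain_steps(losses, patience=3, min_delta=1e-3):
    """Return the step index at which pretraining would stop.

    Stops once the held-out loss has failed to improve by at least
    `min_delta` for `patience` consecutive steps; otherwise runs to the end.
    """
    best = float("inf")
    stale = 0
    for step, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss   # meaningful improvement: reset the counter
            stale = 0
        else:
            stale += 1
        if stale >= patience:
            return step
    return len(losses) - 1

# A loss curve that improves and then plateaus: with patience=3, training
# stops three steps into the plateau instead of running the full schedule.
curve = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4]
stop = adaptive_pretrain_steps(curve)
```

Any adaptive rule of this shape trades a small risk of stopping early against the wasted compute of a fixed, worst-case pretraining schedule.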
Problem

Research questions and friction points this paper is trying to address.

Enhancing generalization in Reinforcement Learning
Improving visual encoder pretraining efficiency
Adapting visual inputs for diverse task environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Pretrained visual Encoder
Efficient feature extraction
Improved sampling efficiency
Yuhan Zhang
Laboratory of Brain Atlas and Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences; School of Artificial Intelligence, University of Chinese Academy of Sciences
Guoqing Ma
Laboratory of Brain Atlas and Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences; School of Future Technology, University of Chinese Academy of Sciences
Guangfu Hao
Laboratory of Brain Atlas and Brain-inspired Intelligence, Institute of Automation, CAS
Computational Neuroscience · Brain-Inspired Neural Networks · Large Language Models · Cognitive Models
Liangxuan Guo
Institute of Automation, Chinese Academy of Sciences
Brain-inspired Computing · Artificial Intelligence
Yang Chen
Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, Chinese Academy of Sciences
Shan Yu
Laboratory of Brain Atlas and Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences; School of Future Technology, University of Chinese Academy of Sciences; Key Laboratory of Brain Cognition and Brain-inspired Intelligence Technology, Chinese Academy of Sciences