Deep Active Inference Agents for Delayed and Long-Horizon Environments

📅 2025-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
Active Inference Framework (AIF) agents suffer from inaccurate prediction, inefficient planning, and reliance on hand-crafted rewards in industrial settings characterized by delayed feedback and hundred-step horizons. Method: We propose an end-to-end probabilistic agent framework that integrates a differentiable world model with the active inference paradigm. Our approach introduces three key innovations: (i) multi-step latent transition prediction, (ii) joint optimization of generative policies, and (iii) single-step gradient-driven long-horizon planning. The world model is implemented via a variational autoencoder, trained through alternating optimization and backpropagation of expected free energy gradients. Results: Experiments in simulated industrial environments with significant action–observation delays demonstrate substantial improvements over baselines. Our method achieves robust long-horizon control without hand-engineered rewards and with low computational overhead; planning efficiency improves by two orders of magnitude. To our knowledge, this is the first successful application of AIF to high-complexity, delayed, long-horizon industrial control tasks.
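The single-step gradient-driven planner can be sketched in miniature. Everything below is an illustrative stand-in, not the paper's learned VAE world model: the dynamics are linear, the preferred latent is the origin, and the expected free energy is replaced by its quadratic "risk" term. The point the sketch makes is the mechanism itself: backpropagating a horizon-wide objective through the rollout lets one gradient update refine the entire action sequence at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear latent dynamics z_{t+1} = A z_t + B a_t, standing in
# for the learned multi-step transition; A, B, z0, and the preferred
# latent z_pref are illustrative, not taken from the paper.
dim_z, dim_a, horizon = 4, 2, 50
A = np.eye(dim_z) * 0.95
B = rng.normal(scale=0.1, size=(dim_z, dim_a))
z0 = rng.normal(size=dim_z)
z_pref = np.zeros(dim_z)                  # preferred (goal) latent state

def rollout(actions):
    """Unroll the latent transition over the full planning horizon."""
    z, zs = z0, []
    for a in actions:
        z = A @ z + B @ a
        zs.append(z)
    return np.stack(zs)

def efe_surrogate(actions):
    """Quadratic 'risk' stand-in for the expected free energy: squared
    distance of every predicted latent from the preferred latent."""
    return 0.5 * np.sum((rollout(actions) - z_pref) ** 2)

def efe_grad(actions):
    """Gradient of the surrogate w.r.t. the whole action sequence,
    computed with a reverse (adjoint) pass through the rollout."""
    zs = rollout(actions)
    grad = np.zeros_like(actions)
    adj = np.zeros(dim_z)
    for t in reversed(range(horizon)):
        adj = (zs[t] - z_pref) + A.T @ adj   # accumulate dL/dz_t
        grad[t] = B.T @ adj                  # dL/da_t
    return grad

# Planning = ONE gradient step on the full 50-step action sequence.
init_plan = np.zeros((horizon, dim_a))
plan = init_plan - 0.01 * efe_grad(init_plan)
```

Because the objective covers the whole horizon, a single update already lowers the surrogate; an exhaustive search over 50-step action sequences never enters the control loop.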

📝 Abstract
With the recent success of world-model agents, which extend the core idea of model-based reinforcement learning by learning a differentiable model for sample-efficient control across diverse tasks, active inference (AIF) offers a complementary, neuroscience-grounded paradigm that unifies perception, learning, and action within a single probabilistic framework powered by a generative model. Despite this promise, practical AIF agents still rely on accurate immediate predictions and exhaustive planning, a limitation that is exacerbated in delayed environments that require planning over long horizons of tens to hundreds of steps. Moreover, most existing agents are evaluated on robotic or vision benchmarks which, while natural for biological agents, fall short of real-world industrial complexity. We address these limitations with a generative-policy architecture featuring (i) a multi-step latent transition that lets the generative model predict an entire horizon in a single look-ahead, (ii) an integrated policy network that drives the transition and receives gradients of the expected free energy, (iii) an alternating optimization scheme that updates model and policy from a replay buffer, and (iv) a single gradient step that plans over long horizons, eliminating exhaustive planning from the control loop. We evaluate our agent in an environment that mimics a realistic industrial scenario with delayed and long-horizon settings. The empirical results confirm the effectiveness of the proposed approach, demonstrating that coupling the world model with the AIF formalism yields an end-to-end probabilistic controller capable of effective decision making in delayed, long-horizon settings without handcrafted rewards or expensive planning.
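The alternating scheme in (iii) can be illustrated with a deliberately simple stand-in: a linear latent transition refit by least squares from a frozen replay buffer, alternated with one gradient step on a linear policy against a one-step quadratic surrogate of the expected free energy. All dynamics, dimensions, and names here are hypothetical; the paper's agent uses a VAE world model and a neural policy, and its buffer grows as the agent acts.

```python
import numpy as np

rng = np.random.default_rng(1)
dz, da = 3, 2

# Hypothetical ground-truth latent dynamics, used only to fill the buffer.
A_true = np.eye(dz) * 0.9
B_true = rng.normal(scale=0.2, size=(dz, da))

# Frozen replay buffer of (z, a, z') transitions.
Z = rng.normal(size=(256, dz))
A_acts = rng.normal(size=(256, da))
Z_next = Z @ A_true.T + A_acts @ B_true.T

K = np.zeros((da, dz))                     # linear policy a = K z
for _ in range(5):                         # alternate model and policy phases
    # Model phase: least-squares fit of the transition z' = A z + B a.
    X = np.hstack([Z, A_acts])
    W, *_ = np.linalg.lstsq(X, Z_next, rcond=None)
    A_hat, B_hat = W[:dz].T, W[dz:].T
    # Policy phase: one gradient step on a one-step quadratic EFE surrogate
    # risk(K) = 0.5 * mean ||A_hat z + B_hat K z||^2 (preferred latent = 0).
    Z_pred = Z @ A_hat.T + (Z @ K.T) @ B_hat.T
    grad_K = B_hat.T @ (Z_pred.T @ Z) / len(Z)
    K -= 0.1 * grad_K
```

The two phases share nothing but the buffer and the current model: the model phase ignores the policy, and the policy phase treats the freshly fitted model as fixed when it takes its gradient step.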
Problem

Research questions and friction points this paper is trying to address.

Overcoming reliance on immediate predictions in delayed environments
Addressing limitations in long-horizon planning for AIF agents
Enhancing real-world industrial applicability of neuroscience-grounded AIF
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-step latent transition for horizon prediction
Integrated policy network with gradient optimization
Single gradient step for long-horizon planning
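The first innovation above, a transition that emits the whole horizon in one look-ahead, can be sketched as a one-shot "horizon head". The two-layer tanh MLP, its widths, and the untrained weights below are illustrative assumptions, not the paper's architecture; what matters is the interface: initial latent plus full action sequence in, all H future latents out, with no sequential rollout.

```python
import numpy as np

rng = np.random.default_rng(2)
dz, da, H = 4, 2, 100

# Hypothetical (untrained) weights of a one-shot horizon predictor.
W1 = rng.normal(scale=0.1, size=(dz + H * da, 64))
W2 = rng.normal(scale=0.1, size=(64, H * dz))

def predict_horizon(z0, actions):
    """Map the initial latent plus the FULL action sequence to all H
    future latents in a single forward pass (no step-by-step rollout)."""
    x = np.concatenate([z0, actions.ravel()])
    h = np.tanh(x @ W1)
    return (h @ W2).reshape(H, dz)

latents = predict_horizon(rng.normal(size=dz), rng.normal(size=(H, da)))
```

The contrast with a recurrent world model is the point: a step-by-step transition needs H sequential applications (and H backward steps when planning gradients flow through it), while this head amortizes the entire look-ahead into one forward and one backward pass.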