From Pheromones to Policies: Reinforcement Learning for Engineered Biological Swarms

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Engineered biological swarms (e.g., nematode groups) suffer from positive-feedback lock-in caused by persistent pheromone trails in dynamic environments, which degrades adaptability. Method: We model pheromone-mediated aggregation as a distributed reinforcement learning process, establishing, for the first time, a mathematical equivalence between pheromone dynamics and the cross-learning algorithm. We introduce "exploratory individuals" that are insensitive to pheromones to break path dependence. Integrating synthetic biology, swarm robotics, and multi-armed bandit simulations, and validating against empirical data, our model reproduces nematode foraging patterns in static settings and enables rapid task switching after environmental perturbations. Contribution/Results: We show how environmental signals function as implicit, distributed rewards and identify the principles by which they regulate collective behaviour. We further propose a transferable design paradigm for adaptive collective intelligence that substantially improves the flexibility and robustness of group-level decision-making in dynamic environments.

📝 Abstract
Swarm intelligence emerges from decentralised interactions among simple agents, enabling collective problem-solving. This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL), demonstrating how stigmergic signals function as distributed reward mechanisms. We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates, a fundamental RL algorithm. Experimental validation with data from the literature confirms that our model accurately replicates empirical C. elegans foraging patterns under static conditions. In dynamic environments, persistent pheromone trails create positive feedback loops that hinder adaptation by locking swarms into obsolete choices. Through computational experiments in multi-armed bandit scenarios, we reveal that introducing a minority of exploratory agents insensitive to pheromones restores collective plasticity, enabling rapid task switching. This behavioural heterogeneity balances exploration-exploitation trade-offs, implementing swarm-level extinction of outdated strategies. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment. By bridging synthetic biology with swarm robotics, this work advances programmable living systems capable of resilient decision-making in volatile environments.
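The equivalence claimed in the abstract can be sketched with the standard cross-learning rule: the chosen action's probability grows in proportion to the received reward (analogous to pheromone deposition on the active trail), while the other actions' probabilities shrink multiplicatively (relative decay), so the distribution stays normalised. The following Python sketch is illustrative only, not the paper's implementation:

```python
def cross_learning_update(probs, chosen, reward):
    """Cross-learning update: the chosen action's probability moves toward 1
    in proportion to the reward, while the other actions shrink
    multiplicatively, keeping the distribution normalised.
    `reward` is assumed to lie in [0, 1]."""
    return [p + reward * (1 - p) if i == chosen else p * (1 - reward)
            for i, p in enumerate(probs)]

# Repeated reward on one option locks the distribution onto it -- the same
# positive-feedback dynamic that traps pheromone-following swarms on
# obsolete trails once the environment changes.
probs = [0.5, 0.5]
for _ in range(20):
    probs = cross_learning_update(probs, chosen=0, reward=0.3)
print(probs)
```

Note how the update preserves the probability simplex exactly: the mass removed from unchosen actions equals the mass added to the chosen one, which is why a self-reinforcing trail concentrates all probability on a single option.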
Problem

Research questions and friction points this paper is trying to address.

Modeling pheromone-mediated aggregation in nematodes as reinforcement learning processes
Addressing swarm adaptation failures in dynamic environments due to persistent pheromones
Developing heterogeneous swarm strategies for balancing exploration-exploitation trade-offs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modeling pheromone dynamics as reinforcement learning updates
Introducing exploratory agents to restore swarm adaptability
Using environmental signals as external collective memory
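The three ideas above can be combined in a toy two-armed bandit in which a shared, persistent pheromone trail serves as the collective memory and a minority of pheromone-insensitive explorers restores plasticity after the rewarded arm flips. All agent counts, deposit rules, and parameters below are invented for illustration and are not taken from the paper:

```python
import random

def run(explorer_frac, n_agents=100, steps=300, switch=150, seed=0):
    """Toy two-armed bandit: the rewarded arm flips from 0 to 1 at `switch`.
    Followers choose arms in proportion to a shared, persistent pheromone
    trail; explorers ignore the trail and choose uniformly at random.
    Returns the fraction of choices on the new best arm over the last
    50 steps. Parameters are illustrative, not from the paper."""
    rng = random.Random(seed)
    pher = [1.0, 1.0]                 # persistent trail (no evaporation)
    n_explorers = int(explorer_frac * n_agents)
    late = [0, 0]                     # choice counts over the final 50 steps
    for t in range(steps):
        best = 0 if t < switch else 1  # the environment changes at `switch`
        deposits = [0.0, 0.0]
        for a in range(n_agents):
            if a < n_explorers:        # pheromone-insensitive explorer
                arm = rng.randrange(2)
            else:                      # follower: trail-proportional choice
                arm = 0 if rng.random() < pher[0] / (pher[0] + pher[1]) else 1
            if arm == best:            # reward => deposit on the chosen arm
                deposits[arm] += 1.0
            if t >= steps - 50:
                late[arm] += 1
        pher = [p + d for p, d in zip(pher, deposits)]
    return late[1] / sum(late)

print(run(0.0))   # homogeneous swarm stays locked on the obsolete arm
print(run(0.1))   # a 10% exploratory minority re-seeds the abandoned arm
```

With no explorers, the pheromone accumulated before the switch keeps followers on the obsolete arm almost indefinitely. The exploratory minority keeps sampling the other arm regardless of the trail, and their deposits restart positive feedback on the newly rewarded choice, which is the swarm-level "extinction" mechanism the abstract describes.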
Aymeric Vellinger
Department of Computer Science, University of Namur, Namur, Belgium
Nemanja Antonic
Department of Computer Science, University of Namur, Namur, Belgium
Elio Tuci
University of Namur
Evolutionary Robotics · Swarm Robotics · Artificial Life