From Pheromones to Policies: Reinforcement Learning for Engineered Biological Swarms

📅 2025-09-24

🤖 AI Summary

Problem: Engineered biological swarms (e.g., nematode groups) suffer from positive-feedback lock-in in dynamic environments because pheromone trails persist, which degrades adaptability. Method: We model pheromone-mediated aggregation as a distributed reinforcement learning process and establish, for the first time, a mathematical equivalence between pheromone dynamics and the cross-learning algorithm. We introduce “exploratory individuals” that are insensitive to pheromones to break path dependence. Drawing on synthetic biology, swarm robotics, and multi-armed bandit simulations, and validated against empirical data, our model reproduces nematode foraging patterns in static settings and enables rapid task switching after environmental perturbations. Contribution/Results: We show how environmental signals function as implicit, distributed rewards and identify the principles that regulate them. We also propose a transferable design paradigm for adaptive collective intelligence that improves the flexibility and robustness of group-level decision-making in dynamic environments.

📝 Abstract

Swarm intelligence emerges from decentralised interactions among simple agents, enabling collective problem-solving. This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL), demonstrating how stigmergic signals function as distributed reward mechanisms. We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates, a fundamental RL algorithm. Experimental validation with data from the literature confirms that our model accurately replicates empirical C. elegans foraging patterns under static conditions. In dynamic environments, persistent pheromone trails create positive feedback loops that hinder adaptation by locking swarms into obsolete choices. Through computational experiments in multi-armed bandit scenarios, we reveal that introducing a minority of exploratory agents insensitive to pheromones restores collective plasticity, enabling rapid task switching. This behavioural heterogeneity balances the exploration-exploitation trade-off, implementing swarm-level extinction of outdated strategies. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment. By bridging synthetic biology with swarm robotics, this work advances programmable living systems capable of resilient decision-making in volatile environments.
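To make the claimed correspondence concrete, here is a minimal Python sketch. It is not the paper's model, whose equations are not reproduced on this page. It uses the textbook cross-learning rule and a deliberately simple, assumed pheromone scheme in which an agent deposits a fixed amount at the chosen site and choice probabilities are the normalised pheromone levels. The function names and the deposit scheme are illustrative assumptions. Under them, one deposit is algebraically the same as a cross-learning step with reward `deposit / (total + deposit)`.

```python
def cross_learning_update(probs, action, reward):
    """Textbook cross-learning step: shift probability mass towards the
    chosen action in proportion to a reward in [0, 1]."""
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must lie in [0, 1]")
    return [
        p + reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def deposit_and_normalise(pheromone, site, deposit):
    """Assumed pheromone scheme (illustrative only): add `deposit` at `site`,
    then read choice probabilities off the normalised pheromone levels."""
    updated = list(pheromone)
    updated[site] += deposit
    total = sum(updated)
    return [tau / total for tau in updated]


# Two sites with equal pheromone; an agent deposits one unit at site 0.
pheromone = [2.0, 2.0]
deposit = 1.0
total = sum(pheromone)
probs = [tau / total for tau in pheromone]

via_pheromone = deposit_and_normalise(pheromone, site=0, deposit=deposit)
via_cross = cross_learning_update(probs, action=0, reward=deposit / (total + deposit))
print(via_pheromone, via_cross)
```

In this toy setting the normalised pheromone field plays the role of the policy, and the deposit size relative to the existing pheromone plays the role of the reward signal.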

Problem

Research questions and friction points this paper is trying to address.

Modeling pheromone-mediated aggregation in nematodes as reinforcement learning processes

Addressing swarm adaptation failures in dynamic environments due to persistent pheromones

Developing heterogeneous swarm strategies for balancing exploration-exploitation trade-offs

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modeling pheromone dynamics as reinforcement learning updates

Introducing exploratory agents to restore swarm adaptability

Using environmental signals as external collective memory
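The effect of exploratory agents listed above can be illustrated with a small deterministic (mean-field) two-armed bandit in Python. This is a sketch under stated assumptions, not the paper's simulation: the reward probabilities, evaporation rate, step counts, linear pheromone response, and the function name `pheromone_share_after_switch` are all hypothetical choices made for illustration. Followers choose arms in proportion to pheromone, explorers choose uniformly, and the reward probabilities swap partway through the run.

```python
def pheromone_share_after_switch(explorer_fraction, steps_before=200,
                                 steps_after=30, evaporation=0.1):
    """Return the pheromone share on the arm that becomes best after the switch.

    Mean-field sketch: expected visits and deposits replace random sampling.
    All parameter values are illustrative assumptions.
    """
    rewards_before = [0.9, 0.1]  # arm 0 is best at first
    rewards_after = [0.1, 0.9]   # arm 1 is best after the switch
    pheromone = [1.0, 1.0]
    n_arms = len(pheromone)

    for step in range(steps_before + steps_after):
        rewards = rewards_before if step < steps_before else rewards_after
        total = sum(pheromone)
        updated = []
        for tau, reward in zip(pheromone, rewards):
            # Followers track pheromone; explorers ignore it and pick uniformly.
            visits = ((1.0 - explorer_fraction) * tau / total
                      + explorer_fraction / n_arms)
            # Evaporation plus expected rewarded deposits at this arm.
            updated.append((1.0 - evaporation) * tau + visits * reward)
        pheromone = updated

    return pheromone[1] / sum(pheromone)


share_followers_only = pheromone_share_after_switch(explorer_fraction=0.0)
share_with_explorers = pheromone_share_after_switch(explorer_fraction=0.2)
print(share_followers_only, share_with_explorers)
```

With these assumed parameters, the followers-only swarm still has almost all of its pheromone on the obsolete arm 30 steps after the switch, while the swarm with 20% explorers has already moved most of its pheromone to the new best arm. Because the pheromone response here is linear, the followers-only swarm recovers slowly rather than locking in permanently; a stronger nonlinearity would make the lock-in more pronounced.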
