🤖 AI Summary
Engineered biological swarms (e.g., nematode groups) suffer from positive-feedback locking in dynamic environments: persistent pheromone trails keep the swarm committed to obsolete choices, degrading adaptability. Method: We model pheromone-mediated aggregation as a distributed reinforcement learning process, establishing, for the first time, a mathematical equivalence between pheromone dynamics and the cross-learning algorithm. We introduce “exploratory individuals” that are insensitive to pheromones to break path dependence. Combining synthetic biology, swarm robotics, and multi-armed bandit simulations, and validated against empirical data from the literature, our model reproduces nematode foraging patterns in static settings and enables rapid task switching after environmental perturbations. Contribution/Results: We show how environmental signals function as implicit, distributed rewards and clarify the principles by which they regulate collective behaviour. We also propose a transferable design paradigm for adaptive collective intelligence that improves the flexibility and robustness of group-level decision-making in dynamic environments.
📝 Abstract
Swarm intelligence emerges from decentralised interactions among simple agents, enabling collective problem-solving. This study establishes a theoretical equivalence between pheromone-mediated aggregation in C. elegans and reinforcement learning (RL), demonstrating how stigmergic signals function as distributed reward mechanisms. We model engineered nematode swarms performing foraging tasks, showing that pheromone dynamics mathematically mirror cross-learning updates, a fundamental RL algorithm. Experimental validation with data from the literature confirms that our model accurately replicates empirical C. elegans foraging patterns under static conditions. In dynamic environments, persistent pheromone trails create positive feedback loops that hinder adaptation by locking swarms into obsolete choices. Through computational experiments in multi-armed bandit scenarios, we reveal that introducing a minority of exploratory agents insensitive to pheromones restores collective plasticity, enabling rapid task switching. This behavioural heterogeneity balances exploration-exploitation trade-offs, implementing swarm-level extinction of outdated strategies. Our results demonstrate that stigmergic systems inherently encode distributed RL processes, where environmental signals act as external memory for collective credit assignment. By bridging synthetic biology with swarm robotics, this work advances programmable living systems capable of resilient decision-making in volatile environments.
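
To make the mechanism concrete, the following is a minimal sketch of a cross-learning update driving a two-armed bandit, with a minority of agents that ignore the shared signal. It is not the paper's model. It assumes that the normalised pheromone distribution over food patches can be treated as a single shared policy, that rewards are Bernoulli, and that exploratory agents choose uniformly at random. The function names (`cross_learning_update`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def cross_learning_update(policy, action, reward, alpha=0.1):
    """One cross-learning step.

    Probability mass moves toward the chosen action in proportion to the
    reward: p_i <- p_i + alpha * reward * (1[i == action] - p_i).
    The update keeps the probabilities summing to 1.
    """
    return [
        p + alpha * reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(policy)
    ]


def simulate(n_agents=50, explorer_fraction=0.1, steps=400,
             switch_at=200, alpha=0.02, seed=0):
    """Shared-policy swarm on a two-armed bandit whose rewards swap once.

    Assumption: `policy` stands in for the normalised pheromone signal.
    Pheromone-sensitive agents sample from it; explorers ignore it.
    Every agent's outcome updates the shared policy.
    """
    rng = random.Random(seed)
    reward_probs = [0.9, 0.1]   # hypothetical payoff of each food patch
    policy = [0.5, 0.5]
    n_explorers = round(n_agents * explorer_fraction)
    arms = range(len(policy))

    for t in range(steps):
        if t == switch_at:
            # Environmental perturbation: the good patch becomes the bad one.
            reward_probs = reward_probs[::-1]
        for agent in range(n_agents):
            if agent < n_explorers:
                action = rng.randrange(len(policy))            # pheromone-insensitive
            else:
                action = rng.choices(arms, weights=policy)[0]  # follows the signal
            reward = 1.0 if rng.random() < reward_probs[action] else 0.0
            policy = cross_learning_update(policy, action, reward, alpha)
    return policy


if __name__ == "__main__":
    print("no explorers:  ", simulate(explorer_fraction=0.0))
    print("10% explorers: ", simulate(explorer_fraction=0.1))
```

In this toy setting, a swarm without explorers would be expected to stay committed to the first patch after the switch, because the probability of sampling the other patch has decayed to almost nothing, while the mixed swarm keeps sampling both patches and can shift its policy. The exact numbers depend on the assumed parameters and the random seed.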