🤖 AI Summary
In sparse-reward settings, agents struggle to autonomously discover and exploit task-relevant, structured inter-object relationships for learning generalizable relational policies.
Method: We propose a novel Relational Reinforcement Learning (RRL) framework that tightly integrates symbolic function approximators with RRL, enabling joint, incremental selection of relational representations and co-evolution of policies. Using Atari environments (Breakout, Pong, Demon Attack), we design relation-aware state encoding and policy learning mechanisms—without handcrafted relational priors.
Contribution/Results: The agent automatically identifies task-critical relations and achieves performance on par with strong baselines. Crucially, the learned relational representations exhibit cross-task transferability, supporting zero-shot adaptation to new tasks. This provides an interpretable, generalizable modeling pathway toward human-like, structure-aware decision-making—marking the first work to unify symbolic function approximation with RRL for emergent relational abstraction.
📝 Abstract
Humans perceive the world in terms of objects and relations between them. In fact, for any given pair of objects, there is a myriad of relations that apply to them. How does the cognitive system learn which relations are useful to characterize the task at hand? And how can it use these representations to build a relational policy to interact effectively with the environment? In this paper we propose that this problem can be understood through the lens of a sub-field of symbolic machine learning called relational reinforcement learning (RRL). To demonstrate the potential of our approach, we build a simple model of relational policy learning based on a function approximator developed in RRL. We trained and tested our model in three Atari games that required to consider an increasingly number of potential relations: Breakout, Pong and Demon Attack. In each game, our model was able to select adequate relational representations and build a relational policy incrementally. We discuss the relationship between our model with models of relational and analogical reasoning, as well as its limitations and future directions of research.