Knowledge Graph Reasoning With Self-Supervised Reinforcement Learning

📅 2024-05-22
🏛️ IEEE Transactions on Audio, Speech, and Language Processing
📈 Citations: 0
Influential: 0
🤖 AI Summary
In incomplete knowledge graph (KG) multi-hop reasoning, existing reinforcement learning (RL)-based approaches suffer from inefficient exploration and poor generalization due to the large action space and policy–environment distribution mismatch. To address this, the authors propose a self-supervised reinforcement learning (SSRL) framework: first, a policy network is pre-trained on high-quality pseudo-labels generated by the agent itself, mitigating distribution shift; then RL fine-tunes the pre-trained model for optimal path search. SSRL combines self-supervision and RL in a single training paradigm and serves as a plug-in for mainstream RL-based KG reasoning (KGR) models, including MINERVA and MultiHopKG. Experiments on four standard benchmarks (FB15k-237, WN18RR, UMLS, and Kinship) show that SSRL matches or exceeds state-of-the-art results on Hits@1/3/10 and MRR, supporting the method's generality and robustness across datasets.

📝 Abstract
Reinforcement learning (RL) is an effective method for finding reasoning paths in incomplete knowledge graphs (KGs). To overcome the challenge of a large action space, a self-supervised pre-training method is proposed to warm up the policy network before the RL training stage. To alleviate the distributional mismatch issue in general self-supervised RL (SSRL), in our supervised learning (SL) stage the agent selects actions using the policy network and learns from the labels it generates; this self-generation of labels is the intuition behind the name "self-supervised." Under this training framework, the information density of the SL objective is increased, and the agent is prevented from getting stuck on early rewarded paths. Our SSRL method improves the performance of RL by pairing it with the wide path coverage achieved by SL during pre-training; the breadth of the SL objective makes it infeasible to train an agent with SL alone. We show that our SSRL model meets or exceeds current state-of-the-art results on all Hits@k and mean reciprocal rank (MRR) metrics on four large benchmark KG datasets. The SSRL method can be used as a plug-in for any RL architecture on a knowledge graph reasoning (KGR) task. We adopt two RL architectures, MINERVA and MultiHopKG, as baseline RL models and show experimentally that our SSRL model consistently outperforms both baselines on all four KG reasoning tasks.
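The two-stage recipe the abstract describes can be illustrated with a toy sketch: an agent walks a tiny graph, successful rollouts are kept as self-generated pseudo-labels for a supervised warm-up, and REINFORCE then fine-tunes the same policy. The graph, query, tabular policy, and all hyperparameters below are illustrative assumptions, not the paper's actual architecture or datasets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy KG as an adjacency list: entity -> [(relation, next_entity), ...].
# Entities and relations here are made up for illustration.
graph = {
    "alice": [("knows", "bob"), ("knows", "carol")],
    "bob":   [("works_at", "acme"), ("knows", "alice")],
    "carol": [("works_at", "acme")],
    "acme":  [("located_in", "nyc")],
    "nyc":   [],
}
query = ("alice", "acme")   # find a path from alice to acme
horizon = 2                 # maximum number of hops

# Tabular policy: one logit per (entity, outgoing-edge index).
logits = {e: np.zeros(len(a)) for e, a in graph.items() if a}

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def rollout(sample=True):
    """Walk the graph with the current policy; return (state, action) pairs and the final entity."""
    e, path = query[0], []
    for _ in range(horizon):
        acts = graph.get(e, [])
        if not acts:
            break
        p = softmax(logits[e])
        a = rng.choice(len(acts), p=p) if sample else int(p.argmax())
        path.append((e, a))
        e = acts[a][1]
    return path, e

def update(path, weight, lr=0.5):
    """Policy-gradient step: scale the log-prob gradient of each chosen action by `weight`."""
    for e, a in path:
        p = softmax(logits[e])
        grad = -p
        grad[a] += 1.0           # d log pi(a|e) / d logits
        logits[e] += lr * weight * grad

# Stage 1 (SL warm-up): only rollouts that reach the answer become pseudo-labels,
# so the policy learns from labels it generated itself.
for _ in range(200):
    path, end = rollout()
    if end == query[1]:
        update(path, weight=1.0)

# Stage 2 (RL fine-tuning): REINFORCE with a terminal 0/1 reward.
for _ in range(200):
    path, end = rollout()
    update(path, weight=1.0 if end == query[1] else 0.0)

path, end = rollout(sample=False)
print(end)  # the greedy policy should now reach the target entity
```

Filtering rollouts by success before the supervised update is what distinguishes this warm-up from ordinary behavior cloning: the labels come from the agent's own policy, so the SL data distribution stays close to what RL will see, which is the distribution-mismatch argument made in the abstract.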
Problem

Research questions and friction points this paper is trying to address.

Overcoming large action space in KG reasoning with self-supervised RL
Addressing distributional mismatch in self-supervised reinforcement learning
Improving KG reasoning performance via SL and RL integration
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised pre-training for policy network
Self-generated labels from the agent's own trajectories
Combines reinforcement and supervised learning