Unlearning of Knowledge Graph Embedding via Preference Optimization

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Efficiently forgetting outdated or erroneous knowledge in Knowledge Graph Embedding (KGE) models faces two key challenges: exact unlearning incurs prohibitive computational overhead, while approximate unlearning suffers from incomplete forgetting, since graph structural connectivity leaves target facts residually inferable and unintentionally weakens retained knowledge within the forgetting boundary. To address this, we propose GraphDPO, a preference-optimization-based approximate unlearning framework that formalizes forgetting as a triplet preference learning problem. GraphDPO introduces three novel components: (i) a boundary recall mechanism to preserve intra-boundary semantics, (ii) cross-timestep knowledge distillation to stabilize temporal consistency, and (iii) an out-of-boundary sampling strategy enforcing low semantic overlap to minimize interference. Crucially, it weakens associations between target and retained facts by reconstructing substitute triplets. Evaluated on eight forgetting benchmarks derived from four major knowledge graphs, GraphDPO outperforms state-of-the-art methods by up to 10.1% in MRR_Avg and 14.0% in MRR_F1.

📝 Abstract
Existing knowledge graphs (KGs) inevitably contain outdated or erroneous knowledge that needs to be removed from knowledge graph embedding (KGE) models. To address this challenge, knowledge unlearning can be applied to eliminate specific information while preserving the integrity of the remaining knowledge in KGs. Existing unlearning methods can generally be categorized into exact unlearning and approximate unlearning. However, exact unlearning incurs high training costs, while approximate unlearning faces two issues when applied to KGs due to the inherent connectivity of triples: (1) It fails to fully remove targeted information, as forgetting triples can still be inferred from remaining ones. (2) By focusing only on the local data targeted for removal, it weakens the remaining knowledge within the forgetting boundary. To address these issues, we propose GraphDPO, a novel approximate unlearning framework based on direct preference optimization (DPO). Firstly, to effectively remove forgetting triples, we reframe unlearning as a preference optimization problem, where the model is trained by DPO to prefer reconstructed alternatives over the original forgetting triples. This formulation penalizes reliance on forgettable knowledge, mitigating incomplete forgetting caused by KG connectivity. Moreover, we introduce an out-boundary sampling strategy to construct preference pairs with minimal semantic overlap, weakening the connection between forgetting and retained knowledge. Secondly, to preserve boundary knowledge, we introduce a boundary recall mechanism that replays and distills relevant information both within and across time steps. We construct eight unlearning datasets across four popular KGs with varying unlearning rates. Experiments show that GraphDPO outperforms state-of-the-art baselines by up to 10.1% in MRR_Avg and 14.0% in MRR_F1.
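To make the DPO reformulation concrete, the sketch below shows what a DPO-style unlearning objective over triples could look like: the model is pushed to score each reconstructed substitute triple (chosen) above its corresponding forgetting triple (rejected), relative to a frozen reference model. The function name, the scorer callables, and the use of `beta` are illustrative assumptions; the paper's actual objective may differ.

```python
import math

def dpo_unlearning_loss(score_model, score_ref, forget_triples, substitute_triples, beta=0.1):
    # Hypothetical sketch (names and signature are assumptions, not the paper's API).
    # score_model / score_ref: callables mapping a (head, relation, tail) triple
    # to a scalar plausibility score; score_ref is the frozen pre-unlearning model.
    losses = []
    for chosen_t, rejected_t in zip(substitute_triples, forget_triples):
        # Log-ratio analogue: current score minus frozen-reference score.
        chosen = score_model(*chosen_t) - score_ref(*chosen_t)
        rejected = score_model(*rejected_t) - score_ref(*rejected_t)
        margin = beta * (chosen - rejected)
        # -log sigmoid(margin) == log(1 + exp(-margin)); small when the
        # substitute triple is preferred by a wide margin.
        losses.append(math.log1p(math.exp(-margin)))
    return sum(losses) / len(losses)
```

Minimizing this loss widens the score gap in favor of substitute triples, which is one way to penalize reliance on the forgettable knowledge while anchoring the rest of the model to the reference.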
Problem

Research questions and friction points this paper is trying to address.

Remove outdated or erroneous knowledge from KG embeddings
Address incomplete forgetting due to KG connectivity
Preserve boundary knowledge during unlearning process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses Direct Preference Optimization for unlearning
Introduces out-boundary sampling strategy
Implements boundary recall mechanism
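One plausible way to realize the out-boundary sampling idea above is to rank candidate substitute entities by embedding similarity to the entity being forgotten and keep the least similar ones, so that preference pairs have minimal semantic overlap. The helper below is a hypothetical sketch under that assumption, not the paper's actual procedure.

```python
import math

def out_boundary_sample(forget_entity, candidates, embed, k=1):
    # Hypothetical helper (names and signature are assumptions): rank candidate
    # substitute entities by cosine similarity to the forgotten entity's
    # embedding and return the k LEAST similar, i.e. minimal semantic overlap.
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v)

    target_vec = embed[forget_entity]
    ranked = sorted(candidates, key=lambda e: cosine(embed[e], target_vec))
    return ranked[:k]
```

For example, with 2-D embeddings where candidate "c" points opposite the forgotten entity "x", `out_boundary_sample("x", ["a", "b", "c"], embed, k=1)` would pick "c".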
👥 Authors
Jiajun Liu (School of Computer Science and Engineering, Southeast University)
Wenjun Ke (Southeast University): Natural Language Processing
Peng Wang (School of Computer Science and Engineering, Southeast University; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education)
Yao He (Stanford University): Robotics, SLAM, Computer Vision
Ziyu Shang (Southeast University): LLMs, Reasoning, Knowledge Graph
Guozheng Li (School of Computer Science and Engineering, Southeast University)
Zijie Xu (Southeast University): NLP
Ke Ji (PhD student, The Chinese University of Hong Kong, Shenzhen): Large Language Models, Agent, Mathematical Reasoning