K-ON: Stacking Knowledge On the Head Layer of Large Language Model

📅 2025-02-10
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) and knowledge graphs (KGs) suffer from a fundamental granularity mismatch—KGs operate at the entity level, whereas LLMs generate text token-by-token (often at the subword level). Method: We propose the first multi-step *k*-token head prediction architecture, introducing a *k*-step token prediction mechanism atop the LLM to enable end-to-end entity-level generation. Crucially, we integrate KG-aware contrastive loss directly into the LLM’s autoregressive decoding process, bridging token-level generation and entity-level reasoning. Our approach combines entity-level contrastive learning, KG embedding alignment, and frozen fine-tuning. Results: On multi-task KG reasoning benchmarks, our method significantly outperforms state-of-the-art methods—including those fusing textual and multimodal inputs—achieving substantial gains in entity recognition and linking accuracy while reducing inference latency by 37%.

Technology Category

Application Category

📝 Abstract
Recent advancements in large language models (LLMs) have significantly improved various natural language processing (NLP) tasks. Typically, LLMs are trained to predict the next token, aligning well with many NLP tasks. However, in knowledge graph (KG) scenarios, entities are the fundamental units and identifying an entity requires at least several tokens. This leads to a granularity mismatch between KGs and natural languages. To address this issue, we propose K-ON, which integrates KG knowledge into the LLM by employing multiple head layers for next k-step prediction. K-ON can not only generate entity-level results in one step, but also enables contrastive loss against entities, which is the most powerful tool in KG representation learning. Experimental results show that K-ON outperforms state-of-the-art methods that incorporate text and even the other modalities.
Problem

Research questions and friction points this paper is trying to address.

Address granularity mismatch in knowledge graphs
Integrate KG knowledge into LLMs
Enable entity-level prediction and contrastive loss
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates KG into LLM
Utilizes multiple head layers
Enables entity-level predictions
🔎 Similar Papers
No similar papers found.
Lingbing Guo
Lingbing Guo
Tianjin University
Machine learningArtificial Intelligence
Y
Yichi Zhang
College of Computer Science and Technology, Zhejiang University; ZJU-Ant Group Joint Lab of Knowledge Graph
Z
Zhongpu Bo
Ant Group
Z
Zhuo Chen
College of Computer Science and Technology, Zhejiang University; ZJU-Ant Group Joint Lab of Knowledge Graph
Mengshu Sun
Mengshu Sun
Beijing University of Technology
Deep LearningModel Compression and Acceleration
Z
Zhiqiang Zhang
Ant Group
W
Wen Zhang
School of Software Technology, Zhejiang University; ZJU-Ant Group Joint Lab of Knowledge Graph
H
Huajun Chen
College of Computer Science and Technology, Zhejiang University; ZJU-Ant Group Joint Lab of Knowledge Graph; Zhejiang Key Laboratory of Big Data Intelligent Computing