🤖 AI Summary
Problem: Existing LLM-KG evaluation benchmarks overemphasize answer accuracy while neglecting systematic characterization of task-level cognitive complexity, which leads to fragmented capability assessment, undetected blind spots, and insufficient task diversity. Method: This work introduces three established cognitive complexity frameworks from cognitive psychology into LLM-KG benchmark analysis for the first time, enabling multidimensional complexity modeling of Knowledge Graph (KG) tasks in LLM-KG-Bench. Contribution/Results: We uncover severe imbalances in the distribution of cognitive demands across current evaluations, in particular the underrepresentation of higher-order reasoning and multi-step planning tasks. Our findings provide empirical grounding and a principled basis for designing more interpretable, balanced, and challenging KG evaluation tasks. This advances KG evaluation from a purely outcome-oriented paradigm toward a dual-dimensional framework that jointly assesses process-level reasoning and underlying cognitive capabilities.
📝 Abstract
Large Language Models (LLMs) are increasingly used for tasks involving Knowledge Graphs (KGs), and their evaluation typically focuses on accuracy and output correctness. We propose a complementary task characterization approach using three complexity frameworks from cognitive psychology. Applying this to the LLM-KG-Bench framework, we characterize the distribution of complexity values across its tasks, identify underrepresented cognitive demands, and motivate richer interpretation and greater diversity in benchmark evaluation tasks.
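To make the task-characterization idea concrete, the minimal sketch below annotates a few hypothetical tasks along three illustrative complexity dimensions (a Bloom-style knowledge level, a reasoning-step count, and an element-interactivity rating) and computes per-dimension value distributions. The dimension names, task names, and ratings are assumptions for illustration only, not the frameworks, annotation scheme, or task set used in the paper.

```python
# Illustrative sketch only: hypothetical tasks, dimensions, and rating scales,
# not the actual LLM-KG-Bench task set or the paper's annotation scheme.
from collections import Counter
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """A benchmark task annotated along several cognitive-complexity dimensions."""
    name: str                   # hypothetical task identifier
    knowledge_level: str        # e.g. a Bloom-style level from "remember" to "create"
    reasoning_steps: int        # rough count of required inference/planning steps
    element_interactivity: str  # e.g. "low" / "medium" / "high"


# Hypothetical annotations for a handful of tasks (placeholder values).
tasks = [
    TaskProfile("turtle_syntax_fix", "apply", 1, "low"),
    TaskProfile("sparql_fact_lookup", "understand", 1, "low"),
    TaskProfile("sparql_aggregation", "apply", 2, "medium"),
    TaskProfile("ontology_extension", "create", 4, "high"),
]


def distribution(values):
    """Return the relative frequency of each observed value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


if __name__ == "__main__":
    # Per-dimension distributions show which cognitive demands dominate a
    # benchmark and which (e.g. higher-order, multi-step tasks) are underrepresented.
    print("knowledge level:", distribution(t.knowledge_level for t in tasks))
    print("reasoning steps:", distribution(t.reasoning_steps for t in tasks))
    print("interactivity:  ", distribution(t.element_interactivity for t in tasks))
```

Under this kind of annotation, a skew such as most tasks falling into low reasoning-step counts would surface directly in the printed distributions, which is the sort of imbalance the analysis aims to expose.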