Can LLMs Estimate Cognitive Complexity of Reading Comprehension Items?

📅 2025-10-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates whether large language models (LLMs) can assess the cognitive complexity of reading comprehension items along two core dimensions: Evidence Scope (the breadth of evidence that must be consulted) and Transformation Level (the depth of information transformation required). Method: Departing from conventional NLP feature-extraction approaches, it uses LLMs to model implicit cognitive features of the answer-reasoning process, features that are traditionally difficult to extract explicitly. Through structured prompt engineering and qualitative analysis, the method enables fine-grained characterization of the latent inferential load embedded in items. Contribution/Results: Experiments demonstrate that LLMs align closely with human-annotated cognitive complexity labels and show significant promise for pre-assessing item difficulty. However, LLMs exhibit marked limitations in metacognitive awareness, i.e., reliably identifying and articulating their own reasoning steps. This work establishes a paradigm for intelligent educational assessment, highlighting both the unique utility of LLMs in cognitive modeling and their current theoretical and practical boundaries.

📝 Abstract
Estimating the cognitive complexity of reading comprehension (RC) items is crucial for assessing item difficulty before an item is administered to learners. Unlike syntactic and semantic features, such as passage length or semantic similarity between options, cognitive features that arise during answer reasoning are not readily extractable using existing NLP tools and have traditionally relied on human annotation. In this study, we examine whether large language models (LLMs) can estimate the cognitive complexity of RC items by focusing on two dimensions, Evidence Scope and Transformation Level, that indicate the degree of cognitive burden involved in reasoning about the answer. Our experimental results demonstrate that LLMs can approximate the cognitive complexity of items, indicating their potential as tools for prior difficulty analysis. Further analysis reveals a gap between LLMs' reasoning ability and their metacognitive awareness: even when they produce correct answers, they sometimes fail to correctly identify the features underlying their own reasoning process.
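The abstract describes prompting an LLM to rate each RC item on Evidence Scope and Transformation Level. The snippet below is a minimal sketch of what such a setup might look like: a prompt builder plus a parser that validates the model's JSON reply. The label sets, the JSON schema, and the function names are illustrative assumptions, not the authors' actual rubric or implementation.

```python
import json

# Assumed label sets for the two dimensions (illustrative, not the paper's rubric).
EVIDENCE_SCOPE = ["single_sentence", "multi_sentence", "whole_passage"]
TRANSFORMATION_LEVEL = ["verbatim", "paraphrase", "inference"]

def build_complexity_prompt(passage: str, question: str, options: list[str]) -> str:
    """Build an instruction asking an LLM to rate an RC item's cognitive complexity."""
    option_text = "\n".join(f"{chr(65 + i)}. {o}" for i, o in enumerate(options))
    return (
        "Rate the cognitive complexity of this reading comprehension item.\n"
        f"Evidence Scope (one of {EVIDENCE_SCOPE}): how much of the passage "
        "must be consulted to answer.\n"
        f"Transformation Level (one of {TRANSFORMATION_LEVEL}): how much the "
        "evidence must be transformed to reach the answer.\n"
        'Reply with JSON: {"evidence_scope": ..., "transformation_level": ...}\n\n'
        f"Passage:\n{passage}\n\nQuestion: {question}\nOptions:\n{option_text}"
    )

def parse_complexity(reply: str) -> dict:
    """Validate the model's JSON reply against the assumed label sets."""
    rating = json.loads(reply)
    if rating["evidence_scope"] not in EVIDENCE_SCOPE:
        raise ValueError(f"unexpected evidence_scope: {rating['evidence_scope']}")
    if rating["transformation_level"] not in TRANSFORMATION_LEVEL:
        raise ValueError(f"unexpected transformation_level: {rating['transformation_level']}")
    return rating
```

In use, `build_complexity_prompt(...)` would be sent to an LLM and the reply passed through `parse_complexity(...)`; the parsed labels could then be compared against human annotations, as the paper's evaluation does.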
Problem

Research questions and friction points this paper is trying to address.

Estimating cognitive complexity of reading comprehension items
Assessing cognitive burden in answer reasoning process
Evaluating LLMs' metacognitive awareness of reasoning features
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLMs estimate cognitive complexity of reading items
Focus on Evidence Scope and Transformation Level dimensions
Potential tool for prior difficulty analysis despite limitations
🔎 Similar Papers
No similar papers found.
Seonjeong Hwang
Graduate School of Artificial Intelligence, POSTECH, Republic of Korea
Hyounghun Kim
POSTECH
NLP · Multimodal Learning
Gary Geunbae Lee
Graduate School of Artificial Intelligence, POSTECH, Republic of Korea; Department of Computer Science and Engineering, POSTECH, Republic of Korea