🤖 AI Summary
This study investigates cognitive alignment between humans and large language models (LLMs) in reading comprehension. Method: We propose a semantics-driven graph-structured text representation, extending conventional token-level representations to a global topological structure comprising nodes (semantic units) and edges (semantic relations), integrated with eye-tracking data as a biological proxy for human cognition. Leveraging graph representation learning, LLM-based semantic node extraction, and AI agent-assisted graph construction, we establish the first cross-modal comparison of language understanding at the graph-structural level. Contribution/Results: Experimental results reveal a statistically significant correlation (p < 0.01) between the topological properties of LLM-generated semantic graphs and human eye-fixation distributions, indicating strong alignment in deep semantic organization. This work introduces a novel paradigm for interpretable NLP, human-AI collaborative learning, and cognitive-inspired model design.
📝 Abstract
Reading comprehension is a fundamental skill in human cognitive development. With the advancement of Large Language Models (LLMs), there is a growing need to compare how humans and LLMs understand language across different contexts and apply this understanding to functional tasks such as inference, emotion interpretation, and information retrieval. Our previous work used LLMs and human biomarkers to study the reading comprehension process. The results showed that biomarkers corresponding to words labeled by the LLMs as having high versus low relevance to the inference target exhibited distinct patterns, particularly when validated with eye-tracking data. However, that word-level focus limited the depth of analysis, leaving the conclusions somewhat simplistic despite their potential significance. In this study, an LLM-based AI agent grouped words from a reading passage into nodes and edges, forming a graph-based text representation guided by semantic meaning and question-oriented prompts. We then compared the distribution of eye fixations over the important nodes and edges. Our findings indicate that LLMs exhibit high consistency with human language understanding at the level of graph topological structure. These results build on our previous findings and offer insights into effective human-AI co-learning strategies.
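The comparison described above can be sketched in a few lines. The following is a minimal, illustrative example only (not the paper's actual pipeline): it assumes a toy semantic graph with hypothetical node labels and made-up fixation counts, and uses degree centrality as one simple stand-in for topological importance, correlated against fixations with a rank test.

```python
# Illustrative sketch: correlate the topological importance of
# semantic-graph nodes with aggregated eye-fixation counts.
import networkx as nx
from scipy.stats import spearmanr

# Hypothetical semantic graph: nodes are LLM-extracted semantic units,
# edges are semantic relations between them (labels are invented).
G = nx.Graph()
G.add_edges_from([
    ("protagonist", "goal"),
    ("protagonist", "obstacle"),
    ("goal", "outcome"),
    ("obstacle", "outcome"),
    ("outcome", "emotion"),
])

# Hypothetical per-node fixation counts, aggregated over the words
# each semantic unit covers in the passage (values are invented).
fixations = {
    "protagonist": 42,
    "goal": 31,
    "obstacle": 28,
    "outcome": 35,
    "emotion": 12,
}

# Degree centrality as one simple measure of topological importance;
# the study itself may use richer graph-structural properties.
centrality = nx.degree_centrality(G)

nodes = sorted(G.nodes())
rho, p = spearmanr(
    [centrality[n] for n in nodes],
    [fixations[n] for n in nodes],
)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```

A positive, significant rank correlation under this scheme would mirror the paper's reported alignment between graph topology and human fixation distributions; the real analysis operates on LLM-generated graphs and recorded eye-tracking data rather than toy values.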