🤖 AI Summary
This work investigates the fundamental nature of large language models' (LLMs) generalization: whether it stems from superficial memorization of training data or from genuine semantic understanding. Specifically, it asks whether LLMs possess "scenario cognition," i.e., the capacity to accurately bind semantic scenario elements (e.g., agent, patient) to their arguments in context. To this end, the authors propose a dual-perspective evaluation framework that combines external behavioral analysis (scenario-based question answering) with internal representation probing. They further introduce a novel, manually curated dataset of fictional facts annotated with scenario elements. Experiments reveal that state-of-the-art LLMs rely heavily on shallow statistical cues and remain fragile on semantic generalization even in simple cases, exposing foundational limitations in their semantic comprehension. The work establishes a reproducible paradigm and benchmark for assessing the cognitive mechanisms underlying LLM behavior.
📝 Abstract
Driven by vast and diverse textual data, large language models (LLMs) have demonstrated impressive performance across numerous natural language processing (NLP) tasks. Yet a critical question persists: does their generalization arise from mere memorization of training data or from deep semantic understanding? To investigate this, we propose a bi-perspective evaluation framework to assess LLMs' scenario cognition, the ability to link semantic scenario elements with their arguments in context. Specifically, we introduce a novel scenario-based dataset comprising diverse textual descriptions of fictional facts, annotated with scenario elements. LLMs are evaluated both by their capacity to answer scenario-related questions (the model-output perspective) and by probing their internal representations for encoded associations between scenario elements and their arguments (the internal-representation perspective). Our experiments reveal that current LLMs predominantly rely on superficial memorization, failing to achieve robust semantic scenario cognition, even in simple cases. These findings expose critical limitations in LLMs' semantic understanding and offer cognitive insights for advancing their capabilities.
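To make the internal-representation perspective concrete, probing typically means training a small linear classifier on a model's hidden states to decode a property of interest, here whether a mention fills the agent or patient role. The sketch below is a minimal illustration with synthetic "activations" (the dimensions, signal structure, and labels are all invented for demonstration), not the paper's actual probing setup or data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for LLM hidden states: each vector represents a mention,
# labeled 1 if it fills the "agent" role and 0 if it fills the "patient" role.
# Dimensions and the planted "role direction" are hypothetical choices.
d_model = 64
n_train, n_test = 512, 128
role_direction = rng.normal(size=d_model)

def make_split(n):
    """Generate labeled vectors: Gaussian noise plus a signed role signal."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(size=(n, d_model)) + np.outer(2 * y - 1, role_direction)
    return x, y

x_tr, y_tr = make_split(n_train)
x_te, y_te = make_split(n_test)

# Train a linear probe: logistic regression fit by plain gradient descent.
w = np.zeros(d_model)
b = 0.0
lr = 0.1
for _ in range(300):
    logits = np.clip(x_tr @ w + b, -30.0, 30.0)  # clip for numerical safety
    p = 1.0 / (1.0 + np.exp(-logits))            # sigmoid predictions
    w -= lr * (x_tr.T @ (p - y_tr)) / n_train
    b -= lr * np.mean(p - y_tr)

# If role information is linearly decodable from the states, the probe's
# held-out accuracy will be well above the 50% chance level.
acc = np.mean(((x_te @ w + b) > 0).astype(int) == y_te)
print(f"probe accuracy: {acc:.2f}")
```

In an actual probing study, `x_tr` would be hidden states extracted from the LLM at the token positions of candidate arguments, and near-chance probe accuracy would suggest the role binding is not encoded in the representations.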