GroupToM-Bench: Benchmarking Group Theory of Mind and Nonlinear Social Emergence in MLLMs

📅 2026-06-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limited capacity of current multimodal large language models (MLLMs) to perform group theory of mind reasoning, that is, to model emergent collective behaviors arising from nonlinear interactions among individual mental states. To bridge this gap, the paper introduces GroupToM-Bench, the first multimodal evaluation benchmark specifically designed for group theory of mind. It features a seven-level cognitive audit framework spanning micro-level (individual belief, desire, and intention states, or BDI), meso-level (group tension and structural constraints), and macro-level (outcome prediction and mechanistic attribution) reasoning. The benchmark assesses models' social reasoning through multimodal causal inference tasks. Experimental results show that existing models significantly underperform human baselines in modeling social structures and understanding nonlinear collective dynamics, which supports both the usefulness and the difficulty of GroupToM-Bench.

📝 Abstract

True general intelligence requires not only a model of the physical world but also a social world model: the capacity to infer how individual mental states interact and crystallize into group-level outcomes. Despite notable progress in individual-level Theory of Mind (ToM) reasoning, existing multimodal large language models fail at this broader task. Collective behavior emerges non-linearly from social tensions, conformity dynamics, and structural constraints, meaning it cannot be recovered by merely summing individual intentions. We present GroupToM-Bench, the first multimodal benchmark for group-level ToM, built around a causal chain spanning micro-level BDI states (belief, desire, intention), meso-level group tension and structural constraints, and macro-level outcome prediction and mechanistic attribution. To probe this full arc, we develop a seven-level cognitive audit framework. Experiments reveal a gap between current models and human baselines, highlighting a failure to process social structures and non-linear collective dynamics.
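
The claim that collective behavior cannot be recovered by summing individual intentions can be illustrated with a small toy simulation. The sketch below is not taken from the paper: it uses a Granovetter-style threshold model of conformity as a stand-in, and the function name `cascade` and the integer thresholds are assumptions chosen purely for illustration. Each agent joins a collective action once at least `threshold` agents are already participating.

```python
def cascade(thresholds):
    """Return how many agents end up participating in a toy threshold model.

    Illustrative assumption (not from the paper): thresholds[i] is the number
    of participants agent i needs to see before joining.
    """
    active = 0
    while True:
        # Agents whose threshold is met by the current participation level.
        new_active = sum(1 for t in thresholds if t <= active)
        if new_active == active:
            # Fixed point: nobody else is willing to join.
            return active
        active = new_active


# Two groups whose individual dispositions are almost identical.
group_a = [0, 1, 2, 3, 4]
group_b = [0, 2, 2, 3, 4]  # a single agent is slightly more hesitant

print(cascade(group_a))  # full cascade: every agent joins
print(cascade(group_b))  # cascade stalls after the first agent
```

The two groups differ in one agent's threshold, and their average dispositions are nearly the same, yet the group-level outcomes diverge completely. This is the kind of nonlinear dependence on group structure that an additive reading of individual intentions would miss.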

Problem

Research questions and friction points this paper is trying to address.

Group Theory of Mind

Nonlinear Social Emergence

Multimodal Large Language Models

Collective Behavior

Social World Modeling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Group Theory of Mind

Nonlinear Social Emergence

Multimodal Benchmark

Cognitive Audit Framework

Collective Behavior

🔎 Similar Papers

Entering Real Social World! Benchmarking the Social Intelligence of Large Language Models from a First-person Perspective

2024-10-08 · Citations: 0

MuMA-ToM: Multi-modal Multi-Agent Theory of Mind

2024-08-22 · arXiv.org · Citations: 0