🤖 AI Summary
This study investigates whether vision-language models (VLMs) genuinely comprehend physical conservation laws, a cornerstone of human cognitive development. To this end, we introduce ConserveBench, a novel benchmark comprising 365 controlled cognitive experiments spanning four conservation dimensions: volume, substance, length, and number. Crucially, ConserveBench is the first to distinguish *transformation tasks*, which require reversible reasoning about dynamic object manipulations, from *non-transformation tasks*, which demand only static quantity judgments. We employ multimodal prompt engineering and zero-shot evaluation, using synthetically generated image-text pairs annotated with cognitive-logical ground truth. Results reveal a striking dissociation between reversibility-based and quantity-based understanding: VLMs achieve 78.3% accuracy on transformation tasks but only 41.9% on non-transformation tasks. This indicates that VLMs capture superficial behavioral patterns of conservation without acquiring deep, compositional semantic representations of quantity, which challenges the classical unidimensional developmental account of conservation competence in cognitive psychology.
📝 Abstract
Conservation is a critical milestone of cognitive development, thought to rest on both an understanding of quantitative concepts and the reversibility of operations. To assess whether this component of human intelligence has emerged in Vision Language Models, we curated ConserveBench, a battery of 365 cognitive experiments across four dimensions of physical quantity: volume, solid quantity, length, and number. The former two involve transformational tasks, which require an understanding of reversibility. The latter two involve non-transformational tasks, which assess the understanding of quantity. Surprisingly, we find that while Vision Language Models are generally good at transformational tasks, they tend to fail at non-transformational tasks. This points to a dissociation between understanding the reversibility of operations and understanding quantity, both of which are believed to be cornerstones of the human understanding of the law of conservation. Website: https://growing-ai-like-a-child.github.io/pages/Conservation/
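
The split described in the abstract (volume and solid quantity as transformational, length and number as non-transformational) can be sketched as a per-task-type accuracy calculation. This is only an illustrative sketch: the record format, the function name `accuracy_by_task_type`, and the toy records are assumptions made for this example and are not taken from ConserveBench or its results.

```python
from collections import defaultdict

# Dimension-to-task-type mapping as stated in the abstract.
TASK_TYPE = {
    "volume": "transformational",
    "solid quantity": "transformational",
    "length": "non-transformational",
    "number": "non-transformational",
}


def accuracy_by_task_type(records):
    """Return accuracy per task type.

    `records` is an iterable of (dimension, is_correct) pairs.
    This record format is hypothetical, chosen for illustration.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for dimension, is_correct in records:
        task_type = TASK_TYPE[dimension]
        total[task_type] += 1
        correct[task_type] += int(is_correct)
    return {task_type: correct[task_type] / total[task_type] for task_type in total}


# Made-up toy records for demonstration only; NOT benchmark data.
toy_records = [
    ("volume", True),
    ("solid quantity", True),
    ("volume", False),
    ("solid quantity", True),
    ("length", False),
    ("number", True),
    ("length", False),
    ("number", False),
]

scores = accuracy_by_task_type(toy_records)
for task_type, accuracy in scores.items():
    print(f"{task_type}: {accuracy:.2f}")
```

A gap between the two accuracies in real evaluation output would correspond to the dissociation the abstract reports; the toy values here carry no evidential weight.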