🤖 AI Summary
Existing art-related visual datasets suffer from central bias, insufficient fine-grained categories, limited capacity to represent abstract concepts and peripheral objects—hindering artwork detection and visual cultural heritage analysis. To address these limitations, we introduce ArtScent, the first object detection dataset explicitly designed for olfactory perception in artworks. It comprises 4,712 images with 38,116 instance-level annotations across 139 fine-grained, odor-associated categories. We propose a novel cross-modal (vision–olfaction) semantic alignment paradigm for fine-grained annotation, uncovering spatial distribution patterns of olfactory cues in artistic composition. For highly occluded, dense scenes, we integrate human-curated precise segmentation with statistical modeling to achieve robust instance segmentation. Comprehensive benchmarking demonstrates that ArtScent significantly increases detection difficulty in complex artistic contexts. It establishes a new standard and presents fresh challenges for interdisciplinary research at the intersection of visual culture and multimodal cognition.
📝 Abstract
Real-world applications of computer vision in the humanities require algorithms to be robust against artistic abstraction, peripheral objects, and subtle differences between fine-grained target classes. Existing datasets provide instance-level annotations on artworks but are generally biased towards the image centre and limited with regard to detailed object classes. The proposed ODOR dataset fills this gap, offering 38,116 object-level annotations across 4712 images, spanning an extensive set of 139 fine-grained categories. Conducting a statistical analysis, we showcase challenging dataset properties, such as a detailed set of categories, dense and overlapping objects, and spatial distribution over the whole image canvas. Furthermore, we provide an extensive baseline analysis for object detection models and highlight the challenging properties of the dataset through a set of secondary studies. Inspiring further research on artwork object detection and broader visual cultural heritage studies, the dataset challenges researchers to explore the intersection of object recognition and smell perception.