A Novel Image Similarity Metric for Scene Composition Structure

📅 2025-08-07
🤖 AI Summary
Existing image similarity metrics inadequately assess how well generative AI outputs preserve Scene Composition Structure (SCS): pixel-wise methods are noise-sensitive, perceptual metrics prioritize aesthetics over geometric consistency, and deep-learning-based approaches incur training overhead and may generalize poorly. To address this, the authors propose SCSSIM, a training-free, analytical metric for quantifying SCS preservation. SCSSIM derives statistical measures from a cuboidal hierarchical partitioning of each image and uses them to capture non-object-based structural relationships. Experiments show that SCSSIM is highly invariant to non-compositional distortions, decreases strongly and monotonically under compositional distortions, and compares favorably with existing metrics for structural evaluation. SCSSIM is therefore presented as an analytical, training-free tool for the structural assessment of generative models.

📝 Abstract
The rapid advancement of generative AI models necessitates novel methods for evaluating image quality that extend beyond human perception. A critical concern for these models is the preservation of an image's underlying Scene Composition Structure (SCS), which defines the geometric relationships among objects and the background, including their relative positions, sizes, and orientations. Maintaining SCS integrity is paramount for ensuring faithful and structurally accurate GenAI outputs. Traditional image similarity metrics often fall short in assessing SCS: pixel-level approaches are overly sensitive to minor visual noise, perception-based metrics prioritize human aesthetic appeal, and neither adequately captures structural fidelity. Furthermore, recent neural-network-based metrics introduce training overheads and potential generalization issues. We introduce the SCS Similarity Index Measure (SCSSIM), a novel, analytical, and training-free metric that quantifies SCS preservation by exploiting statistical measures derived from the cuboidal hierarchical partitioning of images, robustly capturing non-object-based structural relationships. Our experiments demonstrate that SCSSIM is highly invariant to non-compositional distortions, accurately reflecting unchanged SCS. Conversely, it decreases strongly and monotonically under compositional distortions, indicating when SCS has been altered. Compared to existing metrics, SCSSIM exhibits superior properties for structural evaluation, making it a useful tool for developing and evaluating generative models and for checking the integrity of scene composition.
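The abstract does not spell out how the partitioning or the statistics are computed, so the following is only a toy sketch of the general idea, not the authors' SCSSIM. It assumes that a hierarchical partition can be approximated by greedily cutting the rectangle with the largest squared error along the axis-aligned line that minimizes the remaining error, and it assumes a stand-in score: the Pearson correlation between per-block means of the two images, evaluated on the reference image's partition. The names `partition`, `best_cut`, `block_sse`, and `toy_structure_score` are hypothetical.

```python
import numpy as np


def block_sse(img, block):
    """Sum of squared deviations from the mean inside one rectangle."""
    r0, r1, c0, c1 = block
    patch = img[r0:r1, c0:c1]
    return float(patch.var() * patch.size)


def best_cut(img, block):
    """Try every axis-aligned cut of `block`; return the pair of children
    with the smallest total squared error, or None for a 1x1 block."""
    r0, r1, c0, c1 = block
    candidates = [((r0, r, c0, c1), (r, r1, c0, c1)) for r in range(r0 + 1, r1)]
    candidates += [((r0, r1, c0, c), (r0, r1, c, c1)) for c in range(c0 + 1, c1)]
    if not candidates:
        return None
    return min(candidates, key=lambda ab: block_sse(img, ab[0]) + block_sse(img, ab[1]))


def partition(img, max_blocks=8):
    """Greedy hierarchical partition of a 2-D float image into rectangles.
    Assumption: this stands in for the paper's cuboidal partitioning."""
    blocks = [(0, img.shape[0], 0, img.shape[1])]
    while len(blocks) < max_blocks:
        errors = [block_sse(img, b) for b in blocks]
        k = int(np.argmax(errors))
        if errors[k] <= 1e-12:  # every block is already uniform
            break
        children = best_cut(img, blocks[k])
        if children is None:
            break
        blocks[k:k + 1] = children  # replace the parent with its two children
    return blocks


def toy_structure_score(ref, test, max_blocks=8):
    """Hypothetical stand-in score (NOT SCSSIM): correlation of block means,
    with both images measured on the reference image's partition."""
    blocks = partition(ref, max_blocks)
    a = np.array([ref[r0:r1, c0:c1].mean() for r0, r1, c0, c1 in blocks])
    b = np.array([test[r0:r1, c0:c1].mean() for r0, r1, c0, c1 in blocks])
    a, b = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


# Synthetic scene: one bright square on a dark background.
ref = np.zeros((32, 32))
ref[8:16, 8:16] = 1.0
brighter = 0.5 * ref + 0.2            # non-compositional change (contrast/brightness)
shifted = np.roll(ref, 8, axis=1)     # compositional change (object moved)
print(toy_structure_score(ref, brighter), toy_structure_score(ref, shifted))
```

On this synthetic scene the brightness-adjusted copy scores essentially 1, while the copy with the moved square scores much lower, which mirrors the invariance and sensitivity properties the abstract claims for SCSSIM. The correlation-of-means score is chosen only because it ignores affine intensity changes; it says nothing about how SCSSIM itself behaves on real images.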
Problem

Research questions and friction points this paper is trying to address.

Evaluating image quality beyond human perception metrics
Preserving Scene Composition Structure in generative AI outputs
Overcoming limitations of traditional and neural-network-based similarity metrics
Innovation

Methods, ideas, or system contributions that make the work stand out.

Novel analytical SCS Similarity Index Measure
Training-free metric using Cuboidal partitioning
Robustly captures non-object structural relationships
Md Redwanul Haque
School of Information Technology, Deakin University, Burwood, Victoria, Australia
Manzur Murshed
School of Information Technology, Deakin University, Burwood, Victoria, Australia
Manoranjan Paul
School of Computing, Mathematics and Engineering, Charles Sturt University, Bathurst, NSW, Australia
Tsz-Kwan Lee
Deakin University
computer vision, 2D/3D and multiview video coding, image/video processing, machine learning, video quality assessment