Beyond Absolute Scores: Relative Edit-induced Difference for Generalizable Image Aesthetic Assessment

📅 2026-06-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing image aesthetic assessment methods: by regressing absolute mean opinion scores, they fail to capture the relative, comparison-driven nature of human aesthetic judgment, which constrains generalization. The authors propose RED-Aes, a framework built on an edit-induced relative aesthetic difference learning paradigm. It uses controllable image editing models to generate semantically consistent image pairs, constructs the RED-20k dataset annotated with aesthetic differences and chain-of-thought rationales, and applies a three-stage training strategy that requires only relative supervision. The method achieves state-of-the-art performance on multiple public benchmarks and improves the model’s generalization across evaluation scenarios.

📝 Abstract

Traditional Image Aesthetic Assessment (IAA) methods mainly rely on regressing absolute Mean Opinion Scores (MOS). However, such a paradigm overlooks the inherently dynamic nature of human aesthetic perception, which relies on subconscious comparison against implicit visual references. Consequently, the lack of causal reasoning regarding aesthetic differences prevents models from learning generalizable aesthetic principles, thus limiting their generalization across diverse scenarios. In this work, we rethink the IAA task and propose Relative Edit-induced Difference Aesthetic learning (RED-Aes), a novel framework that leverages controllable image editing models to simulate the human aesthetic reasoning process. Instead of fitting absolute score distributions, RED-Aes explicitly learns the visual factors that drive aesthetic changes. To support this paradigm, we construct the RED-20k dataset, which comprises editing-based image pairs, quantitative aesthetic differences, and Chain-of-Thought (CoT) reasoning. Furthermore, we introduce a three-stage training strategy guided by a relative ranking consistency reward, optimizing the model solely via relative supervision. Extensive experiments demonstrate that RED-Aes achieves state-of-the-art performance on multiple public benchmarks, exhibiting superior generalization capabilities.
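
The abstract does not define the relative ranking consistency reward, so the following is only a minimal sketch of what such a signal might look like, not the authors' formulation. It assumes a binary reward that checks whether the predicted score ordering of an editing-based pair agrees with the sign of the annotated aesthetic difference. The `EditPair` record, its field names, the "edited minus source" sign convention, and the `tie_margin` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EditPair:
    """Hypothetical record for one editing-based image pair (field names assumed)."""
    source_path: str
    edited_path: str
    aesthetic_delta: float  # annotated difference, assumed to be edited minus source
    rationale: str          # chain-of-thought text explaining the difference


def ranking_consistency_reward(
    pred_source: float,
    pred_edited: float,
    target_delta: float,
    tie_margin: float = 0.0,
) -> float:
    """Return 1.0 if the predicted ordering matches the annotated ordering, else 0.0.

    Only the relative order of the two predictions is used; their absolute
    values never enter the reward (assumed design, not taken from the paper).
    """
    pred_delta = pred_edited - pred_source
    if abs(target_delta) <= tie_margin:
        # Annotated as a tie: reward only predictions that also fall within the margin.
        return 1.0 if abs(pred_delta) <= tie_margin else 0.0
    # Otherwise the signs of the predicted and annotated differences must agree.
    return 1.0 if pred_delta * target_delta > 0 else 0.0


if __name__ == "__main__":
    pair = EditPair("a.jpg", "a_edited.jpg", aesthetic_delta=1.5, rationale="warmer tones")
    print(ranking_consistency_reward(0.4, 0.7, pair.aesthetic_delta))
    print(ranking_consistency_reward(0.7, 0.4, pair.aesthetic_delta))
```

Because the sketch depends only on the sign of the predicted difference, a model can score well without matching any absolute score scale, which is in the spirit of optimizing solely via relative supervision.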

Problem

Research questions and friction points this paper is trying to address.

Image Aesthetic Assessment

generalization

relative comparison

aesthetic perception

absolute scores

Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative Aesthetic Assessment

Image Editing

Generalizable Aesthetics