Beyond Absolute Scores: Relative Edit-induced Difference for Generalizable Image Aesthetic Assessment

📅 2026-06-04
🤖 AI Summary
This work addresses the limitations of existing image aesthetic assessment methods, which rely on absolute mean opinion scores and thus fail to capture the relative and dynamically comparative nature of human aesthetic judgments, leading to constrained generalization. To overcome this, the authors propose RED-Aes, a novel framework that introduces an edit-induced relative aesthetic difference learning paradigm. It leverages controllable image editing models to generate semantically consistent image pairs, constructs the RED-20k dataset annotated with aesthetic differentials and chain-of-thought rationales, and employs a three-stage training strategy requiring only relative supervision. The method achieves state-of-the-art performance across multiple public benchmarks and significantly enhances the model’s generalization capability for cross-scenario aesthetic evaluation.
📝 Abstract
Traditional Image Aesthetic Assessment (IAA) methods mainly rely on regressing absolute Mean Opinion Scores (MOS). However, such a paradigm overlooks the inherently dynamic nature of human aesthetic perception, which relies on subconscious comparison against implicit visual references. Consequently, the lack of causal reasoning regarding aesthetic differences prevents models from learning generalizable aesthetic principles, thus limiting their generalization across diverse scenarios. In this work, we rethink the IAA task and propose Relative Edit-induced Difference Aesthetic learning (RED-Aes), a novel framework that leverages controllable image editing models to simulate the human aesthetic reasoning process. Instead of fitting absolute score distributions, RED-Aes explicitly learns the visual factors that drive aesthetic changes. To support this paradigm, we construct the RED-20k dataset, which comprises editing-based image pairs, quantitative aesthetic differences, and Chain-of-Thought (CoT) reasoning. Furthermore, we introduce a three-stage training strategy guided by a relative ranking consistency reward, optimizing the model solely via relative supervision. Extensive experiments demonstrate that RED-Aes achieves state-of-the-art performance on multiple public benchmarks, exhibiting superior generalization capabilities.
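The abstract names a relative ranking consistency reward but does not give its formula. As a minimal sketch, one plausible reading is a binary reward that checks whether the model's predicted ordering of an original image and its edited version agrees with the annotated aesthetic difference. The function names, the `tie_margin` parameter, and the 0/1 reward values below are illustrative assumptions, not the authors' definition.

```python
from typing import Iterable, Tuple


def ranking_consistency_reward(
    pred_original: float,
    pred_edited: float,
    annotated_diff: float,
    tie_margin: float = 0.0,
) -> float:
    """Hypothetical reward: 1.0 if the predicted ordering matches the annotation, else 0.0.

    annotated_diff is assumed to be (edited - original); positive means the edit
    improved the image. tie_margin is an assumed tolerance for "no change".
    """
    pred_diff = pred_edited - pred_original
    if abs(annotated_diff) <= tie_margin:
        # Annotated as a tie: reward only predictions that also stay within the margin.
        return 1.0 if abs(pred_diff) <= tie_margin else 0.0
    # Same sign means the model agrees on which image of the pair looks better.
    return 1.0 if pred_diff * annotated_diff > 0 else 0.0


def mean_reward(pairs: Iterable[Tuple[float, float, float]]) -> float:
    """Average the reward over (pred_original, pred_edited, annotated_diff) triples."""
    rewards = [ranking_consistency_reward(o, e, d) for o, e, d in pairs]
    return sum(rewards) / len(rewards) if rewards else 0.0
```

Because the reward depends only on the sign of the score difference within a pair, it needs no absolute MOS labels, which is consistent with the abstract's claim that the model is optimized solely via relative supervision.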
Problem

Research questions and friction points this paper is trying to address.

Image Aesthetic Assessment
generalization
relative comparison
aesthetic perception
absolute scores
Innovation

Methods, ideas, or system contributions that make the work stand out.

Relative Aesthetic Assessment
Image Editing
Generalizable Aesthetics
Chain-of-Thought Reasoning
Relative Supervision
Qifei Jia
Xiaomi Corporation, Beijing, China
Xintong Yao
Xiaomi Corporation, Beijing, China
Minghao Li
Beihang University
Natural Language Processing
Yajie Chai
Xiaomi Corporation, Beijing, China
Qiming Lu
Xiaomi Corporation, Beijing, China
Baoyue Shen
Xiaomi Corporation, Beijing, China
Yasen Zhang
Xiaomi Corporation, Beijing, China
Runyu Shi
Xiaomi Corporation, Beijing, China
Ying Huang
Xiaomi Corporation, Beijing, China
Yue Zhang
Xiaomi Corporation, Beijing, China