AI Summary
Existing real-world image super-resolution methods typically adopt a single restoration objective and struggle to satisfy the divergent demands of fidelity and visual aesthetics at the same time. This work proposes FoA-SR, which introduces, for the first time, a profile-oriented preference optimization mechanism that explicitly partitions the super-resolution task into distinct "fidelity" and "aesthetics" objectives. Both profiles share one base model and one candidate pool, and are disentangled by ranking the candidates with independent, profile-specific rewards. The method builds on the FLUX.2 architecture: a shared SR adapter conditioned on low-resolution (LR) latents is trained with flow matching and image-space reconstruction losses, and LoRA adapters are then fine-tuned for preference alignment. Experiments on RealSR and DIV2K indicate that the fidelity adapter improves reference-based consistency metrics, while the aesthetics adapter improves no-reference perceptual quality.
Abstract
Real-world image super-resolution (SR) is often designed with a single restoration objective, even though current generative models can produce multiple high-quality reconstructions for the same input. In this paper, we argue that the best restoration strategy depends on the restoration profile: a Faithful restoration prioritizes reference consistency, structure preservation, and hallucination suppression, whereas an Aesthetic restoration prioritizes visually pleasing and natural-looking details. We propose FoA-SR, a novel profile-based preference optimization approach to real-world SR. FoA-SR starts from Flux2SR, our supervised FLUX.2-based SR adapter, trained with LR latent conditioning, flow matching, and image-space reconstruction losses for paired LR-to-HR image super-resolution. With this shared supervised adapter in place, FoA-SR generates a shared stochastic candidate pool for each input image and ranks the same candidates using profile-specific Faithful and Aesthetic rewards to mine winner-loser pairs. These pairs are used to fine-tune separate LoRA adapters while keeping the base model frozen. Experiments on RealSR and DIV2K show that FoA-SR can steer the same SR adapter towards distinct restoration objectives: a Faithful adapter improves reference-consistent metrics, while an Aesthetic adapter boosts no-reference perceptual quality metrics. Our candidate-pool analysis shows that Faithful and Aesthetic rewards frequently select different winners, and a Hybrid-LoRA ablation shows that collapsing both profiles into one reward yields an implicit compromise rather than explicit profile control.
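The pair-mining step can be illustrated with a minimal sketch. It is not the authors' implementation: the `Candidate` record, the `mine_pair` helper, the `min_gap` threshold, and the rule of taking the top-ranked candidate as winner and the bottom-ranked as loser are all assumptions made for illustration, and the scores are made-up placeholders rather than results from the paper. The point is only to show how one shared pool, ranked by two different rewards, can yield different winner-loser pairs per profile.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one sampled SR output of a single LR input."""
    cid: int
    fidelity: float   # placeholder Faithful reward (higher is better)
    aesthetic: float  # placeholder Aesthetic reward (higher is better)


Reward = Callable[[Candidate], float]


def mine_pair(
    pool: List[Candidate], reward: Reward, min_gap: float = 0.0
) -> Optional[Tuple[Candidate, Candidate]]:
    """Return (winner, loser) under one reward, or None if no usable pair.

    Assumed rule: winner = highest-scoring candidate, loser = lowest-scoring
    candidate; the pair is dropped when the reward gap is not above min_gap.
    """
    if len(pool) < 2:
        return None
    ranked = sorted(pool, key=reward, reverse=True)
    winner, loser = ranked[0], ranked[-1]
    if reward(winner) - reward(loser) <= min_gap:
        return None
    return winner, loser


# One shared candidate pool; the scores are illustrative placeholders.
pool = [
    Candidate(cid=0, fidelity=0.82, aesthetic=0.40),
    Candidate(cid=1, fidelity=0.75, aesthetic=0.71),
    Candidate(cid=2, fidelity=0.60, aesthetic=0.55),
]

# The same candidates are ranked twice, once per profile.
faithful_pair = mine_pair(pool, lambda c: c.fidelity)
aesthetic_pair = mine_pair(pool, lambda c: c.aesthetic)

for name, pair in (("Faithful", faithful_pair), ("Aesthetic", aesthetic_pair)):
    if pair is not None:
        winner, loser = pair
        print(f"{name}: winner={winner.cid}, loser={loser.cid}")
```

In this toy pool the Faithful reward picks candidate 0 as winner, while the Aesthetic reward picks candidate 1 and treats candidate 0 as the loser. Under this sketch, each profile's pairs would then feed the preference fine-tuning of that profile's own LoRA adapter.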