DiTTo: Scalable Order-aware All-in-One Image Restoration Agent

📅 2026-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Real-world images often suffer from multiple concurrent degradations, and the order in which these degradations are removed significantly affects restoration quality. Existing training-based agents are computationally expensive when constructing optimal restoration trajectories, and they cannot accommodate newly introduced restoration experts without full retraining. To address this, the work proposes the DiTTo framework, which has two parts: a DiTTo Simulator that generates trajectory data efficiently by combining single-step restoration simulation (∪S-IR) with per-action quality prediction (AiO-IQA), and a DiTTo Agent that learns degradation identification, restoration-action ordering, and output formatting along independent axes. The core innovation is the Order-aware Restoration Alignment (ORA) stage, which enables lightweight, plug-and-play extensibility: integrating a new expert requires updating only the ORA stage. On the MiO-100 evaluation set, with up to five concurrent degradations, DiTTo achieves better restoration quality than existing agent-based approaches.

📝 Abstract

Real-world images rarely suffer from a single degradation, and the order in which degradations are removed substantially affects the final restoration quality. This motivates agent-based image restoration (IR), where a vision-language model schedules a pool of pre-built restoration-experts. However, existing training-based agents require $\mathcal{O}((N^{\mathbf{D}})^{2})$ restoration-expert calls per image to construct the Optimal Restoration-action Trajectory Dataset (ORTD), where $N^{\mathbf{D}}$ denotes the number of degradation types in the universe $\mathbf{D}$. They also couple agent training to a fixed restoration-expert pool, preventing extension to newly introduced restoration-experts without full retraining. To overcome these efficiency and extensibility bottlenecks, we propose DiTTo, a novel order-aware image restoration agent framework consisting of the DiTTo Simulator and the DiTTo Agent. The DiTTo Simulator combines $\cup$S-IR for single-step restoration-action simulation and AiO-IQA for per-action quality prediction, reducing ORTD construction to $\mathcal{O}(N^{\mathbf{D}})$ simulator calls per image. The DiTTo Agent is trained by SFT on the simulator-generated ORTD, followed by Order-aware Restoration Alignment (ORA), which aligns degradation identification, restoration-action ordering, and output format along independent axes. This enables plug-and-play scalable extensibility: adding a new restoration-expert requires updating only the lightweight ORA stage. On the MiO-100 evaluation set with up to five concurrent degradations, our DiTTo Agent achieves state-of-the-art multi-degradation restoration quality among agent-based IR methods.
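
To make the quadratic-versus-linear call counts concrete, here is a minimal sketch under stated assumptions. The abstract does not say how existing agents search for the optimal trajectory; the code assumes a greedy step-wise search that tries every remaining restoration-expert at each step, which is one scheme that produces a quadratic number of calls. It also assumes the simulator needs one call per step. The function names, the degradation labels, and the `PRIORITY` scores are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def greedy_trajectory(
    degradations: Sequence[str],
    quality_after: Callable[[List[str], str], float],
) -> Tuple[List[str], int]:
    """Greedy step-wise ordering (an assumed search scheme, not the paper's).

    Returns the chosen restoration order and the number of expert calls made.
    """
    remaining = sorted(degradations)
    order: List[str] = []
    calls = 0
    while remaining:
        scored = []
        for action in remaining:
            calls += 1  # one hypothetical expert call per candidate action
            scored.append((quality_after(order, action), action))
        _, best = max(scored)  # keep the action with the highest quality
        order.append(best)
        remaining.remove(best)
    return order, calls


def simulator_calls(n: int) -> int:
    """Assumption: one simulator call per step scores all candidates at once."""
    return n


# Made-up quality scores, used only to drive the toy example.
PRIORITY: Dict[str, float] = {"rain": 3.0, "haze": 2.0, "noise": 1.0}


def toy_quality(done: List[str], action: str) -> float:
    """Toy stand-in for running an expert and measuring image quality."""
    return PRIORITY[action]


order, calls = greedy_trajectory(["noise", "haze", "rain"], toy_quality)
print(order, calls)        # ['rain', 'haze', 'noise'] 6
print(simulator_calls(3))  # 3
```

Under this greedy assumption, n concurrent degradations cost n(n+1)/2 expert calls, so five degradations need 15 calls, against 5 calls in the assumed one-call-per-step simulator setting.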

Problem

Research questions and friction points this paper is trying to address.

image restoration

agent-based IR

restoration-expert extensibility

optimal restoration-action trajectory

multi-degradation

Innovation

Methods, ideas, or system contributions that make the work stand out.

order-aware restoration

scalable agent

image restoration