🤖 AI Summary
Training image restoration models incurs prohibitive computational costs due to large-scale datasets. To address this, this work pioneers the application of dataset distillation to image restoration, proposing the distribution-aware TripleD framework: (1) it evaluates sample complexity using ViT-derived features and performs dynamic selection via uncertainty-weighted sampling; (2) it introduces a two-stage progressive distillation mechanism to enable curriculum-style training scheduling; (3) it aligns source and target distributions in feature space—not pixel space—using a lightweight CNN for target distribution calibration. TripleD achieves state-of-the-art performance across diverse restoration tasks—including multi-task learning, unified restoration, and 4K ultra-high-resolution recovery—while requiring only a single consumer-grade GPU and completing training within eight hours. This reduces computational overhead by 500× compared to conventional full-dataset training.
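The feature-space alignment step (point 3 above) can be sketched as a moment-matching objective: the distilled subset's feature statistics are pulled toward those of the full dataset. The loss below is an illustrative assumption for such a calibration target, not the paper's exact objective for its lightweight CNN:

```python
import numpy as np

def alignment_loss(subset_feats, full_feats):
    # Feature-level distribution alignment: match the first and second
    # moments of the distilled subset to the full dataset.
    # (Assumption: TripleD's actual calibration loss is not specified here.)
    mean_gap = np.sum((subset_feats.mean(0) - full_feats.mean(0)) ** 2)
    cov_gap = np.sum((np.cov(subset_feats.T) - np.cov(full_feats.T)) ** 2)
    return mean_gap + cov_gap

rng = np.random.default_rng(1)
full = rng.normal(size=(200, 8))   # stand-in for full-dataset features
subset = full[:20]                 # a candidate distilled subset
shifted = subset + 2.0             # a deliberately misaligned subset
```

A lightweight CNN trained against such a loss would push `alignment_loss(subset, full)` toward zero, whereas the shifted subset scores much worse.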
📝 Abstract
With the exponential increase in image data, training an image restoration model is laborious. Dataset distillation is a potential solution to this problem, yet existing distillation techniques remain unexplored in the field of image restoration. To fill this gap, we propose the Distribution-aware Dataset Distillation method (TripleD), a new framework that extends the principles of dataset distillation to image restoration. Specifically, TripleD uses a pre-trained vision Transformer to extract features from images for complexity evaluation, and a subset (whose size is much smaller than that of the original training set) is selected based on this complexity. The selected subset is then fed through a lightweight CNN that adjusts the image distribution to align with the distribution of the original dataset at the feature level. To condense knowledge efficiently, training is divided into two stages: the early stage focuses on simpler, low-complexity samples to build foundational knowledge, while the later stage selects more complex and uncertain samples as the model matures. Our method achieves promising performance on multiple image restoration tasks, including multi-task image restoration, all-in-one image restoration, and ultra-high-definition image restoration. Notably, we can train a state-of-the-art image restoration model on an ultra-high-definition (4K resolution) dataset using only one consumer-grade GPU in less than 8 hours (a 500× saving in computing resources and an immeasurable saving in training time).
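The two-stage curriculum described above (low-complexity samples first, harder samples as the model matures) can be sketched as complexity-ranked subset selection. The complexity proxy and selection rule below are assumptions for illustration; the paper scores complexity with ViT features and uncertainty-weighted sampling rather than this simple window over a sorted ranking:

```python
import numpy as np

def complexity_scores(features):
    # Proxy for ViT-based complexity: per-sample feature dispersion.
    # (Assumption: the paper's exact complexity metric is not given here.)
    return features.std(axis=1)

def curriculum_select(scores, k, progress):
    # progress=0.0 -> early stage, pick the k lowest-complexity samples;
    # progress=1.0 -> late stage, pick the k highest-complexity samples.
    order = np.argsort(scores)              # ascending complexity
    start = int(progress * max(len(scores) - k, 0))
    return order[start:start + k]

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))          # stand-in for ViT features
scores = complexity_scores(feats)
easy = curriculum_select(scores, k=10, progress=0.0)   # stage 1
hard = curriculum_select(scores, k=10, progress=1.0)   # stage 2
```

Sliding `progress` from 0 to 1 over training reproduces the easy-to-hard schedule; uncertainty weighting would replace the deterministic window with weighted sampling.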