🤖 AI Summary
Manual selection of diffusion timesteps for few-shot dense prediction introduces task bias and yields suboptimal performance. Method: This paper proposes a dual-module framework, Task-aware Timestep Selection (TTS) and Timestep Feature Consolidation (TFC), that for the first time treats diffusion timesteps as learnable, task-oriented variables within a denoising diffusion probabilistic model. TTS selects timesteps via timestep-wise task losses and cross-timestep feature-similarity scores, while TFC consolidates the selected multi-scale features through parameter-efficient adapter-based fine-tuning. Contribution/Results: Evaluated on the large-scale Taskonomy benchmark, the method significantly improves few-shot generalization across diverse dense prediction tasks. It shows strong robustness and consistent performance on arbitrary unseen tasks while eliminating reliance on heuristic timestep scheduling.
📝 Abstract
Denoising diffusion probabilistic models have brought tremendous advances in generative tasks, achieving state-of-the-art performance. Current diffusion-based applications exploit the learned visual representations from the multistep forward-backward Markovian process for single-task prediction by attaching a task-specific decoder. However, the heuristic selection of diffusion timestep features still relies heavily on empirical intuition, often leading to sub-optimal performance biased toward certain tasks. To alleviate this constraint, we investigate the significance of versatile diffusion timestep features by adaptively selecting the timesteps best suited for few-shot dense prediction on an arbitrary unseen task. To this end, we propose two modules: Task-aware Timestep Selection (TTS), which selects ideal diffusion timesteps based on timestep-wise losses and similarity scores, and Timestep Feature Consolidation (TFC), which consolidates the selected timestep features to improve dense predictive performance in a few-shot setting. Together with our parameter-efficient fine-tuning adapter, our framework achieves superior dense prediction performance given only a few support queries. We empirically validate our learnable timestep consolidation method on the large-scale, challenging Taskonomy dataset for dense prediction, particularly in practical universal and few-shot learning scenarios.
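As a rough illustration of the selection idea described above (a minimal sketch, not the paper's actual algorithm: the function name, greedy scheme, and similarity threshold below are our own assumptions), a TTS-style procedure could rank timesteps by their per-timestep task loss and then greedily keep the best ones, skipping any timestep whose features are nearly redundant with an already-selected one:

```python
import numpy as np

def select_timesteps(losses, features, k=3, sim_threshold=0.9):
    """Hypothetical sketch of loss- and similarity-driven timestep selection.

    losses:   array of shape (T,), per-timestep task loss on the support set
    features: array of shape (T, D), one feature vector per diffusion timestep
    k:        number of timesteps to keep
    sim_threshold: cosine-similarity cutoff above which a candidate timestep
                   is considered redundant with an already-selected one
    """
    order = np.argsort(losses)  # lowest task loss first
    selected = []
    for t in order:
        f = features[t] / np.linalg.norm(features[t])
        # keep t only if it is not too similar to every selected timestep
        if all(f @ (features[s] / np.linalg.norm(features[s])) < sim_threshold
               for s in selected):
            selected.append(t)
        if len(selected) == k:
            break
    return selected
```

The selected timesteps' features would then be fused (in the paper, via TFC's adapter-based consolidation); here the selection step alone shows how a loss criterion and a redundancy check can replace hand-picked timesteps.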