EasyTune: Efficient Step-Aware Fine-Tuning for Diffusion-Based Motion Generation

📅 2026-02-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of efficiently aligning diffusion-based motion generation with downstream objectives, an effort hindered by coarse-grained optimization, high memory overhead, and scarce preference data. We propose EasyTune, which for the first time identifies the recursive dependency within denoising trajectories as a key optimization bottleneck. By enabling step-wise, independent fine-tuning, EasyTune achieves fine-grained alignment with substantially reduced memory consumption. A Self-Refined Preference Learning (SPL) mechanism further mitigates the limited availability of human preference data. Experiments show that EasyTune improves alignment by 8.2% over DRaFT-50 on the MM-Dist metric while requiring only 31.16% of its additional memory and training 7.3× faster.

📝 Abstract
In recent years, motion generative models have advanced significantly, yet they remain difficult to align with downstream objectives. Recent studies have shown that using differentiable rewards to directly align the preferences of diffusion models yields promising results. However, these methods suffer from (1) inefficient, coarse-grained optimization and (2) high memory consumption. In this work, we first identify, both theoretically and empirically, the key reason for these limitations: the recursive dependence between steps in the denoising trajectory. Inspired by this insight, we propose EasyTune, which fine-tunes the diffusion model at each denoising step rather than over the entire trajectory. This decouples the recursive dependence, allowing us to perform (1) dense, fine-grained and (2) memory-efficient optimization. Furthermore, the scarcity of motion preference pairs restricts the training of motion reward models. To this end, we introduce a Self-Refinement Preference Learning (SPL) mechanism that dynamically identifies preference pairs and conducts preference learning on them. Extensive experiments demonstrate that EasyTune outperforms DRaFT-50 by 8.2% in alignment (MM-Dist) improvement while requiring only 31.16% of its additional memory overhead and achieving a 7.3× training speedup. The project page is available at https://xiaofeng-tan.github.io/projects/EasyTune/index.html.
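The recursive-dependence point in the abstract can be illustrated with a toy numerical example. The sketch below is not the authors' implementation; it is a minimal pure-Python illustration (all function names are hypothetical, and a linear map stands in for the denoiser) of why backpropagating a reward through a whole denoising trajectory chains gradients across every step, whereas treating each step's input as a constant yields one independent, memory-light gradient signal per step.

```python
# Toy "denoiser": one step is x_t = theta * x_{t-1}; reward is R(x) = -x**2
# (we want the final sample near 0). Full-trajectory fine-tuning must apply
# the chain rule through all T steps (the recursive dependence); the
# step-wise variant detaches each step's input, so no cross-step graph
# is ever stored.

def rollout(theta, x0, T):
    """Run T denoising steps and return the full trajectory."""
    xs = [x0]
    for _ in range(T):
        xs.append(theta * xs[-1])
    return xs

def full_trajectory_grad(theta, x0, T):
    """dR(x_T)/dtheta: needs the whole trajectory (all activations).
    For this linear toy, x_T = theta**T * x0, so
    d x_T / d theta = T * theta**(T-1) * x0."""
    x_T = rollout(theta, x0, T)[-1]
    dxT_dtheta = T * theta ** (T - 1) * x0
    return -2.0 * x_T * dxT_dtheta  # one coarse signal per trajectory

def stepwise_grads(theta, x0, T):
    """One gradient per step, with the step input held constant
    (the 'decoupled' regime): d/dtheta of R(theta * x_in)."""
    grads, x_in = [], x0
    for _ in range(T):
        x_out = theta * x_in
        grads.append(-2.0 * x_out * x_in)  # uses only this step's activation
        x_in = x_out                       # no graph retained across steps
    return grads

theta, x0, T = 0.9, 1.0, 5
g_full = full_trajectory_grad(theta, x0, T)
g_steps = stepwise_grads(theta, x0, T)
print(g_full)        # single trajectory-level signal
print(len(g_steps))  # T dense, independent signals (5 here)
```

In an autograd framework the same decoupling would amount to detaching each step's input before computing the per-step reward, so peak memory no longer grows with the number of denoising steps.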
Problem

Research questions and friction points this paper is trying to address.

diffusion-based motion generation
preference alignment
memory efficiency
fine-grained optimization
preference data scarcity
Innovation

Methods, ideas, or system contributions that make the work stand out.

step-aware fine-tuning
diffusion-based motion generation
recursive dependence decoupling
self-refinement preference learning
memory-efficient optimization
Xiaofeng Tan
Research Intern at Tencent; Master's student at Southeast University; dual BSc at Shenzhen University.
AIGC · RLHF
Wanjiang Weng
Southeast University
Haodong Lei
Southeast University
Hongsong Wang
Southeast University