AnimatePainter: A Self-Supervised Rendering Framework for Reconstructing Painting Process

📅 2025-03-21

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the limitation of existing painting process generation methods, which rely heavily on task-specific datasets and labor-intensive manual annotations, thereby hindering generalization to arbitrary input images. To overcome this, we propose a fully self-supervised framework that requires no ground-truth drawing sequences. Methodologically, we introduce a novel self-supervised video dataset construction pipeline leveraging depth estimation and differentiable stroke rendering. We design a dedicated fusion layer that explicitly models two fundamental human drawing behaviors, refinement and layering, and integrate it into a video diffusion architecture so that the model generates a temporally coherent video of strokes being removed from the finished image, which is the painting process in reverse. Experiments demonstrate that our approach produces high-fidelity, temporally consistent, human-like painting videos across diverse image categories, significantly outperforming supervised baselines. To our knowledge, this is the first method capable of universal painting process generation without any manually annotated drawing data.

📝 Abstract

Humans can intuitively decompose an image into a sequence of strokes to create a painting, yet existing methods for generating drawing processes are limited to specific data types and often rely on expensive human-annotated datasets. We propose a novel self-supervised framework for generating drawing processes from any type of image, treating the task as a video generation problem. Our approach reverses the drawing process by progressively removing strokes from a reference image, simulating a human-like creation sequence. Crucially, our method does not require costly datasets of real human drawing processes; instead, we leverage depth estimation and stroke rendering to construct a self-supervised dataset. We model human drawings as "refinement" and "layering" processes and introduce depth fusion layers to enable video generation models to learn and replicate human drawing behavior. Extensive experiments validate the effectiveness of our approach, demonstrating its ability to generate realistic drawings without the need for real drawing process data.
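
To make the "layering" idea and the reversed sequence concrete, here is a toy sketch in Python. It is not the paper's pipeline: it assumes that layers are equal-width depth bins, that larger depth values mean farther away, and that farther layers are painted first. It also skips stroke rendering entirely and copies pixels directly. The function name `layered_frames` and its parameters are hypothetical.

```python
import numpy as np

def layered_frames(image, depth, num_layers=4, canvas_value=1.0):
    """Toy illustration: reveal `image` one depth layer at a time.

    Assumptions (not from the paper): larger depth = farther away,
    layers are equal-width depth bins, and far layers are painted first.
    """
    # Split the depth range into `num_layers` equal-width bins.
    edges = np.linspace(depth.min(), depth.max(), num_layers + 1)
    # Per-pixel layer index: 0 = nearest bin, num_layers - 1 = farthest bin.
    layer = np.clip(np.digitize(depth, edges[1:-1]), 0, num_layers - 1)

    # Start from a blank canvas and keep a snapshot after every layer.
    canvas = np.full_like(image, canvas_value, dtype=float)
    frames = [canvas.copy()]
    for k in range(num_layers - 1, -1, -1):  # farthest layer first
        mask = layer == k                    # (H, W) boolean mask
        canvas[mask] = image[mask]           # copy that layer's pixels
        frames.append(canvas.copy())
    return frames

# Tiny synthetic example: random colors, depth increasing row by row.
image = np.random.default_rng(0).random((4, 4, 3))
depth = np.arange(16, dtype=float).reshape(4, 4)

forward = layered_frames(image, depth, num_layers=4)  # blank -> finished
removal = forward[::-1]                               # finished -> blank
```

Here `forward` runs from a blank canvas to the finished image, and `removal` is the same sequence reversed, which corresponds to the "progressively removing" direction the abstract describes. In practice the depth map would come from a depth estimator and each step would be rendered as strokes rather than copied pixels.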

Problem

Research questions and friction points this paper is trying to address.

Reconstructs painting process without human-annotated datasets

Generates drawing sequences from any image type

Simulates human-like stroke removal for self-supervised learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised framework for drawing process generation

Reverses drawing process via stroke removal simulation

Uses depth fusion layers to model human drawing behavior
