Illusion3D: 3D Multiview Illusion with 2D Diffusion Priors

📅 2024-12-12

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses two limitations in visual illusion generation: (1) traditional shadow art and wire art are limited to simple visual outputs such as figure-ground silhouettes or line drawings, and (2) existing diffusion-based methods are restricted to 2D images. To this end, we propose the first generative framework for 3D multiview illusions. Our method transfers 2D diffusion priors to neural 3D representations (specifically NeRF and 3D Gaussian Splatting) by optimizing geometry and texture parameters through differentiable rendering, guided by a pre-trained text-to-image diffusion model. This enables text- or image-conditioned synthesis of 3D illusions. Our core contribution is a framework in which a single 3D model presents semantically distinct, structurally complex interpretations (e.g., letters, symbols, or concrete objects) from different rendering viewpoints, extending generative visual illusions from 2D to 3D and giving users control over what each view shows.

📝 Abstract

Automatically generating multiview illusions is a compelling challenge, where a single piece of visual content offers distinct interpretations from different viewing perspectives. Traditional methods, such as shadow art and wire art, create interesting 3D illusions but are limited to simple visual outputs (i.e., figure-ground or line drawing), restricting their artistic expressiveness and practical versatility. Recent diffusion-based illusion generation methods can generate more intricate designs but are confined to 2D images. In this work, we present a simple yet effective approach for creating 3D multiview illusions based on user-provided text prompts or images. Our method leverages a pre-trained text-to-image diffusion model to optimize the textures and geometry of neural 3D representations through differentiable rendering. When viewed from multiple angles, this produces different interpretations. We develop several techniques to improve the quality of the generated 3D multiview illusions. We demonstrate the effectiveness of our approach through extensive experiments and showcase illusion generation with diverse 3D forms.

Problem

Research questions and friction points this paper is trying to address.

Automatically generating 3D multiview illusions from user-provided text prompts or images

Overcoming the simple visual outputs (figure-ground or line drawing) of traditional methods such as shadow art and wire art

Extending diffusion-based illusion generation from 2D to 3D

Innovation

Methods, ideas, or system contributions that make the work stand out.

Leverages a pre-trained text-to-image diffusion model as a 2D prior

Optimizes the textures and geometry of neural 3D representations

Uses differentiable rendering for multiview illusions
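
The core idea listed above, one 3D representation optimized so that each viewpoint renders a different image, can be illustrated with a toy sketch. This is not the paper's method. It makes three simplifying assumptions: a dense voxel grid stands in for NeRF or 3D Gaussian Splatting, an orthographic mean projection stands in for differentiable rendering, and a plain L2 loss against fixed target images stands in for the guidance from a pre-trained diffusion model. The names `render` and `optimize_illusion` are hypothetical.

```python
import numpy as np


def render(volume, axis):
    """Toy orthographic 'render': average the voxel values along one axis."""
    return volume.mean(axis=axis)


def optimize_illusion(targets, size=8, steps=200, lr=2.0, seed=0):
    """Fit one voxel grid so that each view matches its own target image.

    targets maps a viewing axis (0, 1 or 2) to a (size, size) target image.
    ASSUMPTION: the L2 loss to fixed images replaces the diffusion prior.
    """
    rng = np.random.default_rng(seed)
    volume = rng.uniform(0.0, 1.0, (size, size, size))
    for _ in range(steps):
        grad = np.zeros_like(volume)
        for axis, target in targets.items():
            residual = render(volume, axis) - target
            # Gradient of 0.5 * ||mean_axis(V) - T||^2 with respect to V:
            # the residual, broadcast back along the viewing axis, over size.
            grad += np.expand_dims(residual, axis) / size
        volume -= lr * grad  # plain gradient descent, values are unconstrained
    return volume


if __name__ == "__main__":
    n = 8
    view_a = np.eye(n)             # a diagonal stroke seen along axis 0
    view_b = np.fliplr(np.eye(n))  # an anti-diagonal stroke seen along axis 1
    vol = optimize_illusion({0: view_a, 1: view_b}, size=n)
    print(np.abs(render(vol, 0) - view_a).max())
    print(np.abs(render(vol, 1) - view_b).max())
```

The two targets in the usage example were chosen to be mutually compatible under this projection: their per-column means agree, so a single volume can reproduce both. Arbitrary target pairs generally admit only a least-squares compromise in this toy setting, which hints at why a learned prior that scores plausibility, rather than an exact per-pixel match, is attractive for this problem.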

🔎 Similar Papers

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion