MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

📅 2025-04-21
🤖 AI Summary
Diffusion models struggle to enforce physical consistency among shadows, occlusions, and reflections in specular reflection generation, resulting in poor generalization. To address this, we propose a synthetic data augmentation strategy specifically designed for mirror-like reflections—incorporating random pose sampling, object pairing, and ground-contact constraints—alongside a three-stage curriculum learning framework that explicitly models spatial relationships and optical principles. Our method integrates object-level paired sampling, geometry-aware spatial modeling, and progressive learning to achieve high-fidelity, physically consistent reflection synthesis under multi-object scenes, complex occlusions, and arbitrary viewpoints. Quantitative and qualitative evaluations demonstrate significant improvements over existing state-of-the-art methods, with enhanced cross-pose generalization and superior adaptability to real-world scenarios.
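As a rough illustration of the optical principle involved, the virtual image of a point in a planar mirror is the point reflected across the mirror plane. The sketch below is not taken from the paper; the function name `reflect_point` and its arguments are hypothetical, and it only shows the geometric relationship a physically consistent reflection should satisfy.

```python
def reflect_point(point, plane_point, plane_normal):
    """Mirror `point` across the plane through `plane_point` with normal `plane_normal`.

    Hypothetical helper for illustration; not part of the MirrorVerse code.
    """
    # Normalize the plane normal so the dot product below is a true distance.
    norm = sum(c * c for c in plane_normal) ** 0.5
    n = [c / norm for c in plane_normal]
    # Signed distance from the point to the mirror plane.
    d = sum((p - q) * c for p, q, c in zip(point, plane_point, n))
    # Move the point twice that distance along the normal, to the other side.
    return tuple(p - 2.0 * d * c for p, c in zip(point, n))


# A mirror lying in the plane z = 0: a point 3 units in front of it
# appears 3 units behind it.
print(reflect_point((1, 2, 3), (0, 0, 0), (0, 0, 1)))  # (1.0, 2.0, -3.0)
```

Reflecting the result a second time returns the original point, which is a quick sanity check for any reflection routine.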

📝 Abstract
Diffusion models have become central to various image editing tasks, yet they often fail to fully adhere to physical laws, particularly with effects like shadows, reflections, and occlusions. In this work, we address the challenge of generating photorealistic mirror reflections using diffusion-based generative models. Despite extensive training data, existing diffusion models frequently overlook the nuanced details crucial to authentic mirror reflections. Recent approaches have attempted to resolve this by creating synthetic datasets and framing reflection generation as an inpainting task; however, they struggle to generalize across different object orientations and positions relative to the mirror. Our method overcomes these limitations by introducing key augmentations into the synthetic data pipeline: (1) random object positioning, (2) randomized rotations, and (3) grounding of objects, significantly enhancing generalization across poses and placements. To further address spatial relationships and occlusions in scenes with multiple objects, we implement a strategy to pair objects during dataset generation, resulting in a dataset robust enough to handle these complex scenarios. Achieving generalization to real-world scenes remains a challenge, so we introduce a three-stage training curriculum to develop the MirrorFusion 2.0 model to improve real-world performance. We provide extensive qualitative and quantitative evaluations to support our approach. The project page is available at: https://mirror-verse.github.io/.
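The three augmentations named in the abstract (random positioning, randomized rotation, grounding) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the object is a list of (x, y, z) vertices with y pointing up, rotation is restricted to the vertical axis, the floor sits at y = 0, and the sampling ranges and all function names are hypothetical.

```python
import math
import random


def rotate_y(vertices, angle):
    """Rotate (x, y, z) points about the vertical y-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in vertices]


def ground(vertices, floor_y=0.0):
    """Shift points vertically so the lowest one rests exactly on the floor."""
    lift = floor_y - min(y for _, y, _ in vertices)
    return [(x, y + lift, z) for x, y, z in vertices]


def augment_pose(vertices, rng, x_range=(-1.0, 1.0), z_range=(0.5, 2.0)):
    """Apply a random rotation, a random floor position, then grounding.

    The ranges are illustrative assumptions, not values from the paper.
    """
    # (2) Randomized rotation about the vertical axis.
    rotated = rotate_y(vertices, rng.uniform(0.0, 2.0 * math.pi))
    # (1) Random positioning on the floor plane in front of the mirror.
    dx, dz = rng.uniform(*x_range), rng.uniform(*z_range)
    moved = [(x + dx, y, z + dz) for x, y, z in rotated]
    # (3) Grounding: remove any gap between the object and the floor.
    return ground(moved)


# A unit cube that initially floats one unit above the floor.
cube = [(x, y, z) for x in (-0.5, 0.5) for y in (1.0, 2.0) for z in (-0.5, 0.5)]
posed = augment_pose(cube, random.Random(0))
```

After `augment_pose`, the lowest vertex sits on the floor and the object's shape is unchanged, since rotation and translation are rigid transforms.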
Problem

Research questions and friction points this paper is trying to address.

Generating photorealistic mirror reflections using diffusion models
Improving generalization across object orientations and positions
Addressing spatial relationships in scenes with multiple objects
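One simple way to pair objects in a scene without letting them intersect is rejection sampling on their floor footprints. The summary does not describe the paper's actual pairing criteria, so the sketch below is an assumption throughout: each object is approximated by a circular footprint, and `sample_pair`, `extent`, and `margin` are hypothetical names and parameters.

```python
import math
import random


def sample_pair(radius_a, radius_b, rng, extent=2.0, margin=0.05, max_tries=1000):
    """Rejection-sample floor positions (x, z) for two objects.

    Each object is approximated by a circle of the given radius; a sample is
    accepted only if the two circles are separated by at least `margin`.
    """
    for _ in range(max_tries):
        a = (rng.uniform(-extent, extent), rng.uniform(-extent, extent))
        b = (rng.uniform(-extent, extent), rng.uniform(-extent, extent))
        # Accept when the centers are farther apart than the combined radii.
        if math.dist(a, b) >= radius_a + radius_b + margin:
            return a, b
    raise RuntimeError("could not place both objects without overlap")


pos_a, pos_b = sample_pair(0.3, 0.4, random.Random(1))
```

Placing the two objects at different depths relative to the mirror is what produces the occlusions a model would then have to reproduce in the reflection.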
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random object positioning enhances generalization
Randomized rotations improve reflection realism
Three-stage training boosts real-world performance