Controllable Satellite-to-Street-View Synthesis with Precise Pose Alignment and Zero-Shot Environmental Control

📅 2025-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address two key challenges in satellite-to-street-view image generation, namely imprecise pose alignment and uncontrollable environmental conditions (e.g., illumination, weather), this paper proposes a geometry-semantic co-guided controllable diffusion framework. The method has two parts: (1) an Iterative Homography Adjustment (IHA) module explicitly models geometric constraints to enhance spatial consistency; (2) a text-guided zero-shot illumination/weather control mechanism uses a CLIP text encoder to drive a conditional sampling scheduler and requires no paired training data. Experiments demonstrate a 12.6% improvement in pose alignment accuracy (mAP), alongside high-fidelity, diverse street-view synthesis across multiple lighting and weather conditions. The framework establishes new state-of-the-art performance in both visual realism and controllability.

📝 Abstract
Generating street-view images from satellite imagery is a challenging task, particularly in maintaining accurate pose alignment and incorporating diverse environmental conditions. While diffusion models have shown promise in generative tasks, their ability to maintain strict pose alignment throughout the diffusion process is limited. In this paper, we propose a novel Iterative Homography Adjustment (IHA) scheme applied during the denoising process, which effectively addresses pose misalignment and ensures spatial consistency in the generated street-view images. Additionally, currently available datasets for satellite-to-street-view generation are limited in their diversity of illumination and weather conditions, which restricts the generalizability of the generated outputs. To mitigate this, we introduce a text-guided illumination- and weather-controlled sampling strategy that enables fine-grained control over environmental factors. Extensive quantitative and qualitative evaluations demonstrate that our approach significantly improves pose accuracy and enhances the diversity and realism of generated street-view images, setting a new benchmark for satellite-to-street-view generation tasks.
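The abstract does not say how IHA computes or applies its homographies, so the following is only a generic illustration of the underlying primitive, not the authors' method. It is a minimal sketch, assuming NumPy, of how a 3x3 homography maps image points and how one can be estimated from point correspondences with the standard Direct Linear Transform. The function names `apply_homography` and `estimate_homography` and the sample values are hypothetical.

```python
import numpy as np


def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography H."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # lift to homogeneous coords
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide by w to return to Cartesian


def estimate_homography(src, dst):
    """Estimate H with the Direct Linear Transform from >= 4 correspondences."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale


# Hypothetical example: recover a known warp from five point pairs.
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.3]])
H_true = np.array([[1.10, 0.05, 0.20],
                   [-0.03, 0.95, -0.10],
                   [0.01, 0.02, 1.00]])
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
```

An iterative scheme would re-estimate and re-apply such a correction over several steps; how the paper couples this to the denoising process is not specified in the abstract.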
Problem

Research questions and friction points this paper is trying to address.

Maintaining precise pose alignment between satellite and street-view images.
Incorporating diverse environmental conditions in image synthesis.
Improving the realism and diversity of generated street-view images.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative Homography Adjustment scheme
Text-guided environmental control
Enhanced pose alignment accuracy