SHERPA: Seam-aware Harmonized ERP Adaptation for Open-Domain 360$^\circ$ Panorama Generation

📅 2026-06-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing text-to-image diffusion and flow models are trained on planar images, so they are misaligned with the wrap-around topology and polar regions of 360° panoramas in equirectangular projection (ERP), which can produce visible seam artifacts. This work proposes SHERPA, a lightweight adaptation framework with four components: a frequency-selective Circular RoPE that replaces only the seam-sensitive high-frequency horizontal RoPE band with integer-periodic harmonics while preserving the pretrained lower-frequency spectrum; Circular Latent Encoding/Decoding; image-side FFN adapters; and a Dual-Path Training Scheme. In that scheme, the Paired Panorama Path supervises geometry, while the Unpaired Style Path uses self-supervised yaw consistency so that stylized prompts can be trained without target images. SHERPA generates 360° panoramas for both photorealistic panorama domains and open-domain stylized prompts.
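A yaw rotation of an ERP panorama is a circular shift along the width axis, so an operator that treats the seam correctly should commute with that shift. The following is a minimal sketch of this idea, not SHERPA's implementation: `circular_pad_w`, `blur_w`, and `yaw_consistency_error` are hypothetical helpers, the 1D filter merely stands in for a latent encoder or decoder layer, and the squared-error form of the consistency check is an assumption.

```python
import numpy as np

def circular_pad_w(x, pad):
    """Pad the width axis (last axis) by wrapping columns around the seam."""
    return np.concatenate([x[..., -pad:], x, x[..., :pad]], axis=-1)

def blur_w(x, kernel):
    """Horizontal filter with wrap-around padding (stand-in for a conv layer).

    Assumes an odd kernel length of at least 3.
    """
    pad = len(kernel) // 2
    xp = circular_pad_w(x, pad)
    out = np.zeros_like(x)
    for k, w in enumerate(kernel):
        out += w * xp[..., k:k + x.shape[-1]]
    return out

def yaw_consistency_error(f, x, shift):
    """Mean squared gap between f(yaw(x)) and yaw(f(x)) for a circular shift."""
    rolled_then_f = f(np.roll(x, shift, axis=-1))
    f_then_rolled = np.roll(f(x), shift, axis=-1)
    return float(np.mean((rolled_then_f - f_then_rolled) ** 2))

rng = np.random.default_rng(0)
latent = rng.normal(size=(4, 8, 16))          # toy (channels, height, width) latent
kernel = np.array([0.25, 0.5, 0.25])
err = yaw_consistency_error(lambda z: blur_w(z, kernel), latent, shift=5)
```

With wrap-around padding the error vanishes up to floating-point rounding. Replacing `circular_pad_w` with zero padding would make the columns next to the seam disagree, which is the kind of mismatch a yaw consistency term can penalize without needing a target image.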
📝 Abstract
Panoramic imagery is increasingly used in world-generation, games, and simulation, where users may need not only photorealistic scenes but also stylized and non-photorealistic environments. Large-scale text-to-image diffusion and flow models provide broad style and semantic priors for this goal, but planar image training misaligns them with the wrap-around topology and polar regions of $360^\circ$ panoramas represented in equirectangular projection (ERP). We present SHERPA, a lightweight adaptation framework that combines frequency-selective Circular RoPE, Circular Latent Encoding/Decoding, image-side FFN adapters, and a Dual-Path Training Scheme. Circular RoPE replaces only the seam-sensitive high-frequency horizontal RoPE band with integer-periodic harmonics while preserving the pretrained lower-frequency spectrum. The Paired Panorama Path supervises geometry, while the Unpaired Style Path uses self-supervised yaw consistency for target-free stylized prompts. As a result, SHERPA generates $360^\circ$ panoramas across both photorealistic panorama domains and open-domain stylized prompts.
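The Circular RoPE idea can be illustrated with a short sketch. For a panorama that is `width` tokens wide, a horizontal rotary frequency wraps seamlessly only if it completes a whole number of rotations across the width. The code below snaps the high-frequency band to the nearest integer harmonic and leaves the rest unchanged. The function name, the `min_cycles` threshold that separates "high" from "low" frequencies, and the nearest-integer rounding rule are all assumptions made for illustration; the abstract does not specify how SHERPA selects or replaces the band.

```python
import numpy as np

def circular_rope_freqs(dim, width, base=10000.0, min_cycles=1.0):
    """Return (original, adjusted, high_mask) horizontal RoPE frequencies.

    Assumption: a frequency counts as "high" when it completes at least
    `min_cycles` rotations across `width` tokens. Those are snapped to the
    nearest integer number of cycles; lower frequencies are left untouched.
    """
    i = np.arange(dim // 2)
    theta = base ** (-2.0 * i / dim)            # standard RoPE frequencies
    cycles = theta * width / (2.0 * np.pi)      # rotations across the full width
    high = cycles >= min_cycles
    snapped = 2.0 * np.pi * np.maximum(np.round(cycles), 1.0) / width
    return theta, np.where(high, snapped, theta), high

width = 128
theta, theta_circ, high = circular_rope_freqs(dim=64, width=width)

# Rotations accumulated over one full wrap: integers for the snapped band,
# so position `width` carries the same phase as position 0.
wrap_cycles = theta_circ[high] * width / (2.0 * np.pi)
```

Because only the snapped band changes, the low-frequency entries of `theta_circ` equal the pretrained values exactly, which mirrors the stated goal of preserving the lower-frequency spectrum.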
Problem

Research questions and friction points this paper is trying to address.

360° panorama
equirectangular projection
seam artifact
open-domain generation
wrap-around topology
Innovation

Methods, ideas, or system contributions that make the work stand out.

Circular RoPE
Dual-Path Training
Equirectangular Projection
Seam-aware Adaptation
Open-Domain Stylization