Semantically Consistent Person Image Generation

📅 2023-02-28

🏛️ International Conference on Pattern Recognition

📈 Citations: 0

✨ Influential: 0

career value

204K/year

🤖 AI Summary

To address semantic misalignment among pose, clothing, and identity attributes under cross-view or cross-motion conditions in portrait generation, this paper proposes a semantic-decoupled dual-stream conditional diffusion model. It explicitly embeds fine-grained attribute constraints into the denoising process for the first time: a pose-aware graph convolutional encoder captures structural priors, while a semantic attention module ensures attribute-layout alignment; an attribute-aware reweighting loss further enforces structural-semantic consistency. Evaluated on DeepFashion and Market-1501, the method reduces FID by 37% and improves human-rated semantic fidelity by 2.1× over prior work. It enables high-fidelity, fine-grained controllable portrait editing with precise attribute manipulation.

Problem

Research questions and friction points this paper is trying to address.

Generate context-aware person images

Blend synthesized instances into complex scenes

Ensure semantic coherence without altering global context

Innovation

Methods, ideas, or system contributions that make the work stand out.

Pix2PixHD model

data-centric approach

multi-scale attention-guided architecture

🔎 Similar Papers

No similar papers found.