PromptEvolver: Prompt Inversion through Evolutionary Optimization in Natural-Language Space

📅 2026-04-03
🤖 AI Summary
This work addresses a limitation of existing prompt inversion methods for text-to-image generation: they often yield unnatural, hard-to-interpret prompts, which leads to low-fidelity image reconstructions and poor controllability. The authors propose a framework that applies a genetic algorithm directly in the natural-language prompt space and needs only black-box access to the image generator, that is, only its image outputs. A vision-language model guides the evolutionary search, so the method produces semantically coherent, human-readable prompts without any knowledge of the generator's internals. Evaluated across multiple prompt inversion benchmarks, the approach consistently outperforms competing methods in image reconstruction fidelity while improving prompt interpretability.
📝 Abstract
Text-to-image generation has progressed rapidly, but faithfully generating complex scenes still requires extensive trial and error to find the right prompt. In the prompt inversion task, the goal is to recover a textual prompt that can faithfully reconstruct a given target image. Existing methods frequently yield suboptimal reconstructions and produce unnatural, hard-to-interpret prompts that hinder transparency and controllability. In this work, we present PromptEvolver, a prompt inversion approach that generates natural-language prompts while achieving high-fidelity reconstructions of the target image. Our method uses a genetic algorithm to optimize the prompt, leveraging a strong vision-language model to guide the evolution process. Importantly, it works with black-box generation models because it requires only their image outputs. We evaluate PromptEvolver across multiple prompt inversion benchmarks and show that it consistently outperforms competing methods.
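The abstract gives only the outline of the method: a genetic algorithm over natural-language prompts, guided by a vision-language model, using nothing but the generator's image outputs. The following is a minimal sketch of that kind of loop, not the authors' implementation. Everything named here is an assumption made for illustration: `evolve_prompt`, its parameters, and the toy stand-ins, where the "image" is just the set of words in the prompt, the similarity score is Jaccard overlap, and random word edits take the place of VLM-proposed mutations.

```python
import random
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical toy vocabulary used by the stand-in mutation operator.
VOCAB = ["a", "red", "blue", "fox", "dog", "jumps", "sleeps",
         "over", "under", "fence", "snow", "in"]


def toy_generate(prompt: str) -> FrozenSet[str]:
    """Stand-in for a black-box generator: the 'image' is the set of prompt words."""
    return frozenset(prompt.lower().split())


def toy_score(target: FrozenSet[str], image: FrozenSet[str]) -> float:
    """Stand-in for an image-similarity metric: Jaccard overlap in [0, 1]."""
    union = target | image
    return len(target & image) / len(union) if union else 1.0


def toy_mutate(prompt: str, rng: random.Random) -> str:
    """Stand-in for a VLM-proposed edit: add, drop, or swap one word at random."""
    words = prompt.split()
    op = rng.choice(["add", "drop", "swap"])
    if op == "add" or not words:
        words.insert(rng.randint(0, len(words)), rng.choice(VOCAB))
    elif op == "drop" and len(words) > 1:
        words.pop(rng.randrange(len(words)))
    else:
        words[rng.randrange(len(words))] = rng.choice(VOCAB)
    return " ".join(words)


def toy_crossover(a: str, b: str, rng: random.Random) -> str:
    """Join a prefix of one parent with a suffix of the other."""
    wa, wb = a.split(), b.split()
    child = wa[:rng.randint(0, len(wa))] + wb[rng.randint(0, len(wb)):]
    return " ".join(child) if child else a


def evolve_prompt(target, seeds: List[str], generate: Callable, score: Callable,
                  mutate: Callable, crossover: Callable, generations: int = 30,
                  pop_size: int = 12, elite: int = 2,
                  seed: int = 0) -> Tuple[str, List[float]]:
    """Elitist genetic search that only ever sees the generator's outputs."""
    rng = random.Random(seed)
    cache: Dict[str, float] = {}  # each generator call is assumed to be expensive

    def fitness(prompt: str) -> float:
        if prompt not in cache:
            cache[prompt] = score(target, generate(prompt))
        return cache[prompt]

    population = list(seeds)
    history: List[float] = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[:max(elite, len(ranked) // 2)]
        children = []
        while len(children) < pop_size - elite:
            a, b = rng.choice(parents), rng.choice(parents)
            children.append(mutate(crossover(a, b, rng), rng))
        # Elites survive unchanged, so the best score never decreases.
        population = ranked[:elite] + children
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history


if __name__ == "__main__":
    target_image = toy_generate("a red fox jumps over a fence in snow")
    best_prompt, scores = evolve_prompt(
        target_image, ["a dog", "blue fence"],
        toy_generate, toy_score, toy_mutate, toy_crossover)
    print(best_prompt, scores[-1])
```

In a realistic setup, `generate` would wrap a text-to-image API call, `score` would compare the generated image with the target image, and `mutate` and `crossover` would ask a vision-language model for fluent rewrites rather than random word edits. How PromptEvolver actually defines these operators is not stated in the abstract.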
Problem

Research questions and friction points this paper is trying to address.

prompt inversion
text-to-image generation
natural-language prompts
image reconstruction
prompt interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt inversion
evolutionary optimization
natural-language prompts
black-box text-to-image generation
vision-language model