🤖 AI Summary
This paper systematically defines and empirically investigates the “default image” phenomenon in text-to-image (TTI) generation—where models produce highly similar, redundant outputs for unknown, semantically ambiguous, or neologistic prompts. Conducting black-box interaction experiments on Midjourney, we employ manually crafted invalid prompts, quantitative image similarity analysis (e.g., CLIP-based embedding distance), and a structured user survey with statistical hypothesis testing (t-tests, ANOVA). Our key contributions are: (1) a reproducible methodology to trigger default images; (2) empirical confirmation of their high cross-prompt consistency; (3) statistically significant evidence that default images substantially degrade user-perceived credibility and practical utility of generated outputs (p < 0.01); and (4) establishment of a novel benchmark and research agenda for prompt robustness, model interpretability, and controllable generation in TTI systems.
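The cross-prompt consistency measurement described above can be sketched as pairwise cosine similarity over image embeddings. The snippet below is a minimal illustration, assuming embeddings (e.g., CLIP image features) have already been extracted; the array values here are placeholders, not data from the paper.

```python
import numpy as np

def pairwise_cosine_similarity(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between row-wise image embeddings."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normed @ normed.T

# Illustrative placeholder embeddings (in practice: CLIP features of generated images).
emb = np.array([[1.0, 0.0],
                [0.9, 0.1],
                [0.0, 1.0]])
sim = pairwise_cosine_similarity(emb)
# Near-1.0 off-diagonal entries flag near-duplicate outputs across unrelated prompts,
# the signature of a "default image".
```

A high mean off-diagonal similarity across images generated from unrelated invalid prompts would indicate the redundancy the paper reports.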
📝 Abstract
In the creative practice of text-to-image (TTI) generation, images are generated from text prompts. However, TTI models are trained to always yield an output, even if the prompt contains unknown terms. In such cases, the model may generate what we call "default images": images that closely resemble each other across many unrelated prompts. We argue that studying default images is valuable for designing better solutions for TTI and prompt engineering. In this paper, we provide the first investigation into default images on Midjourney, a popular image generator. We describe our systematic approach to creating input prompts that trigger default images, and present the results of our initial experiments and several small-scale ablation studies. We also report on a survey study investigating how default images affect user satisfaction. Our work lays the foundation for understanding default images in TTI and highlights challenges and future research directions.