PromptSplit: Revealing Prompt-Level Disagreement in Generative Models

📅 2026-02-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Different generative models often behave differently under identical prompts, yet interpretable prompt-level analysis methods remain lacking. This work proposes PromptSplit, a scalable kernel-based framework: it constructs tensor-product embeddings of prompts and outputs, computes a kernel covariance matrix for each model, and identifies directions of model divergence within the eigensubspace of the weighted difference between these matrices. Leveraging a random-projection approximation, the algorithm achieves a reduced complexity of O(nr² + r³). Theoretical analysis provides rigorous error bounds, while experiments across text-to-image generation, text generation, and image-captioning tasks demonstrate accurate detection of genuine behavioral differences and precise localization of the prompts responsible for model divergence.

📝 Abstract
Prompt-guided generative AI models have rapidly expanded across vision and language domains, producing realistic and diverse outputs from textual inputs. The growing variety of such models, trained with different data and architectures, calls for principled methods to identify which types of prompts lead to distinct model behaviors. In this work, we propose PromptSplit, a kernel-based framework for detecting and analyzing prompt-dependent disagreement between generative models. For each compared model pair, PromptSplit constructs a joint prompt--output representation by forming tensor-product embeddings of the prompt and image (or text) features, and then computes the corresponding kernel covariance matrix. We utilize the eigenspace of the weighted difference between these matrices to identify the main directions of behavioral difference across prompts. To ensure scalability, we employ a random-projection approximation that reduces computational complexity to $O(nr^2 + r^3)$ for projection dimension $r$. We further provide a theoretical analysis showing that this approximation yields an eigenstructure estimate whose expected deviation from the full-dimensional result is bounded by $O(1/r^2)$. Experiments across text-to-image, text-to-text, and image-captioning settings demonstrate that PromptSplit accurately detects ground-truth behavioral differences and isolates the prompts responsible, offering an interpretable tool for detecting where generative models disagree.
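The pipeline in the abstract — tensor-product prompt–output embeddings, covariance matrices per model, and the eigenspace of their difference — can be sketched in a few lines of NumPy. This is a minimal illustration under assumed inputs (precomputed prompt and output feature matrices); the function name, the plain (unweighted) covariance difference, and the scoring rule are simplifications, not the paper's exact method:

```python
import numpy as np

def promptsplit_scores(prompt_emb, out_a, out_b, r=64, seed=0):
    """Sketch of PromptSplit-style disagreement scoring.

    prompt_emb: (n, dp) prompt features; out_a / out_b: (n, do) output
    features from the two compared models on the same n prompts.
    Returns a per-prompt divergence score (higher = more disagreement).
    """
    rng = np.random.default_rng(seed)
    n, dp = prompt_emb.shape
    do = out_a.shape[1]
    # Joint representation: flatten the outer (tensor) product of the
    # prompt and output features for each sample -> (n, dp*do).
    za = np.einsum('ip,io->ipo', prompt_emb, out_a).reshape(n, dp * do)
    zb = np.einsum('ip,io->ipo', prompt_emb, out_b).reshape(n, dp * do)
    # Random projection to r dimensions for scalability — the step that
    # yields the O(nr^2 + r^3) cost in the abstract.
    R = rng.standard_normal((dp * do, r)) / np.sqrt(r)
    za, zb = za @ R, zb @ R
    # Covariance matrix of each model's projected joint embeddings.
    ca = (za - za.mean(0)).T @ (za - za.mean(0)) / n
    cb = (zb - zb.mean(0)).T @ (zb - zb.mean(0)) / n
    # Leading eigenvector of the covariance difference gives the main
    # direction of behavioral disagreement between the two models.
    w, v = np.linalg.eigh(ca - cb)
    top = v[:, np.argmax(np.abs(w))]
    # Score each prompt by how far apart the two models' joint
    # embeddings sit along that direction.
    return np.abs((za - zb) @ top)
```

Prompts with the largest scores are the candidates for the "critical prompts" the paper localizes; identical outputs from both models yield a score of zero by construction.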
Problem

Research questions and friction points this paper is trying to address.

prompt-level disagreement
generative models
behavioral difference
model comparison
prompt-dependent behavior
Innovation

Methods, ideas, or system contributions that make the work stand out.

PromptSplit
kernel-based framework
tensor-product embeddings
random projection approximation
generative model disagreement