Aggregating LLM-Based Weak Verifiers for Spatial Layout Generation

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the challenge of verifying whether spatial layouts, such as 3D rooms or 2D posters, conform to natural language descriptions, a task on which directly prompting large language models (LLMs) as judges performs poorly. The authors use an LLM to generate multiple weak verifiers expressed in a domain-specific language (DSL), then aggregate them into a strong verifier using weak learning techniques. Learning the aggregation requires only about ten human-labeled example layouts. The resulting strong verifiers raise F1 scores by up to 7× over a set of LLM judges across a variety of 3D room layout and 2D poster design tasks. When natural language feedback from the strong verifiers guides layout generation, the layout quality of a base generator improves by up to 66.2% according to a human evaluator, suggesting that useful feedback can be obtained at low annotation cost.
📝 Abstract
We present a pipeline for building and aggregating task-specific, LLM-generated weak (imperfect) verifiers into a strong verifier for spatial layout domains. Given a task description, our pipeline asks an LLM to synthesize a collection of verifier programs using a layout verification DSL. Each individual LLM-generated verifier usually provides an imperfect check for a match between the layout and the corresponding task description. We show that by aggregating the responses of many such verifiers we can produce a stronger verifier. Moreover, by applying techniques from weak learning, our pipeline can learn how to aggregate the weak verifiers from a very sparse set of human-labeled example layouts (about 10). We find that the strong verifiers produced by our pipeline outperform the status-quo approach of using a set of LLM judges to directly check whether a layout matches a task description, raising F1 scores by up to 7× across a variety of 3D room layout and 2D poster design tasks. We also demonstrate that verifier-guided layout generation using natural language feedback from our strong verifiers improves layout quality of a base layout generator by up to 66.2% according to a human evaluator.
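The aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the paper's pipeline: the abstract does not show the layout verification DSL or name the specific weak learning technique, so plain Python functions stand in for the LLM-generated verifier programs, and each verifier's vote is weighted by the log-odds of its smoothed accuracy on ten labeled layouts (a standard weighted-vote scheme, used here as an assumption). All names, the toy task, and the data are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

Layout = Dict[str, Tuple[float, float]]  # object name -> (x, y) position
Verifier = Callable[[Layout], bool]

# Hypothetical weak verifiers for the toy task "the lamp is next to the sofa".
def lamp_near_sofa(layout: Layout) -> bool:
    (lx, ly), (sx, sy) = layout["lamp"], layout["sofa"]
    return math.hypot(lx - sx, ly - sy) < 1.5

def lamp_right_of_sofa(layout: Layout) -> bool:
    return layout["lamp"][0] > layout["sofa"][0]

def always_true(layout: Layout) -> bool:
    return True  # an uninformative verifier

def learn_weights(verifiers: List[Verifier],
                  labeled: List[Tuple[Layout, bool]],
                  smoothing: float = 1.0) -> List[float]:
    """Assumed scheme: weight = log-odds of smoothed accuracy on the labeled set."""
    weights = []
    for v in verifiers:
        correct = sum(v(layout) == label for layout, label in labeled)
        acc = (correct + smoothing) / (len(labeled) + 2 * smoothing)
        weights.append(math.log(acc / (1 - acc)))
    return weights

def strong_verify(layout: Layout, verifiers: List[Verifier],
                  weights: List[float]) -> bool:
    """Weighted vote: a pass adds the verifier's weight, a fail subtracts it."""
    score = sum(w if v(layout) else -w for v, w in zip(verifiers, weights))
    return score > 0

# Ten hand-labeled toy layouts (sofa at the origin; label = lamp is next to it).
lamp_positions = [(1, 0), (0.5, 0.5), (-1, 0), (0, 1), (-0.5, -0.5),
                  (3, 0), (4, 1), (-3, 0), (2, 2), (0, -3)]
labels = [True] * 5 + [False] * 5
labeled = [({"lamp": p, "sofa": (0.0, 0.0)}, y)
           for p, y in zip(lamp_positions, labels)]

verifiers = [lamp_near_sofa, lamp_right_of_sofa, always_true]
weights = learn_weights(verifiers, labeled)

near = {"lamp": (1.0, 0.2), "sofa": (0.0, 0.0)}
far = {"lamp": (5.0, 0.0), "sofa": (0.0, 0.0)}
print(strong_verify(near, verifiers, weights), strong_verify(far, verifiers, weights))
```

In this toy setup the accurate verifier receives a large positive weight, the uninformative one receives weight zero, and the verifier that is wrong more often than right receives a small negative weight, so its vote is effectively inverted.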
Problem

Research questions and friction points this paper is trying to address.

spatial layout verification
LLM-based verifiers
weak learning
layout generation
natural language feedback
Innovation

Methods, ideas, or system contributions that make the work stand out.

weak verification
LLM-based verifier aggregation
spatial layout generation
layout verification DSL
weak learning