🤖 AI Summary
Existing LLM-based entity structure extraction methods rely heavily on predefined schemas or annotated data, which often leaves the extracted structures incomplete. This paper proposes ZOES, a zero-shot open-schema entity structure discovery framework that requires neither schema priors nor human annotations. Its core idea is an enrichment, refinement, and unification mechanism built on the observation that an entity and its associated attribute-value structure are mutually reinforcing. Experiments across three different domains show that ZOES consistently helps LLMs extract more complete entity structures, indicating that the method generalizes without annotated samples or domain-specific schemas.
📝 Abstract
Entity structure extraction, which aims to extract entities and their associated attribute-value structures from text, is an essential task for text understanding and knowledge graph construction. Existing methods based on large language models (LLMs) typically rely heavily on predefined entity attribute schemas or annotated datasets, often leading to incomplete extraction results. To address this limitation, we introduce Zero-Shot Open-schema Entity Structure Discovery (ZOES), a novel approach to entity structure extraction that does not require any schema or annotated samples. ZOES operates via a principled mechanism of enrichment, refinement, and unification, based on the insight that an entity and its associated structure are mutually reinforcing. Experiments demonstrate that ZOES consistently enhances LLMs' ability to extract more complete entity structures across three different domains, showcasing both the effectiveness and generalizability of the method. These findings suggest that the enrichment, refinement, and unification mechanism may serve as a general approach to improving the quality of LLM-based entity structure discovery in various scenarios.
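The abstract names the three stages but does not say how each one is implemented. The sketch below is only an illustration of how an enrichment, refinement, and unification pipeline over (entity, attribute, value) triplets could be organized; it is not the authors' method. Every name in it (`enrich`, `refine`, `unify`, `discover`, the `Triplet` type) is hypothetical, the extractor callables stand in for LLM prompts, and the string-matching check in `refine` is an assumed placeholder for whatever verification ZOES actually performs.

```python
from collections import defaultdict
from typing import Callable, Iterable

# Hypothetical representation: (entity, attribute, value)
Triplet = tuple[str, str, str]
Extractor = Callable[[str], list[Triplet]]


def enrich(text: str, extractors: Iterable[Extractor]) -> set[Triplet]:
    """Pool candidate triplets from several extraction passes.

    Each extractor is a stand-in for one LLM prompt (assumption).
    """
    candidates: set[Triplet] = set()
    for extract in extractors:
        candidates.update(extract(text))
    return candidates


def refine(text: str, candidates: set[Triplet]) -> set[Triplet]:
    """Keep triplets whose entity and value both occur in the text.

    This substring test is a crude, assumed proxy for verification.
    """
    lowered = text.lower()
    return {
        t for t in candidates
        if t[0].lower() in lowered and t[2].lower() in lowered
    }


def unify(triplets: set[Triplet]) -> dict[str, dict[str, set[str]]]:
    """Merge triplets into one structure per normalized entity name."""
    merged: dict[str, dict[str, set[str]]] = defaultdict(lambda: defaultdict(set))
    for entity, attribute, value in triplets:
        merged[entity.strip().lower()][attribute.strip().lower()].add(value.strip())
    # Convert nested defaultdicts to plain dicts for a stable return type.
    return {entity: dict(attrs) for entity, attrs in merged.items()}


def discover(text: str, extractors: Iterable[Extractor]) -> dict[str, dict[str, set[str]]]:
    """Run the three stages in order: enrich, then refine, then unify."""
    return unify(refine(text, enrich(text, extractors)))
```

With two toy extractors in place of LLM calls, `discover` merges the differently cased mentions of one entity into a single structure and drops a triplet whose value never appears in the text. No schema is passed in at any point: the attribute keys come entirely from the extraction passes, which is the open-schema property the abstract describes.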