A Modular Unsupervised Framework for Attribute Recognition from Unstructured Text

📅 2025-07-05
🤖 AI Summary
This paper addresses unsupervised extraction of structured attributes from unstructured text without task-specific fine-tuning. It proposes POSID, a lightweight, modular, and fully unsupervised framework that combines lexical matching with semantic similarity to associate attributes with individual sentences, requiring no labeled data. POSID decouples attribute discovery from attribute binding, which supports flexible extension and plug-and-play deployment, and all components rely solely on off-the-shelf pretrained language models. Although the method is designed to be adaptable across domains, it is evaluated here on human attribute recognition in incident reports: on the InciText missing-person dataset, POSID extracts attributes effectively without supervised training.

📝 Abstract
We propose POSID, a modular, lightweight and on-demand framework for extracting structured attribute-based properties from unstructured text without task-specific fine-tuning. While the method is designed to be adaptable across domains, in this work, we evaluate it on human attribute recognition in incident reports. POSID combines lexical and semantic similarity techniques to identify relevant sentences and extract attributes. We demonstrate its effectiveness on a missing person use case using the InciText dataset, achieving effective attribute extraction without supervised training.
Problem

Research questions and friction points this paper is trying to address.

Extracting structured attributes from unstructured text
Adapting across domains without task-specific fine-tuning
Identifying attribute-relevant sentences without labeled training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular unsupervised framework for attribute recognition
Combines lexical and semantic similarity techniques
Lightweight on-demand extraction without fine-tuning
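
The combination of lexical and semantic similarity described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the keyword list, the attribute description, and the mixing weight `alpha` are all hypothetical, and `trigram_embed` is a toy character-trigram stand-in for the off-the-shelf pretrained encoder that the paper relies on. The sketch only shows one plausible way to rank sentences by a weighted sum of a keyword-match score and a cosine-similarity score.

```python
import math
import re
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(sentence: str, keywords: List[str]) -> float:
    """Fraction of attribute keywords that occur as tokens in the sentence."""
    if not keywords:
        return 0.0
    tokens = set(tokenize(sentence))
    return sum(1 for k in keywords if k.lower() in tokens) / len(keywords)


def trigram_embed(text: str) -> Vector:
    """Toy stand-in for a pretrained sentence encoder (assumption):
    counts of character trigrams over the normalized text."""
    s = " ".join(tokenize(text))
    return dict(Counter(s[i:i + 3] for i in range(len(s) - 2)))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(
    sentences: List[str],
    keywords: List[str],
    description: str,
    embed: Callable[[str], Vector] = trigram_embed,
    alpha: float = 0.5,  # hypothetical weight between lexical and semantic scores
) -> List[Tuple[float, str]]:
    """Score each sentence for one attribute and sort from best to worst."""
    query = embed(description)
    scored = [
        (alpha * lexical_score(s, keywords)
         + (1 - alpha) * cosine(embed(s), query), s)
        for s in sentences
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


report = [
    "He was last seen wearing a blue jacket.",
    "The weather was cloudy that day.",
    "Police were notified at noon.",
]
ranked = rank_sentences(
    report,
    keywords=["wearing", "jacket", "shirt"],
    description="clothing the person was wearing",
)
print(ranked[0][1])
```

In a real pipeline the `embed` argument would be replaced by a call into a pretrained sentence-embedding model returning dense vectors, with `cosine` adapted accordingly. Keeping the encoder as a parameter mirrors the modular, plug-and-play design the summary describes.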