A Modular Unsupervised Framework for Attribute Recognition from Unstructured Text

📅 2025-07-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses unsupervised extraction of structured attributes from unstructured text without task-specific fine-tuning. It proposes POSID, a lightweight, modular, on-demand framework that combines lexical matching with semantic similarity to associate sentences with attributes, requiring no labeled data. POSID decouples attribute discovery from attribute binding, which supports flexible extension and plug-and-play deployment, and all components rely solely on off-the-shelf pretrained language models. The method is designed to be adaptable across domains, but the evaluation reported here covers one use case: human attribute recognition in missing-person incident reports from the InciText dataset, where POSID achieves effective attribute extraction without supervised training.

📝 Abstract

We propose POSID, a modular, lightweight and on-demand framework for extracting structured attribute-based properties from unstructured text without task-specific fine-tuning. While the method is designed to be adaptable across domains, in this work, we evaluate it on human attribute recognition in incident reports. POSID combines lexical and semantic similarity techniques to identify relevant sentences and extract attributes. We demonstrate its effectiveness on a missing person use case using the InciText dataset, achieving effective attribute extraction without supervised training.
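As a rough illustration of how lexical and semantic scores might be combined to link sentences to attributes, here is a minimal Python sketch. It is not POSID's implementation: the cue lexicon `ATTRIBUTE_CUES`, the weight `alpha`, the `threshold`, and all function names are hypothetical, and a character-trigram cosine stands in for the embedding-based similarity that a pretrained language model would supply.

```python
import math
import re
from collections import Counter

# Hypothetical cue lexicon (assumption); the paper's actual cues are not given here.
ATTRIBUTE_CUES = {
    "age": ["age", "aged", "years old", "year-old"],
    "clothing": ["wearing", "jacket", "shirt", "jeans"],
    "hair": ["hair", "blonde", "brunette"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_score(sentence, cues):
    """Fraction of cue phrases that appear as whole words in the sentence."""
    lowered = sentence.lower()
    hits = sum(
        1 for cue in cues
        if re.search(r"\b" + re.escape(cue) + r"\b", lowered)
    )
    return hits / len(cues)


def char_ngrams(text, n=3):
    """Count character n-grams over the normalized, space-joined tokens."""
    joined = " ".join(tokenize(text))
    return Counter(joined[i:i + n] for i in range(len(joined) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_score(sentence, cues):
    """Stand-in for embedding similarity (assumption): trigram cosine."""
    return cosine(char_ngrams(sentence), char_ngrams(" ".join(cues)))


def associate(sentences, alpha=0.5, threshold=0.1):
    """Map each attribute to the sentences whose combined score passes the threshold."""
    result = {}
    for attr, cues in ATTRIBUTE_CUES.items():
        result[attr] = [
            s for s in sentences
            if alpha * lexical_score(s, cues)
            + (1 - alpha) * semantic_score(s, cues) >= threshold
        ]
    return result


if __name__ == "__main__":
    report = [
        "Missing: a 34-year-old woman.",
        "She was wearing a red jacket.",
        "The weather was cold.",
    ]
    for attribute, matched in associate(report).items():
        print(attribute, matched)
```

In a real deployment, `semantic_score` would be replaced by a similarity computed from pretrained sentence embeddings, and the weight and threshold would need to be chosen for the target domain.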

Problem

Research questions and friction points this paper is trying to address.

Extracting structured attributes from unstructured text

Adapting across domains without task-specific fine-tuning

Identifying relevant sentences by combining lexical and semantic similarity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular unsupervised framework for attribute recognition

Combines lexical and semantic similarity techniques

Lightweight on-demand extraction without fine-tuning
