Structured Language Generation Model for Robust Structure Prediction

📅 2024-02-14
🏛️ arXiv.org
📈 Citations: 2
Influential: 0

🤖 AI Summary
While generative large language models (LLMs) excel at open-ended text generation, they underperform significantly, relative to similarly sized encoder-only models, on structured prediction tasks such as named entity recognition and relation extraction, primarily due to misalignment between their internal linguistic representations and the supervised fine-tuning output space.
Method: We propose a dataset-agnostic, general-purpose structured prediction framework that systematically reformulates sequence-to-sequence modeling as a classification task, integrating loss calibration and structured decoding to unify generative language modeling with classification-based sequence modeling.
Contribution/Results: Our approach matches or approaches the performance of task-specific fine-tuning across diverse structured prediction benchmarks, while substantially improving out-of-distribution generalization robustness. It eliminates the need for dataset-specific architectural or training modifications, offering a viable, unified alternative to conventional custom fine-tuning paradigms.

📝 Abstract
Previous work in structured prediction (e.g., NER, information extraction) using a single model makes use of explicit dataset information, which helps boost in-distribution performance but is orthogonal to robust generalization in real-world situations. To overcome this limitation, we propose the Structured Language Generation Model (SLGM), a framework that reduces sequence-to-sequence problems to classification problems through loss calibration and decoding methods. Our experimental results show that SLGM maintains performance without explicit dataset information and can follow, and potentially replace, dataset-specific fine-tuning.
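As a rough illustration of what reducing a generation step to a classification step can look like, the sketch below renormalizes a model's scores over a fixed label set and computes cross-entropy inside that restricted space. It is not taken from the paper: the function names, the label set, and the scores are all hypothetical, and SLGM's actual loss calibration may differ.

```python
import math

def restricted_log_softmax(logits, allowed):
    """Log-probabilities renormalized over the allowed label tokens only."""
    m = max(logits[t] for t in allowed)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(logits[t] - m) for t in allowed))
    return {t: logits[t] - log_z for t in allowed}

def classification_loss(logits, allowed, gold):
    """Cross-entropy of the gold label within the restricted label space."""
    return -restricted_log_softmax(logits, allowed)[gold]

# Hypothetical vocabulary scores at a single output slot (assumed values).
logits = {"PER": 2.0, "LOC": 1.0, "ORG": 0.0, "the": 5.0, "of": 4.0}
labels = ["PER", "LOC", "ORG"]

log_probs = restricted_log_softmax(logits, labels)
prediction = max(log_probs, key=log_probs.get)  # best label, ignoring non-label tokens
loss = classification_loss(logits, labels, "PER")
```

Although "the" has the highest raw score in this toy example, it cannot be predicted, because only label tokens take part in the normalization.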
Problem

Research questions and friction points this paper is trying to address.

Improves structure prediction in language models
Bridges internal structure representation and output space
Enables robust structured tasks without extra parameters
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reformulates structured prediction as classification problem
Uses reinforced input formatting with structural cues
Employs format-aware decoding to constrain generation
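
The idea of format-aware decoding in the list above can be sketched with a toy greedy decoder that only emits tokens a fixed output template permits. This is a minimal sketch under stated assumptions, not the paper's decoder: the template syntax, `format_aware_decode`, `slot_vocab`, and the per-step scores are all hypothetical.

```python
def format_aware_decode(step_scores, template, slot_vocab):
    """Greedy decoding restricted to what the template allows at each position.

    template:    list of items; a plain string is a literal token that is forced,
                 and ("slot", name) is filled from slot_vocab[name].
    step_scores: one dict of token -> score per output position.
    """
    output = []
    for scores, item in zip(step_scores, template):
        if isinstance(item, tuple):
            allowed = slot_vocab[item[1]]
            # Pick the best-scoring allowed token; unseen tokens score -inf.
            output.append(max(allowed, key=lambda t: scores.get(t, float("-inf"))))
        else:
            # Literal structural token: emitted regardless of the model scores.
            output.append(item)
    return output

# Hypothetical "( span : label )" output format and candidate sets.
template = ["(", ("slot", "span"), ":", ("slot", "label"), ")"]
slot_vocab = {"span": ["Paris", "visited", "Alice"], "label": ["PER", "LOC"]}

# Assumed model scores; the off-format tokens score highest at every step.
step_scores = [
    {"the": 9.0},
    {"Alice": 1.0, "Paris": 2.5, "nice": 7.0},
    {",": 3.0},
    {"LOC": 0.5, "PER": 0.1, "city": 4.0},
    {".": 1.0},
]

decoded = format_aware_decode(step_scores, template, slot_vocab)
```

Here `decoded` is always a well-formed tuple, even though unconstrained greedy decoding on the same scores would produce off-format text.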