Integrating Symbolic Natural Language Understanding and Language Models for Word Sense Disambiguation

2025-11-20
Citations: 0
Influential: 0

AI Summary
Word Sense Disambiguation (WSD) faces two bottlenecks when the target is a fine-grained semantic representation suited to complex reasoning, such as one built on OpenCyc: existing methods rely on hand-annotated training data, and they are aimed mainly at coarse-grained sense inventories such as WordNet synsets or FrameNet frames. This paper proposes a WSD method that needs no hand-annotated training data. A symbolic natural language understanding (NLU) system first generates the candidate meanings of an ambiguous word, which are converted into distinguishable natural language alternatives; a large language model then acts as an oracle, selecting the alternative that best fits the linguistic context; finally, the selected meaning is propagated back to the symbolic NLU system. The method is evaluated against human-annotated gold answers.

Abstract
Word sense disambiguation is a fundamental challenge in natural language understanding. Current methods are primarily aimed at coarse-grained representations (e.g., WordNet synsets or FrameNet frames) and require hand-annotated training data. This makes it difficult to automatically disambiguate richer representations (e.g., those built on OpenCyc) that are needed for sophisticated inference. We propose a method that uses statistical language models as oracles for disambiguation and requires no hand-annotated training data. Instead, the multiple candidate meanings generated by a symbolic NLU system are converted into distinguishable natural language alternatives, which are used to query an LLM to select appropriate interpretations given the linguistic context. The selected meanings are then propagated back to the symbolic NLU system. We evaluate our method against human-annotated gold answers to demonstrate its effectiveness.
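
The abstract does not say which metric is used for the comparison against gold answers. As a minimal sketch, assuming plain accuracy over annotated word occurrences, the scoring step might look like the following. The function name `accuracy` and the occurrence keys are hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional


def accuracy(predicted: Mapping[str, Optional[str]], gold: Mapping[str, str]) -> float:
    """Fraction of gold-annotated occurrences whose predicted sense matches.

    Keys identify one word occurrence (hypothetical format, e.g. "sentence-id:word").
    A missing or None prediction counts as wrong.
    """
    if not gold:
        raise ValueError("gold annotations must not be empty")
    correct = sum(1 for key, sense in gold.items() if predicted.get(key) == sense)
    return correct / len(gold)


# Illustrative data only; these are not results from the paper.
gold = {"s1:bank": "RiverBank", "s2:bank": "FinancialInstitution"}
predicted = {"s1:bank": "RiverBank", "s2:bank": None}
score = accuracy(predicted, gold)
```

Counting an unanswered occurrence as an error is a design choice made here for simplicity; the paper may score abstentions differently.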

Problem

Research questions and friction points this paper is trying to address.

Disambiguating word senses without hand-annotated training data
Enabling fine-grained meaning representations for language understanding
Integrating symbolic systems with statistical language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses statistical language models as disambiguation oracles
Converts symbolic meanings into natural language alternatives
Propagates selected meanings back to symbolic system
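
The three steps listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Candidate` type, the prompt wording, the reply parsing, and the `ask_llm` callable are all hypothetical, and the concept names in the usage example are made up rather than taken from OpenCyc.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate meaning from the symbolic NLU system (hypothetical shape)."""
    concept: str  # symbolic identifier for the meaning
    gloss: str    # natural language paraphrase of that meaning


def build_prompt(sentence: str, word: str, candidates: Sequence[Candidate]) -> str:
    """Render the candidates as numbered natural language alternatives."""
    options = "\n".join(f"{i}. {c.gloss}" for i, c in enumerate(candidates, start=1))
    return (
        f'Sentence: "{sentence}"\n'
        f'Which meaning of "{word}" fits the sentence?\n'
        f"{options}\n"
        "Answer with the number of the best option."
    )


def parse_choice(reply: str, n_options: int) -> Optional[int]:
    """Return the zero-based index of the first valid option number in the reply."""
    for token in re.findall(r"\d+", reply):
        index = int(token) - 1
        if 0 <= index < n_options:
            return index
    return None


def disambiguate(
    sentence: str,
    word: str,
    candidates: Sequence[Candidate],
    ask_llm: Callable[[str], str],
) -> Optional[Candidate]:
    """Select one candidate to hand back to the symbolic system, or None."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]  # nothing to disambiguate
    reply = ask_llm(build_prompt(sentence, word, candidates))
    index = parse_choice(reply, len(candidates))
    return candidates[index] if index is not None else None


# Usage with a stub in place of a real LLM call (assumption: the model
# replies with an option number).
candidates = [
    Candidate("FinancialInstitution", "a business that holds and lends money"),
    Candidate("RiverBank", "the land alongside a river"),
]
chosen = disambiguate("She sat on the bank and fished.", "bank", candidates, lambda prompt: "2")
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client. Returning `None` on an unparseable reply leaves the fallback decision to the symbolic system.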