Integrating Symbolic Natural Language Understanding and Language Models for Word Sense Disambiguation

📅 2025-11-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Word sense disambiguation (WSD) faces two bottlenecks when the target is a fine-grained semantic representation suited to complex reasoning, such as one built on OpenCyc: existing methods rely heavily on hand-annotated training data, and they mostly target coarse-grained sense inventories (e.g., WordNet synsets or FrameNet frames). This paper proposes a WSD method that needs no hand-annotated training data. A symbolic natural language understanding (NLU) system first generates the candidate meanings, which are converted into distinguishable natural language alternatives; a large language model (LLM) then acts as an oracle, selecting the interpretation that best fits the linguistic context; finally, the selected meanings are propagated back to the symbolic NLU system. The method is evaluated against human-annotated gold answers.

📝 Abstract

Word sense disambiguation is a fundamental challenge in natural language understanding. Current methods primarily target coarse-grained representations (e.g., WordNet synsets or FrameNet frames) and require hand-annotated training data. This makes it difficult to automatically disambiguate richer representations (e.g., those built on OpenCyc) that are needed for sophisticated inference. We propose a method that uses statistical language models as oracles for disambiguation and requires no hand-annotated training data. Instead, the multiple candidate meanings generated by a symbolic NLU system are converted into distinguishable natural language alternatives, which are used to query an LLM to select appropriate interpretations given the linguistic context. The selected meanings are propagated back to the symbolic NLU system. We evaluate our method against human-annotated gold answers to demonstrate its effectiveness.
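The abstract does not name the metric used for the comparison with gold answers. As a minimal sketch, assuming plain accuracy over the annotated items, the scoring could look like this. The function name `accuracy` and the item-to-sense dictionaries are illustrative assumptions, not the paper's evaluation code.

```python
from typing import Mapping


def accuracy(predicted: Mapping[str, str], gold: Mapping[str, str]) -> float:
    """Fraction of gold-annotated items whose predicted sense matches the gold sense.

    Assumption: each item (e.g. a word occurrence id) has exactly one gold sense.
    Items with no prediction count as wrong.
    """
    if not gold:
        raise ValueError("gold must contain at least one annotated item")
    correct = sum(1 for item, sense in gold.items() if predicted.get(item) == sense)
    return correct / len(gold)
```

Counting missing predictions as errors keeps the score comparable across systems that abstain on different items.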

Problem

Research questions and friction points this paper is trying to address.

Disambiguating word senses without hand-annotated training data

Enabling fine-grained meaning representations for language understanding

Integrating symbolic systems with statistical language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses statistical language models as disambiguation oracles

Converts symbolic meanings into natural language alternatives

Propagates the selected meanings back to the symbolic system
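The three steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the `Candidate` shape, the prompt wording, the single-letter answer format, and the `ask_llm` callable (a stand-in for whatever LLM client is used) are all hypothetical, and the concept names in the usage example are placeholders rather than real OpenCyc constants.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate meaning from the symbolic NLU system (hypothetical shape)."""
    concept: str  # symbolic identifier of the meaning
    gloss: str    # natural language paraphrase shown to the LLM


def build_prompt(sentence: str, word: str, candidates: Sequence[Candidate]) -> str:
    """Render the candidate meanings as lettered natural language options."""
    options = "\n".join(
        f"{chr(ord('A') + i)}. {c.gloss}" for i, c in enumerate(candidates)
    )
    return (
        f'Sentence: "{sentence}"\n'
        f'Which meaning of "{word}" fits this sentence?\n'
        f"{options}\n"
        "Answer with a single letter."
    )


def disambiguate(
    sentence: str,
    word: str,
    candidates: Sequence[Candidate],
    ask_llm: Callable[[str], str],  # assumption: takes a prompt, returns the reply text
) -> Optional[Candidate]:
    """Ask the LLM to pick an option; return the matching candidate, or None
    if the reply cannot be mapped to one of the options."""
    reply = ask_llm(build_prompt(sentence, word, candidates)).strip().upper()
    if not reply:
        return None
    index = ord(reply[0]) - ord("A")
    if 0 <= index < len(candidates):
        return candidates[index]
    return None
```

A caller would pass the returned `concept` back to the symbolic system so it can discard the competing interpretations:

```python
candidates = [
    Candidate("RiverBankPlaceholder", "the sloping land beside a river"),
    Candidate("FinancialInstitutionPlaceholder", "an institution that holds money"),
]
# A stub stands in for a real LLM call here.
choice = disambiguate(
    "She deposited cash at the bank.", "bank", candidates, lambda prompt: "B"
)
```

Returning `None` on an unparseable reply leaves the decision to the symbolic system instead of forcing a guess.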
