A new semantically annotated corpus with syntactic-semantic and cross-lingual senses

📅 2026-05-27
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the scarcity of fine-grained, cross-lingual annotated resources that combine syntactic and semantic information for word sense disambiguation of French polysemous verbs. To bridge this gap, the authors construct a corpus covering 20 high-frequency polysemous verbs. By aligning each instance with the verb's actual translation in an English parallel text and with an entry from the French Lexicon-Grammar dictionary, they propose and implement, for the first time, a three-part joint annotation framework: a translation label, a Lexicon-Grammar entry label, and a fine-grained sense label derived from the two. The resource provides annotated data at a finer sense granularity for word sense disambiguation and offers a basis for lexical semantic representations that are both cross-lingual and syntactically grounded.
📝 Abstract
We describe a new sense-tagged corpus for word sense disambiguation. The corpus consists of instances of 20 French polysemous verbs. Each verb instance is annotated with three sense labels: (1) the actual translation of the verb in the English version of the instance in a parallel corpus, (2) an entry for the verb in a computational dictionary of French (the Lexicon-Grammar tables), and (3) a fine-grained sense label formed by concatenating the translation and the Lexicon-Grammar entry.
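The three-label scheme in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `VerbInstance` class, its field names, the `+` separator, the example sentences, and the placeholder entry identifiers `entry_A` and `entry_B` are hypothetical and are not taken from the corpus or from the Lexicon-Grammar tables. Only the idea that the third label is the concatenation of the first two comes from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VerbInstance:
    """One annotated occurrence of a polysemous French verb (hypothetical schema)."""
    sentence: str      # French sentence containing the verb
    verb: str          # lemma of the target verb
    translation: str   # label 1: translation observed in the English parallel text
    lg_entry: str      # label 2: Lexicon-Grammar entry (placeholder identifier)

    def fine_grained_label(self, sep: str = "+") -> str:
        # Label 3: concatenation of the translation and the Lexicon-Grammar entry.
        # The separator is an assumption; the abstract does not specify one.
        return f"{self.translation}{sep}{self.lg_entry}"


# Illustrative instances with made-up sentences and placeholder entry identifiers.
instances = [
    VerbInstance("Elle comprend la question.", "comprendre", "understand", "entry_A"),
    VerbInstance("Le prix comprend le transport.", "comprendre", "include", "entry_B"),
]

labels = [inst.fine_grained_label() for inst in instances]
```

Under these assumptions, two instances of the same verb receive distinct fine-grained labels whenever either the observed translation or the dictionary entry differs, which is what makes the third label finer than either of the first two alone.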
Problem

Research questions and friction points this paper is trying to address.

word sense disambiguation
semantically annotated corpus
polysemous verbs
cross-lingual senses
syntactic-semantic annotation
Innovation

Methods, ideas, or system contributions that make the work stand out.

word sense disambiguation
semantically annotated corpus
cross-lingual sense
Lexicon-Grammar
polysemous verbs