🤖 AI Summary
This study addresses the scarcity of fine-grained, cross-lingual annotated resources that combine syntactic and semantic information for word sense disambiguation of French polysemous verbs. To help fill this gap, the authors build a new sense-tagged corpus covering 20 high-frequency polysemous verbs. Each verb instance carries three labels: the verb's actual translation in the English side of a parallel corpus, the matching entry in the French Lexicon-Grammar dictionary, and a fine-grained sense label derived by combining the two. The resulting resource provides finer-grained annotated data for word sense disambiguation and a basis for multilingual lexical semantic representation.
📝 Abstract
We describe a new sense-tagged corpus for word sense disambiguation. The corpus consists of instances of 20 French polysemous verbs. Each verb instance is annotated with three sense labels: (1) the actual translation of the verb in the English version of this instance in a parallel corpus, (2) an entry of the verb in a computational dictionary of French (the Lexicon-Grammar tables), and (3) a fine-grained sense label resulting from the concatenation of the translation and the Lexicon-Grammar entry.
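The three-label scheme can be pictured as a small record type. The Python sketch below is illustrative only and is not the corpus's actual file format: the class name `VerbInstance`, its field names, the `|` separator, and the entry identifier `ENTRY_A` are all hypothetical placeholders. The only detail taken from the abstract is that the third label is the concatenation of the first two.

```python
from dataclasses import dataclass

# Assumed separator between the two parts of the fine-grained label;
# the abstract only says the label is a concatenation.
SEP = "|"


@dataclass(frozen=True)
class VerbInstance:
    """One annotated occurrence of a polysemous French verb (hypothetical schema)."""

    lemma: str        # the French verb, e.g. "jouer"
    sentence: str     # the French sentence containing the verb
    translation: str  # label 1: translation found in the English parallel text
    lg_entry: str     # label 2: identifier of a Lexicon-Grammar entry (placeholder)

    @property
    def fine_grained_sense(self) -> str:
        # Label 3: concatenation of the translation and the Lexicon-Grammar entry.
        return f"{self.translation}{SEP}{self.lg_entry}"


# "ENTRY_A" is a made-up identifier, not a real Lexicon-Grammar table entry.
example = VerbInstance(
    lemma="jouer",
    sentence="Elle joue du piano.",
    translation="play",
    lg_entry="ENTRY_A",
)
print(example.fine_grained_sense)
```

Deriving the third label from the other two, rather than storing it separately, keeps the three labels consistent by construction. Whether the corpus itself stores or derives this label is not stated in the abstract.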