A French Corpus Annotated for Multiword Expressions with Adverbial Function

📅 2026-06-03
📈 Citations: 0
Influential: 0

🤖 AI Summary
This study addresses the scarcity of corpora annotated specifically for adverbial multiword expressions (MWEs). Focusing on French, the work provides the first systematic typology of MWEs that function adverbially and builds an annotated corpus by combining linguistic rules with manual validation, drawing on existing language resources. The resulting dataset, released under the LGPLLR license, is the first publicly available resource dedicated to adverbial MWEs in French. It offers reusable data for downstream tasks such as information retrieval, information extraction, and syntactic parsing.
📝 Abstract
This paper presents a French corpus annotated for multiword expressions (MWEs) with adverbial function. The corpus is designed for research on information retrieval and extraction, as well as on deep and shallow syntactic parsing. We delimit the kinds of MWEs that we annotated, describe the resources and methods used for the annotation, and briefly comment on the results. The annotated corpus is available at http://infolingu.univ-mlv.fr/ under the LGPLLR license.
Problem

Research questions and friction points this paper is trying to address.

multiword expressions
adverbial function
French corpus
information retrieval
syntactic parsing
Innovation

Methods, ideas, or system contributions that make the work stand out.

multiword expressions
adverbial function
French corpus
syntactic parsing
annotation methodology