Historical and psycholinguistic perspectives on morphological productivity: A sketch of an integrative approach

📅 2025-05-17
🤖 AI Summary
This study investigates morphological productivity from two complementary perspectives, one cognitive-computational and one diachronic. Addressing the central question of how novel words are understood and produced, it follows two paths: (1) applying the Discriminative Lexicon Model (DLM) to Finnish nominal inflection, Malay derivation, and English compounding to trace differences in how productive inflectional and word-formation patterns are; and (2) comparing what Thomas Mann is likely to have read with what he wrote, to track how his production of novel derived words changes over time. The study reports that the DLM tends to associate affix-like sublexical units with the centroids of the embeddings of the words carrying a given affix, and that Mann is less likely to produce a novel derived word with a given suffix the greater the average distance of that suffix's derived-word embeddings from their centroid. It also finds an asymmetry between input and output: Mann's likely reading contains far more novel words than his own writing.

📝 Abstract
In this study, we approach morphological productivity from two perspectives: a cognitive-computational perspective, and a diachronic perspective zooming in on an actual speaker, Thomas Mann. For developing the first perspective, we make use of a cognitive computational model of the mental lexicon, the Discriminative Lexicon Model (DLM). For computational mappings between form and meaning to be productive, in the sense that novel, previously unencountered words can be understood and produced, there must be systematicities between the form space and the semantic space. If the relation between form and meaning were truly arbitrary, a model could memorize form-meaning pairings, but it would have no way to generalize to novel test data. For Finnish nominal inflection, Malay derivation, and English compounding, we use the DLM as a computational tool to trace differences in the degree to which inflectional and word formation patterns are productive. We show that the DLM tends to associate affix-like sublexical units with the centroids of the embeddings of the words with a given affix. For developing the second perspective, we study how the intake and output of one prolific writer, Thomas Mann, change over time. We show, by examining what Thomas Mann is likely to have read and what he wrote, that the rate at which Mann produces novel derived words is extremely low. There are far more novel words in his input than in his output. We show that Thomas Mann is less likely to produce a novel derived word with a given suffix the greater the average distance of the embeddings of all derived words with that suffix to the corresponding centroid, and discuss the challenges of using speaker-specific embeddings for low-frequency and novel words.
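The centroid-based measure mentioned in the abstract can be illustrated with a minimal sketch. It computes, for each suffix, the centroid of the embeddings of its derived words and the average distance of those embeddings to that centroid. The two-dimensional vectors, the suffix labels `suffix_tight` and `suffix_spread`, and the choice of Euclidean distance are all assumptions made for illustration; the abstract does not specify them.

```python
from math import dist
from statistics import mean


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(component) for component in zip(*vectors)]


def mean_distance_to_centroid(vectors):
    """Average Euclidean distance of the vectors to their own centroid.

    Euclidean distance is an assumption of this sketch.
    """
    center = centroid(vectors)
    return mean(dist(vector, center) for vector in vectors)


# Hypothetical toy embeddings of derived words, grouped by suffix.
# Real embeddings would have hundreds of dimensions.
embeddings_by_suffix = {
    "suffix_tight": [[1.0, 1.0], [1.0, 3.0], [3.0, 1.0], [3.0, 3.0]],
    "suffix_spread": [[0.0, 0.0], [0.0, 8.0], [6.0, 0.0], [6.0, 8.0]],
}

dispersion = {
    suffix: mean_distance_to_centroid(vectors)
    for suffix, vectors in embeddings_by_suffix.items()
}

for suffix, value in dispersion.items():
    print(f"{suffix}: mean distance to centroid = {value:.3f}")
```

Under the finding reported in the abstract, a suffix whose derived words lie further from their centroid on average (here `suffix_spread`) would be the less likely of the two to yield novel derived words in Mann's output.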
Problem

Research questions and friction points this paper is trying to address.

Explores morphological productivity using cognitive-computational and diachronic perspectives
Investigates systematicities between form and meaning for novel word processing
Analyzes Thomas Mann's writing for patterns in derived word usage
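
The systematicity requirement behind these questions can be illustrated with a toy experiment. The sketch below assumes a simple linear map from binary form cues (one stem, one affix) to three-dimensional meaning vectors, fitted by plain gradient descent on squared error. The cue coding, the vectors, and the training procedure are illustrative assumptions, not the paper's implementation. One word is held out, and the fit is repeated with compositional meanings and with hand-picked meanings that share no stem or affix component.

```python
from itertools import product
from math import dist

STEMS = ["s1", "s2", "s3", "s4"]
AFFIXES = ["a1", "a2"]
CUES = STEMS + AFFIXES
WORDS = list(product(STEMS, AFFIXES))
HELD_OUT = ("s4", "a2")  # never seen during training


def form_vector(word):
    """Binary cue vector: 1.0 for the word's stem and for its affix."""
    return [1.0 if cue in word else 0.0 for cue in CUES]


def predict(weights, form):
    """Map a form vector to a meaning vector through the weight matrix."""
    dim = len(weights[0])
    return [sum(f * row[k] for f, row in zip(form, weights)) for k in range(dim)]


def fit_linear_map(pairs, dim, epochs=2000, rate=0.1):
    """Fit a cues-by-dim weight matrix by batch gradient descent."""
    weights = [[0.0] * dim for _ in CUES]
    for _ in range(epochs):
        grad = [[0.0] * dim for _ in CUES]
        for form, meaning in pairs:
            pred = predict(weights, form)
            for i, f in enumerate(form):
                for k in range(dim):
                    grad[i][k] += f * (pred[k] - meaning[k])
        for i in range(len(CUES)):
            for k in range(dim):
                weights[i][k] -= rate * grad[i][k]
    return weights


def held_out_error(meanings):
    """Train on all words but one; return the distance to the held-out meaning."""
    train = [(form_vector(w), meanings[w]) for w in WORDS if w != HELD_OUT]
    weights = fit_linear_map(train, dim=3)
    return dist(predict(weights, form_vector(HELD_OUT)), meanings[HELD_OUT])


# Systematic case: each meaning is stem vector + affix vector (toy values).
STEM_SEM = {"s1": [1, 0, 0], "s2": [0, 1, 0], "s3": [0, 0, 1], "s4": [1, 1, 0]}
AFFIX_SEM = {"a1": [0.5, 0.5, 0.5], "a2": [-0.5, 0.0, 1.0]}
systematic = {
    (s, a): [p + q for p, q in zip(STEM_SEM[s], AFFIX_SEM[a])] for s, a in WORDS
}

# Arbitrary case: hand-picked vectors with no shared stem or affix component.
arbitrary = dict(zip(WORDS, [
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0],
    [0, 1, 1], [1, 0, 1], [1, 1, 1], [0, 0, 0],
]))

print("systematic held-out error:", round(held_out_error(systematic), 4))
print("arbitrary held-out error: ", round(held_out_error(arbitrary), 4))
```

In the systematic case the held-out error is close to zero, because the unseen stem-affix combination can be composed from parts learned elsewhere. In the arbitrary case the same model fits what it can of the training pairs but misses the held-out meaning by a wide margin.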
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a cognitive computational model of the mental lexicon, the Discriminative Lexicon Model
Explores Finnish nominal inflection, Malay derivation, and English compounding
Analyzes Thomas Mann's word usage over time