🤖 AI Summary
This paper investigates the origins of human-like preferences for the English dative alternation (direct-object vs. prepositional-object constructions) in language models (LMs): whether such preferences arise from direct syntactic exposure or from more general statistical cues (e.g., length and animacy biases). Using a controlled training paradigm, the authors systematically ablate local cues (length, animacy) and perturb global word-order statistics to isolate and quantify the independent contributions of direct syntactic evidence and indirect statistical regularities. Experiments iteratively train small-scale LMs on dependency-preserving, synthetically constructed datasets. Results show: (1) local cues significantly influence, but are not necessary for, alternation preferences; (2) even when these cues are fully removed, models retain a human-like tendency to front shorter and more animate arguments; (3) global length distributions alone suffice to induce near-human dative alternation preferences. These findings reveal a dual-source mechanism underlying syntactic emergence in LMs: direct syntactic learning and statistical induction each contribute robustly and independently.
📝 Abstract
Language models (LMs) tend to show human-like preferences on a number of syntactic phenomena, but the extent to which these are attributable to direct exposure to the phenomena or more general properties of language is unclear. We explore this with the English dative alternation (DO: "gave Y the X" vs. PO: "gave the X to Y"), using a controlled rearing paradigm wherein we iteratively train small LMs on systematically manipulated input. We focus on properties that affect the choice of alternant: length and animacy. Both properties are directly present in datives but also reflect more global tendencies for shorter elements to precede longer ones and animates to precede inanimates. First, by manipulating and ablating datives for these biases in the input, we show that direct evidence of length and animacy matters, but easy-first preferences persist even without such evidence. Then, using LMs trained on systematically perturbed datasets to manipulate global length effects (re-linearizing sentences globally while preserving dependency structure), we find that dative preferences can emerge from indirect evidence. We conclude that LMs' emergent syntactic preferences come from a mix of direct and indirect sources.
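The global perturbation described above (re-linearizing sentences while preserving dependency structure) can be sketched as a random projective re-linearization of a dependency tree: each head and its dependent subtrees are emitted in a random order, so every subtree stays contiguous and all head-dependent relations survive, while surface word order (and with it the global short-before-long statistics) is disrupted. This is a hypothetical illustration under assumed inputs, not the paper's actual pipeline; the `tree` format and `relinearize` function are invented for the example.

```python
import random

def relinearize(tree, node, rng):
    """Return a random projective linearization of the subtree rooted at `node`.

    `tree` maps each head word to a list of its dependents (assumed format).
    The head and each dependent subtree are shuffled as whole units, so
    subtrees remain contiguous and dependencies are preserved.
    """
    units = [[node]] + [relinearize(tree, dep, rng) for dep in tree.get(node, [])]
    rng.shuffle(units)  # perturb surface order of head + dependent subtrees
    return [word for unit in units for word in unit]

# Toy tree for "she gave the book to Mary": "gave" heads "she", "book", "to";
# "book" heads "the"; "to" heads "Mary".
tree = {"gave": ["she", "book", "to"], "book": ["the"], "to": ["Mary"]}
print(relinearize(tree, "gave", random.Random(0)))
```

Because only whole subtrees are reordered, "the" always stays adjacent to "book" and "Mary" to "to", but which argument surfaces first is now random rather than governed by length or animacy.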