Can Constructions "SCAN" Compositionality?

📅 2025-09-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Sequence-to-sequence models exhibit significant deficiencies in compositional and systematic generalization, particularly under out-of-distribution conditions. To address this, we propose an unsupervised pseudo-construction mining method that requires no architectural modifications or additional annotations; instead, it automatically extracts variable-slot templates from training data to explicitly model form-meaning pairings and strengthen structural recombination. Our approach integrates pseudo-constructions into the preprocessing pipeline of the SCAN dataset, improving data efficiency and generalization robustness. Experiments demonstrate large gains on the highly challenging ADD JUMP and AROUND RIGHT splits, with accuracies of 47.8% and 20.3%, respectively, and the method remains competitive with supervised baselines while using only about 40% of the training data. This work constitutes the first fully unsupervised, construction-level representation mining framework, establishing a novel paradigm for systematic generalization in low-resource settings.

📝 Abstract
Sequence-to-sequence models struggle at compositionality and systematic generalisation even while they excel at many other tasks. We attribute this limitation to their failure to internalise constructions: conventionalised form-meaning pairings that license productive recombination. Building on these insights, we introduce an unsupervised procedure for mining pseudo-constructions: variable-slot templates automatically extracted from training data. When applied to the SCAN dataset, our method yields large gains on out-of-distribution splits: accuracy rises to 47.8% on ADD JUMP and to 20.3% on AROUND RIGHT without any architectural changes or additional supervision. The model also attains competitive performance with roughly 40% of the original training data, demonstrating strong data efficiency. Our findings highlight the promise of construction-aware preprocessing as an alternative to heavy architectural or training-regime interventions.
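The listing does not reproduce the paper's mining algorithm. As a minimal illustrative sketch only (the function names, the toy lexicon, and the single-slot alignment strategy are all assumptions, not the authors' method), one way to mine variable-slot templates from SCAN-style command-action pairs is to align pairs whose commands differ in exactly one primitive token and abstract that token into a slot:

```python
# Illustrative sketch, NOT the paper's algorithm: mine one-slot
# pseudo-constructions from SCAN-style (command, action) pairs.
from itertools import combinations

# Assumed toy lexicon mapping command primitives to action symbols.
LEXICON = {"jump": "JUMP", "walk": "WALK", "run": "RUN", "look": "LOOK"}

def abstract(cmd, act, filler):
    """Replace the filler primitive (and its action symbol) with slot 'X'."""
    prim = LEXICON[filler]
    c = " ".join("X" if t == filler else t for t in cmd.split())
    a = " ".join("X" if t == prim else t for t in act.split())
    return c, a

def mine_templates(pairs):
    """Return variable-slot templates shared by pairs differing in one primitive."""
    templates = set()
    for (c1, a1), (c2, a2) in combinations(pairs, 2):
        t1, t2 = c1.split(), c2.split()
        if len(t1) != len(t2):
            continue
        diff = [i for i in range(len(t1)) if t1[i] != t2[i]]
        # Keep only pairs whose commands differ in exactly one lexicon primitive.
        if len(diff) == 1 and t1[diff[0]] in LEXICON and t2[diff[0]] in LEXICON:
            tpl1 = abstract(c1, a1, t1[diff[0]])
            tpl2 = abstract(c2, a2, t2[diff[0]])
            if tpl1 == tpl2:  # consistent abstraction -> pseudo-construction
                templates.add(tpl1)
    return templates

pairs = [
    ("jump twice", "JUMP JUMP"),
    ("walk twice", "WALK WALK"),
    ("run left", "TURN_LEFT RUN"),
    ("walk left", "TURN_LEFT WALK"),
]
print(sorted(mine_templates(pairs)))
# → [('X left', 'TURN_LEFT X'), ('X twice', 'X X')]
```

On this toy data the sketch recovers "X twice → X X" and "X left → TURN_LEFT X", i.e. form-meaning pairings with a variable slot, without any supervision beyond the pairs themselves.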
Problem

Research questions and friction points this paper is trying to address.

Sequence-to-sequence models fail at compositionality and systematic generalization
Models lack internalization of form-meaning pairings for productive recombination
Can unsupervised mining of pseudo-constructions improve out-of-distribution performance?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unsupervised mining of pseudo-constructions from training data
Variable-slot templates extracted automatically without supervision
Construction-aware preprocessing as alternative to architectural changes
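To make the preprocessing idea concrete, here is a hedged sketch (the `instantiate` function and toy data are assumptions for illustration, not the paper's pipeline): mined templates can be filled with each primitive in a lexicon to produce recombined training pairs, so that a rarely seen primitive appears in constructions it never occurred in:

```python
# Illustrative sketch, NOT the paper's pipeline: instantiate mined
# variable-slot templates with lexicon primitives to augment training data.

def instantiate(templates, lexicon):
    """Fill slot 'X' in each (command, action) template with every primitive."""
    augmented = []
    for cmd_tpl, act_tpl in templates:
        for word, prim in lexicon.items():
            augmented.append((cmd_tpl.replace("X", word),
                              act_tpl.replace("X", prim)))
    return augmented

templates = {("X twice", "X X")}
lexicon = {"jump": "JUMP", "look": "LOOK"}
print(sorted(instantiate(templates, lexicon)))
# → [('jump twice', 'JUMP JUMP'), ('look twice', 'LOOK LOOK')]
```

Because the augmentation touches only the data, not the model, this kind of step slots into any seq2seq training setup, which is consistent with the paper's claim of requiring no architectural changes.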