🤖 AI Summary
To address the scarcity of parallel corpora and frequent grammatical errors in low-resource neural machine translation (NMT), this paper proposes NSL-MT, a linguistics-informed negative sampling learning framework. During training, NSL-MT constructs a negative space by synthetically generating target-language sentences violating grammatical constraints and introduces a severity-weighted negative sample loss to explicitly penalize illegal syntactic structures. Crucially, NSL-MT requires no additional annotations or external linguistic tools and is plug-and-play compatible with standard NMT architectures. Experiments across multiple low-resource language pairs demonstrate that NSL-MT improves BLEU scores by 3–89% and enhances data efficiency fivefold: using only 1,000 parallel sentence pairs, it matches the performance of baseline models trained on 5,000 pairs. This yields substantially improved grammatical robustness and generalization under few-shot settings.
📝 Abstract
We introduce Negative Space Learning MT (NSL-MT), a training method that teaches models what not to generate by encoding linguistic constraints as severity-weighted penalties in the loss function. NSL-MT augments limited parallel data with synthetically generated violations of target-language grammar, explicitly penalizing the model when it assigns high probability to these linguistically invalid outputs. We demonstrate that NSL-MT delivers improvements across all architectures: 3-12% BLEU gains for well-performing models and 56-89% gains for models lacking decent initial performance. Furthermore, NSL-MT provides a 5x data efficiency multiplier -- training with 1,000 examples matches or exceeds normal training with 5,000 examples. NSL-MT thus offers a data-efficient alternative training method for settings where annotated parallel corpora are limited.
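To make the training objective concrete, the sketch below combines a standard negative log-likelihood term with severity-weighted margin penalties on synthetic grammar violations. The margin formulation, the function name `nsl_loss`, and the specific severity weights are illustrative assumptions for exposition, not the paper's exact formulation.

```python
def nsl_loss(logp_ref, neg_logps, severities, margin=1.0):
    """Illustrative severity-weighted negative-sample loss (assumed form).

    logp_ref:   model log-probability of the reference translation
    neg_logps:  log-probabilities the model assigns to synthetic
                grammar-violating target sentences
    severities: per-violation weight (more severe errors penalized harder)
    margin:     how far below the reference each violation must score
    """
    # Standard NMT term: negative log-likelihood of the gold target.
    nll = -logp_ref
    # Penalize each negative sample the model scores too close to
    # (or above) the reference, scaled by the violation's severity.
    penalty = sum(
        s * max(0.0, lp_neg - logp_ref + margin)
        for lp_neg, s in zip(neg_logps, severities)
    )
    return nll + penalty

# A model that ranks violations well below the reference pays no penalty:
print(nsl_loss(logp_ref=-2.0, neg_logps=[-10.0, -12.0], severities=[1.0, 0.5]))
# A model assigning high probability to an invalid output is penalized:
print(nsl_loss(logp_ref=-2.0, neg_logps=[-1.5, -12.0], severities=[1.0, 0.5]))
```

Because the penalty only activates when an invalid output scores within the margin of the reference, the objective reduces to ordinary likelihood training once the model reliably separates grammatical from ungrammatical outputs.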