Enhancing Entity Aware Machine Translation with Multi-task Learning

📅 2025-06-23

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Entity-Aware Machine Translation (EAMT) faces two key challenges: scarcity of entity-specific translation data and difficulty in modeling contextual dependencies for entities. To address these, this paper proposes an end-to-end multi-task learning framework that jointly optimizes Named Entity Recognition (NER) and Neural Machine Translation (NMT), using shared semantic representations so that entity boundary detection and translation are modeled together. Its main contributions are: (1) an entity-aware attention mechanism that explicitly aligns source-side entities with their target-side translations; and (2) an entity consistency loss that enforces cross-task representation alignment. Evaluated on the SemEval 2025 Task 2 benchmark, the method is reported to improve entity translation accuracy by +4.2 BLEU-E and overall translation quality by +2.8 BLEU, which the summary takes as evidence that joint modeling is effective and generalizable for EAMT.

📝 Abstract

Entity-aware machine translation (EAMT) is a difficult task in natural language processing, both because translation data covering the relevant entities is scarce and because translating those entities requires processing complex context. In this paper, we propose a method that applies multi-task learning to optimize two subtasks, named entity recognition and machine translation, which improves final performance on the EAMT task. Results and analysis are reported on the dataset provided by the organizers of Task 2 of the SemEval 2025 competition.

Problem

Research questions and friction points this paper is trying to address.

Addressing data shortage in entity-aware machine translation

Optimizing entity recognition and translation via multi-task learning

Improving EAMT performance on the SemEval 2025 Task 2 dataset

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-task learning for entity-aware translation

Combines named entity recognition and translation

Optimizes performance on the SemEval 2025 Task 2 dataset
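
The multi-task idea listed above can be illustrated with a minimal sketch, assuming the two subtasks are combined as a weighted sum of a translation loss and an NER tagging loss. The page does not state the paper's actual objective, so the function names (`cross_entropy`, `joint_loss`), the `ner_weight` parameter, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, gold):
    """Mean negative log-likelihood of the gold labels.

    probs: list of dicts mapping label -> predicted probability, one per position.
    gold:  list of gold labels, same length as probs.
    """
    if not gold or len(probs) != len(gold):
        raise ValueError("probs and gold must be non-empty and of equal length")
    eps = 1e-12  # guards against log(0) when a gold label got no probability mass
    total = sum(-math.log(max(p.get(g, 0.0), eps)) for p, g in zip(probs, gold))
    return total / len(gold)


def joint_loss(mt_probs, mt_gold, ner_probs, ner_gold, ner_weight=0.5):
    """Assumed multi-task objective: translation loss + ner_weight * NER loss.

    Returns (combined, translation_loss, ner_loss) so each term can be logged.
    """
    mt = cross_entropy(mt_probs, mt_gold)
    ner = cross_entropy(ner_probs, ner_gold)
    return mt + ner_weight * ner, mt, ner


# Toy example: two target tokens and two source-side NER tags.
mt_probs = [{"Paris": 0.5, "London": 0.5}, {"est": 0.25, "is": 0.75}]
mt_gold = ["Paris", "est"]
ner_probs = [{"B-LOC": 0.5, "O": 0.5}, {"O": 0.5, "B-LOC": 0.5}]
ner_gold = ["B-LOC", "O"]

combined, mt_term, ner_term = joint_loss(mt_probs, mt_gold, ner_probs, ner_gold)
print(combined, mt_term, ner_term)
```

In a real system both loss terms would come from heads on a shared encoder, so gradients from the NER term also shape the representations used for translation. The weight between the two terms is a tuning choice in this sketch, not a value reported on this page.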
