Report
Generalizing Back-Translation in Neural Machine Translation
Title: | Generalizing Back-Translation in Neural Machine Translation |
---|---|
Authors: | Graça, Miguel, Kim, Yunsu, Schamper, Julian, Khadivi, Shahram, Ney, Hermann |
Publication Year: | 2019 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computation and Language, Computer Science - Machine Learning |
Description: | Back-translation, data augmentation by translating target monolingual data, is a crucial component in modern neural machine translation (NMT). In this work, we reformulate back-translation in the scope of cross-entropy optimization of an NMT model, clarifying its underlying mathematical assumptions and approximations beyond its heuristic usage. Our formulation covers broader synthetic data generation schemes, including sampling from a target-to-source NMT model. With this formulation, we point out fundamental problems of the sampling-based approaches and propose to remedy them by (i) disabling label smoothing for the target-to-source model and (ii) sampling from a restricted search space. Our statements are investigated on the WMT 2018 German-English news translation task. Comment: 4th Conference on Machine Translation (WMT 2019) camera-ready |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/1906.07286 |
Accession Number: | edsarx.1906.07286 |
Database: | arXiv |
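The description names two remedies for sampling-based synthetic data: disabling label smoothing for the target-to-source model, and sampling from a restricted search space. The following is a minimal illustrative sketch of both ideas on a toy token distribution, not the authors' implementation. The function names are hypothetical, the smoothing variant (spreading the mass uniformly over the non-target tokens) is an assumption, and top-k truncation is only one possible way to restrict the sampling space; the paper may use a different scheme.

```python
import random


def smooth_labels(num_classes, target_index, epsilon):
    """Return a label-smoothed target distribution (assumed variant).

    The target token keeps 1 - epsilon of the mass; the remainder is
    spread uniformly over the other tokens. epsilon = 0.0 disables
    smoothing and yields a one-hot vector.
    """
    off_value = epsilon / (num_classes - 1)
    dist = [off_value] * num_classes
    dist[target_index] = 1.0 - epsilon
    return dist


def restrict_to_top_k(probs, k):
    """Keep only the k most probable tokens and renormalize their mass."""
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    restricted = [0.0] * len(probs)
    for i in top:
        restricted[i] = probs[i] / mass
    return restricted


def sample_token(probs, rng):
    """Draw one token index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical next-token distribution of a target-to-source model.
    probs = [0.5, 0.3, 0.15, 0.05]
    restricted = restrict_to_top_k(probs, k=2)
    draws = [sample_token(restricted, rng) for _ in range(10)]
    print(restricted, draws)
```

With smoothing enabled, every token in the vocabulary receives some training mass, so unrestricted sampling can occasionally pick very unlikely tokens; setting `epsilon` to zero and truncating the distribution before sampling are two ways to limit that effect in this toy setting.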