Sparsely Factored Neural Machine Translation

Bibliographic Details
Title: Sparsely Factored Neural Machine Translation
Authors: Casas, Noe; Fonollosa, Jose A. R.; Costa-jussà, Marta R.
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Description: The standard approach to incorporating linguistic information into neural machine translation systems consists of maintaining a separate vocabulary for each annotated feature to be incorporated (e.g. POS tags, dependency relation labels), embedding the features, and then aggregating them with each subword of the word they belong to. This approach, however, cannot easily accommodate annotation schemes that are not dense for every word. We propose a method suited to such cases, showing large improvements on out-of-domain data and comparable quality on in-domain data. Experiments are performed on morphologically rich languages such as Basque and German, in low-resource scenarios.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2102.08934
Accession Number: edsarx.2102.08934
Database: arXiv
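
The standard factored approach summarized in the description can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy vocabularies, the embedding size, the `make_table` and `embed_factored` helper names, and the choice of summation as the aggregation step. It shows only the dense baseline, in which every word must carry a value for every feature; it does not attempt to reproduce the authors' proposed method.

```python
import random

DIM = 4  # toy embedding size (assumption)


def make_table(vocab, dim, seed):
    """Build a toy embedding table: one random vector per vocabulary entry."""
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


# One separate vocabulary (and embedding table) per annotated feature.
subword_emb = make_table(["trans@@", "lation", "works"], DIM, seed=0)
pos_emb = make_table(["NOUN", "VERB"], DIM, seed=1)
dep_emb = make_table(["nsubj", "root"], DIM, seed=2)


def embed_factored(words):
    """Return one vector per subword.

    `words` is a list of (subwords, pos_tag, dep_label) tuples. The word-level
    feature embeddings are repeated for every subword of the word and
    aggregated here by summation (an illustrative choice).
    """
    vectors = []
    for subwords, pos, dep in words:
        for sw in subwords:
            vec = [
                s + p + d
                for s, p, d in zip(subword_emb[sw], pos_emb[pos], dep_emb[dep])
            ]
            vectors.append(vec)
    return vectors


# Each word needs a value for every feature: the annotation is dense.
sentence = [
    (["trans@@", "lation"], "NOUN", "nsubj"),
    (["works"], "VERB", "root"),
]
vectors = embed_factored(sentence)
```

In this sketch a word without, say, a dependency label would raise a `KeyError`, which is one way to see why annotation schemes that do not cover every word fit this setup poorly.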