Maximum Bayes Smatch Ensemble Distillation for AMR Parsing

التفاصيل البيبلوغرافية
العنوان: Maximum Bayes Smatch Ensemble Distillation for AMR Parsing
المؤلفون: Lee, Young-Suk, Astudillo, Ramon Fernandez, Hoang, Thanh Lam, Naseem, Tahira, Florian, Radu, Roukos, Salim
المصدر: NAACL-HLT 2022
سنة النشر: 2021
المجموعة: Computer Science
مصطلحات موضوعية: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
الوصف: AMR parsing has experienced an unprecendented increase in performance in the last three years, due to a mixture of effects including architecture improvements and transfer learning. Self-learning techniques have also played a role in pushing performance forward. However, for most recent high performant parsers, the effect of self-learning and silver data augmentation seems to be fading. In this paper we propose to overcome this diminishing returns of silver data by combining Smatch-based ensembling techniques with ensemble distillation. In an extensive experimental setup, we push single model English parser performance to a new state-of-the-art, 85.9 (AMR2.0) and 84.3 (AMR3.0), and return to substantial gains from silver data augmentation. We also attain a new state-of-the-art for cross-lingual AMR parsing for Chinese, German, Italian and Spanish. Finally we explore the impact of the proposed technique on domain adaptation, and show that it can produce gains rivaling those of human annotated data for QALD-9 and achieve a new state-of-the-art for BioAMR.
نوع الوثيقة: Working Paper
URL الوصول: http://arxiv.org/abs/2112.07790
رقم الأكسشن: edsarx.2112.07790
قاعدة البيانات: arXiv