Report
Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
| Field | Value |
|---|---|
| Title | Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning |
| Authors | Mihaylova, Tsvetomila; Niculae, Vlad; Martins, André F. T. |
| Publication Year | 2020 |
| Collection | Computer Science |
| Subject Terms | Computer Science - Computation and Language; Computer Science - Machine Learning |
| Description | Latent structure models are a powerful tool for modeling language data: they can mitigate the error propagation and annotation bottleneck in pipeline systems, while simultaneously uncovering linguistic insights about the data. One challenge with end-to-end training of these models is the argmax operation, which has null gradient. In this paper, we focus on surrogate gradients, a popular strategy to deal with this problem. We explore latent structure learning through the angle of pulling back the downstream learning objective. In this paradigm, we discover a principled motivation for both the straight-through estimator (STE) as well as the recently proposed SPIGOT, a variant of STE for structured models. Our perspective leads to new algorithms in the same family. We empirically compare the known and the novel pulled-back estimators against the popular alternatives, yielding new insight for practitioners and revealing intriguing failure cases. Comment: EMNLP 2020 |
| Document Type | Working Paper |
| Access URL | http://arxiv.org/abs/2010.02357 |
| Accession Number | edsarx.2010.02357 |
| Database | arXiv |
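The abstract's central point — that argmax has a null gradient almost everywhere, and the straight-through estimator (STE) sidesteps this by passing the downstream gradient through as if argmax were the identity — can be illustrated with a minimal NumPy sketch. This is an illustrative toy, not the paper's implementation; all function names and the squared-error downstream loss are assumptions chosen for clarity.

```python
import numpy as np

def one_hot_argmax(z):
    # Forward pass: discrete argmax as a one-hot vector.
    # Its true gradient w.r.t. z is zero almost everywhere.
    y = np.zeros_like(z)
    y[np.argmax(z)] = 1.0
    return y

def ste_backward(grad_output):
    # Straight-through estimator: treat argmax as the identity
    # in the backward pass, so the downstream gradient flows
    # through to the scores z unchanged.
    return grad_output

# Toy downstream objective: squared error against a target one-hot.
z = np.array([0.2, 1.5, -0.3])          # scores over 3 structures
target = np.array([0.0, 0.0, 1.0])
y = one_hot_argmax(z)                    # forward: [0, 1, 0]
loss_grad = 2.0 * (y - target)           # d(loss)/dy = [0, 2, -2]
grad_z = ste_backward(loss_grad)         # surrogate gradient w.r.t. z
```

A true gradient here would be identically zero, giving the score parameters no learning signal; the STE surrogate instead pushes down the score of the wrongly selected structure and up the score of the target, which is the behavior SPIGOT generalizes to structured (non-one-hot) outputs.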