RotRNN: Modelling Long Sequences with Rotations

Bibliographic Details
Title: RotRNN: Modelling Long Sequences with Rotations
Authors: Dolga, Rares; Biegun, Kai; Cunningham, Jake; Barber, David
Publication Year: 2024
Collection: Computer Science; Statistics
Subject Terms: Computer Science - Machine Learning, Statistics - Machine Learning
Description: Linear recurrent models, such as State Space Models (SSMs) and Linear Recurrent Units (LRUs), have recently shown state-of-the-art performance on long sequence modelling benchmarks. Despite their success, they come with a number of drawbacks, most notably their complex initialisation and normalisation schemes. In this work, we address some of these issues by proposing RotRNN -- a linear recurrent model which utilises the convenient properties of rotation matrices. We show that RotRNN provides a simple model with fewer theoretical assumptions than prior works, with a practical implementation that remains faithful to its theoretical derivation, achieving comparable scores to the LRU and SSMs on several long sequence modelling datasets.
Comment: Next Generation of Sequence Modeling Architectures Workshop at ICML 2024
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.07239
Accession Number: edsarx.2407.07239
Database: arXiv