System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models

Bibliographic Details
Title: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Authors: Jacobs, Sam Ade; Tanaka, Masahiro; Zhang, Chengming; Zhang, Minjia; Aminabadi, Reza Yazdani; Song, Shuaiwen Leon; Rajbhandari, Samyam; He, Yuxiong
Source: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1206-1208, May 2024
Relation: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Database: IEEE Xplore Digital Library
Description
ISBN:9798350364606
DOI:10.1109/IPDPSW63119.2024.00208