Report
SPA: Towards A Computational Friendly Cloud-Base and On-Devices Collaboration Seq2seq Personalized Generation
Title: | SPA: Towards A Computational Friendly Cloud-Base and On-Devices Collaboration Seq2seq Personalized Generation |
---|---|
Authors: | Liu, Yanming; Peng, Xinyue; Cao, Jiannan; Dai, Le; Liu, Xingzu; Nong, Ruilin; Liu, Weihao |
Publication Year: | 2024 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Computation and Language |
Description: | Large language models (LLMs) have shown outstanding ability on various tasks, including question answering. However, LLMs require substantial memory storage on low-resource devices. More critically, the computational speed on these devices is also severely limited. In this paper, we propose SPA (Side Plugin Adaption), a lightweight architecture for fast on-device inference under strict on-device computation and memory constraints. Compared with other on-device seq2seq generation methods, SPA performs fast and stable inference under low-resource constraints, making it cost-efficient. Our method establishes an interaction between a pretrained LLM on the cloud and additive parameters on the device, which provides knowledge from both the pretrained LLM and personalized features. Furthermore, SPA provides a framework that keeps feature-based parameters on low-compute devices while leaving the parameters containing general information on high-compute devices. Comment: 15 pages, second version of SPA (Side Plugin Adaption) |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/2403.07088 |
Accession Number: | edsarx.2403.07088 |
Database: | arXiv |