Efficient knowledge distillation of teacher model to multiple student models

Bibliographic Details
Title: Efficient knowledge distillation of teacher model to multiple student models
Authors: Satheesh Kumar Perepu, Vidya Ganesh, Thrivikram Gl, T V Sethuraman
Source: 2021 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology (IAICT).
Publication Information: IEEE, 2021.
Publication Year: 2021
Subject Terms: Edge device, Computer science, Deep learning, Knowledge engineering, Machine learning, Task (project management), Memory management, Artificial intelligence, Function (engineering), Set (psychology), Knowledge transfer
Description: Deep learning models are proven to deliver satisfactory results when trained on a complex non-linear relationship between a set of input features and different task outputs. However, they are memory intensive and require substantial computational power for both training and inference. The literature offers various model compression techniques that enable easy deployment on edge devices. Knowledge distillation is one such approach, in which the knowledge of a complex teacher model is transferred to a student model with fewer parameters. Its limitation is that the architecture of the student model must be comparable to that of the complex teacher for effective knowledge transfer; as a result, a student model that learns from a huge, complex teacher still cannot be deployed on edge devices. In this work, we propose a combined student approach in which different student models learn from a common teacher model. Further, we propose a unique loss function that trains multiple student models simultaneously. An advantage of this approach is that the student models can be far simpler than both a traditional single student model and the complex teacher model. Finally, we provide an extensive evaluation showing that our approach improves overall accuracy significantly and allows a further 10% compression compared with the generic model.
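For illustration only, the following is a minimal PyTorch sketch of the generic multi-student distillation idea summarized above: one frozen teacher produces soft targets, several small students each incur a distillation-plus-hard-label loss, and the losses are summed so a single backward pass trains all students simultaneously. The paper's actual loss function, architectures, and hyperparameters are not given in this record, so the loss formulation (standard softened-logit KD in the style of Hinton et al.), the stand-in models, and the values of T and alpha are all assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

def combined_distillation_loss(teacher_logits, student_logits_list, labels,
                               T=4.0, alpha=0.5):
    # Softened teacher distribution shared by every student (assumed form,
    # not the paper's exact loss).
    soft_targets = F.softmax(teacher_logits / T, dim=1)
    total = torch.zeros((), device=teacher_logits.device)
    for s_logits in student_logits_list:
        # KL between softened student and teacher, scaled by T^2 as in
        # standard knowledge distillation.
        kd = F.kl_div(F.log_softmax(s_logits / T, dim=1), soft_targets,
                      reduction="batchmean") * (T * T)
        # Ordinary hard-label cross-entropy for the same student.
        ce = F.cross_entropy(s_logits, labels)
        total = total + alpha * kd + (1.0 - alpha) * ce
    return total

# Toy usage: a frozen teacher feeds three small students; one backward
# pass through the summed loss updates all students at once.
teacher = nn.Linear(32, 10)                        # stand-in teacher
students = [nn.Linear(32, 10) for _ in range(3)]   # stand-in small students
optimizer = torch.optim.Adam([p for s in students for p in s.parameters()])

x = torch.randn(8, 32)
y = torch.randint(0, 10, (8,))
with torch.no_grad():
    teacher_logits = teacher(x)
loss = combined_distillation_loss(teacher_logits, [s(x) for s in students], y)
optimizer.zero_grad()
loss.backward()
optimizer.step()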
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::8a9a38896e268646abdd86b33398209d
https://doi.org/10.1109/iaict52856.2021.9532543
Rights: CLOSED
Accession Number: edsair.doi...........8a9a38896e268646abdd86b33398209d
Database: OpenAIRE