Recurrent knowledge distillation

Bibliographic details
Title: Recurrent knowledge distillation
Authors: Pintea, Silvia L., Liu, Yue, van Gemert, Jan C.
Source: 2018 25th IEEE International Conference on Image Processing (ICIP), pp. 3393-3397
Subject terms: Knowledge distillation, compacting deep representations for image classification, recurrent layers
Abstract: Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers in the teacher network into a single recurrent student layer. We propose three variants of adding recurrent connections into the student network, and show experimentally on CIFAR-10, Scenes and MiniPlaces that we can reduce the number of parameters at little loss in accuracy.
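The core idea in the abstract, replacing a stack of distinct residual layers with one recurrent (weight-shared) layer applied repeatedly, can be illustrated with a minimal numpy sketch. This is not the authors' architecture (the paper uses convolutional networks and proposes three recurrent variants); the toy dimensions, the `residual_block` helper, and the linear-layer form are illustrative assumptions, used only to show the parameter reduction from weight sharing.

```python
import numpy as np

def residual_block(x, W):
    # One residual step: x + ReLU(W x). The paper's blocks are
    # convolutional; a linear map keeps the sketch minimal.
    return x + np.maximum(W @ x, 0.0)

rng = np.random.default_rng(0)
d, depth = 4, 3  # toy feature size and residual depth (assumptions)

# Teacher: a stack of residual blocks, each with its OWN weights.
teacher_weights = [rng.standard_normal((d, d)) * 0.1 for _ in range(depth)]

def teacher(x):
    for W in teacher_weights:
        x = residual_block(x, W)
    return x

# Student: the same depth unrolled, but ONE shared weight matrix,
# i.e. the residual stack recast as a single recurrent layer.
W_shared = rng.standard_normal((d, d)) * 0.1

def student(x, steps=depth):
    for _ in range(steps):
        x = residual_block(x, W_shared)
    return x

teacher_params = sum(W.size for W in teacher_weights)  # depth * d * d
student_params = W_shared.size                         # d * d
print(teacher_params, student_params)
```

In knowledge distillation the student would then be trained to match the teacher's outputs; the sketch only shows why weight sharing shrinks the parameter count by a factor of the recast depth.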
File description: print
Access URL: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-242213
https://2018.ieeeicip.org/
Database: SwePub
ISSN: 1522-4880
DOI: 10.1109/ICIP.2018.8451253