Report
FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks
Title: | FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks |
---|---|
Authors: | Blott, Michaela; Preusser, Thomas; Fraser, Nicholas; Gambardella, Giulio; O'Brien, Kenneth; Umuroglu, Yaman |
Publication Year: | 2018 |
Collection: | Computer Science |
Subject Terms: | Computer Science - Hardware Architecture |
Description: | Convolutional Neural Networks have rapidly become the most successful machine learning algorithm, enabling ubiquitous machine vision and intelligent decisions even on embedded computing systems. While the underlying arithmetic is structurally simple, compute and memory requirements are challenging. One of the promising opportunities is leveraging reduced-precision representations for inputs, activations and model parameters. The resulting scalability in performance, power efficiency and storage footprint provides interesting design compromises in exchange for a small reduction in accuracy. FPGAs are ideal for exploiting low-precision inference engines, leveraging custom precisions to achieve the required numerical accuracy for a given application. In this article, we describe the second generation of the FINN framework, an end-to-end tool which enables design space exploration and automates the creation of fully customized inference engines on FPGAs. Given a neural network description, the tool optimizes for given platforms, design targets and a specific precision. We introduce formalizations of resource cost functions and performance predictions, and elaborate on the optimization algorithms. Finally, we evaluate a selection of reduced-precision neural networks ranging from CIFAR-10 classifiers to YOLO-based object detection on a range of platforms including PYNQ and AWS F1, demonstrating unprecedented measured throughput at 50 TOp/s on AWS F1 and 5 TOp/s on embedded devices. Comment: to be published in ACM TRETS Special Edition on Deep Learning |
Document Type: | Working Paper |
Access URL: | http://arxiv.org/abs/1809.04570 |
Accession Number: | edsarx.1809.04570 |
Database: | arXiv |