Exploring Thread Coarsening on FPGA

Bibliographic Details
Title: Exploring Thread Coarsening on FPGA
Authors: Zarch, Mostafa Eghbali; Neff, Reece; Becchi, Michela
Publication Year: 2022
Collection: Computer Science
Subject Terms: Computer Science - Distributed, Parallel, and Cluster Computing
Description: Over the past few years, there has been an increased interest in including FPGAs in data centers and high-performance computing clusters along with GPUs and other accelerators. As a result, it has become increasingly important to have a unified, high-level programming interface for CPUs, GPUs and FPGAs. This has led to the development of compiler toolchains to deploy OpenCL code on FPGA. However, the fundamental architectural differences between GPUs and FPGAs have led to performance portability issues: it has been shown that OpenCL code optimized for GPU does not necessarily map well to FPGA, often requiring manual optimizations to improve performance. In this paper, we explore the use of thread coarsening - a compiler technique that consolidates the work of multiple threads into a single thread - on OpenCL code running on FPGA. While this optimization has been explored on CPU and GPU, the architectural features of FPGAs and the nature of the parallelism they offer lead to different performance considerations, making an analysis of thread coarsening on FPGA worthwhile. Our evaluation, performed on our microbenchmarks and on a set of applications from open-source benchmark suites, shows that thread coarsening can yield performance benefits (up to 3-4x speedups) to OpenCL code running on FPGA at a limited resource utilization cost.
Document Type: Working Paper
DOI: 10.1109/HIPC53243.2021.00062
Access URL: http://arxiv.org/abs/2208.11890
Accession Number: edsarx.2208.11890
Database: arXiv
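
Note on the technique named in the description: thread coarsening merges the work of several threads (OpenCL work-items) into one. The following is a minimal sketch of that idea, written as a plain Python simulation rather than OpenCL, and it is not taken from the paper. The `launch` helper, the vector-add kernels, and the `factor` parameter are all hypothetical names introduced for illustration, and the contiguous assignment of elements to each coarsened work-item is just one possible mapping.

```python
# Illustrative simulation only: plain Python standing in for an OpenCL
# NDRange launch. All names here are hypothetical, not from the paper.

def launch(kernel, global_size, *args):
    """Run `kernel` once per work-item id (sequentially, for illustration)."""
    for gid in range(global_size):
        kernel(gid, *args)


def vec_add(gid, a, b, out):
    """Baseline kernel: each work-item produces one output element."""
    out[gid] = a[gid] + b[gid]


def vec_add_coarsened(gid, a, b, out, factor):
    """Coarsened kernel: one work-item does the work of `factor` baseline ones."""
    base = gid * factor  # first element owned by this work-item
    for k in range(factor):
        out[base + k] = a[base + k] + b[base + k]


def run_baseline(a, b):
    out = [0] * len(a)
    launch(vec_add, len(a), a, b, out)  # one work-item per element
    return out


def run_coarsened(a, b, factor):
    if len(a) % factor != 0:
        raise ValueError("this sketch assumes len(a) is a multiple of factor")
    out = [0] * len(a)
    # The launch shrinks by `factor`; each work-item covers `factor` elements.
    launch(vec_add_coarsened, len(a) // factor, a, b, out, factor)
    return out
```

Both variants compute the same result; only the number of work-items and the amount of work per work-item change. That trade-off is what the description refers to when it weighs performance benefits against resource utilization cost on FPGA.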