ComScribe: Identifying Intra-node GPU Communication

Bibliographic Details
Title: ComScribe: Identifying Intra-node GPU Communication
Authors: Didem Unat, Fareed Qararyah, Palwisha Akhtar, Erhan Tezcan
Source: Benchmarking, Measuring, and Optimizing ISBN: 9783030710576
Bench
Publication Information: Springer International Publishing, 2021.
Publication Year: 2021
Subject Terms: Profiling (computer programming), Artificial neural network, Computer architecture, Shared memory, Computer science, Node (networking), Scalability, Benchmark (computing), Volume (computing), Programmer
Description: GPU communication plays a critical role in the performance and scalability of multi-GPU accelerated applications. With the ever-increasing methods and types of communication, it is often hard for the programmer to know the exact amount and type of communication taking place in an application. Though there are prior works that detect communication in distributed systems for MPI and multi-threaded applications on shared memory systems, to our knowledge, none of these works identify intra-node GPU communication. We propose a tool, ComScribe, that identifies and categorizes types of communication among all GPU-GPU and CPU-GPU pairs in a node. Built on top of NVIDIA’s profiler nvprof, ComScribe visualizes data movement as a communication matrix or bar chart for explicit communication primitives, Unified Memory operations, and Zero-copy Memory transfers. To validate our tool on 16 GPUs, we present communication patterns of 8 micro- and 3 macro-benchmarks from the NVIDIA, Comm|Scope, and MGBench benchmark suites. To demonstrate the tool’s capabilities in real-life applications, we also present communication matrices of two deep neural network models. All in all, ComScribe can guide the programmer in identifying which groups of GPUs communicate, in what volume, and by using which primitives. This offers avenues to detect performance bottlenecks and, more importantly, communication bugs in an application.
ISBN: 978-3-030-71057-6
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::a06611d9afba6357389a6e49c82b10a1
https://doi.org/10.1007/978-3-030-71058-3_10
Rights: CLOSED
Accession Number: edsair.doi...........a06611d9afba6357389a6e49c82b10a1
Database: OpenAIRE
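
Illustrative note: the description mentions visualizing data movement as a communication matrix, categorized by type of communication. The sketch below shows one way such a matrix could be accumulated from transfer records. It is not ComScribe's implementation and does not call nvprof. The record layout `(src, dst, nbytes, kind)`, the function name `build_comm_matrices`, the category labels, and the convention that index 0 stands for the CPU are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumption: device index 0 represents the CPU (host); GPUs are 1..num_gpus.
HOST = 0


def build_comm_matrices(transfers, num_gpus):
    """Accumulate bytes moved per (source, destination) pair, per category.

    `transfers` is a hypothetical list of (src, dst, nbytes, kind) tuples,
    where `kind` is a free-form category label such as "explicit".
    Returns a dict mapping each category to a square matrix (list of lists).
    """
    size = num_gpus + 1  # one extra row/column for the host
    matrices = defaultdict(lambda: [[0] * size for _ in range(size)])
    for src, dst, nbytes, kind in transfers:
        matrices[kind][src][dst] += nbytes
    return dict(matrices)


# Made-up records for a node with two GPUs.
transfers = [
    (HOST, 1, 1024, "explicit"),      # host to GPU 1
    (1, 2, 4096, "explicit"),         # GPU 1 to GPU 2
    (1, 2, 4096, "explicit"),         # same pair again, volumes add up
    (2, HOST, 512, "unified_memory"), # GPU 2 to host
]

matrices = build_comm_matrices(transfers, num_gpus=2)
for kind, matrix in matrices.items():
    print(kind)
    for row in matrix:
        print("  ", row)
```

Keeping one matrix per category mirrors the distinction the description draws between explicit primitives, Unified Memory operations, and Zero-copy Memory transfers; a single combined matrix would hide which primitive produced a given volume.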