Academic Journal

A Performance-Stable NUMA Management Scheme for Linux-Based HPC Systems

Bibliographic Details
Title: A Performance-Stable NUMA Management Scheme for Linux-Based HPC Systems
Authors: Jaehyun Song, Minwoo Ahn, Gyusun Lee, Euiseong Seo, Jinkyu Jeong
Source: IEEE Access, Vol 9, Pp 52987-53002 (2021)
Publication Info: IEEE, 2021.
Publication Year: 2021
Collection: LCC:Electrical engineering. Electronics. Nuclear engineering
Subject Terms: High-performance computing, Linux, non-uniform memory access, OS noise, performance stability, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Description: Linux is becoming the de facto standard operating system for today’s high-performance computing (HPC) systems because it can satisfy the demands of many HPC systems for rich operating system (OS) features. However, owing to features intended for a general-purpose OS, Linux has many sources of OS noise, such as page faults and thread migrations, that can result in unstable performance of HPC applications. Furthermore, in the non-uniform memory access (NUMA) architecture, which has different access latencies to local and remote memory nodes, OS noise can further degrade the performance stability of applications. In this paper, we address the OS noise caused by Linux in the NUMA architecture and propose a novel performance-stable NUMA management scheme called Stable-NUMA. Stable-NUMA comprises three techniques for improving performance stability: two-level thread clustering, state-based page placement, and selective page profiling. Compared to Linux, the proposed Stable-NUMA scheme significantly alleviates OS noise and increases the local memory access ratio of the NUMA system. We implemented Stable-NUMA in Linux and experimented with various HPC workloads. The evaluation results demonstrate that Stable-NUMA outperforms Linux, with and without its NUMA-aware feature, by up to 25% in terms of average performance and 73% in terms of performance stability.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2169-3536
Relation: https://ieeexplore.ieee.org/document/9391657/; https://doaj.org/toc/2169-3536
DOI: 10.1109/ACCESS.2021.3069991
Access URL: https://doaj.org/article/a2c18b166d004cf681c3219129524e09
Accession Number: edsdoj.2c18b166d004cf681c3219129524e09
Database: Directory of Open Access Journals