Exploring the Frontiers of Energy Efficiency using Power Management at System Scale

Bibliographic Details
Title: Exploring the Frontiers of Energy Efficiency using Power Management at System Scale
Authors: Karimi, Ahmad Maroof; Maiterth, Matthias; Shin, Woong; Sattar, Naw Safrin; Lu, Hao; Wang, Feiyi
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Distributed, Parallel, and Cluster Computing
Description: In the face of surging power demands for exascale HPC systems, this work tackles the critical challenge of understanding the impact of software-driven power management techniques such as Dynamic Voltage and Frequency Scaling (DVFS) and power capping, which have been actively developed over the past few decades. Drawing on GPU benchmarking to understand application power profiles, we present a telemetry data-driven approach for deriving energy savings projections, and we apply it at scale to the Frontier supercomputer. Our findings, based on three months of telemetry data, indicate that for certain resource-constrained jobs, significant energy savings (up to 8.5%) can be achieved without compromising performance. This translates to a substantial cost reduction, equivalent to 1438 MWh of energy saved. The key contribution of this work lies in the methodology for establishing an upper limit for these best-case scenarios and in its successful application. This work sheds light on potential energy savings and helps HPC professionals optimize the power-performance trade-off within constrained power budgets, in the exascale era and beyond.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2408.01552
Accession Number: edsarx.2408.01552
Database: arXiv