Academic Journal

Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations.

Bibliographic Details
Title: Enhancing data locality of the conjugate gradient method for high-order matrix-free finite-element implementations.
Authors: Kronbichler, Martin; Sashko, Dmytro; Munch, Peter
Source: International Journal of High Performance Computing Applications; Mar 2023, Vol. 37 Issue 2, p61-81, 21p
Subject Terms: CONJUGATE gradient methods, DEGREES of freedom, CACHE memory, INTERIOR-point methods
Abstract: This work investigates a variant of the conjugate gradient (CG) method and embeds it into the context of high-order finite-element schemes with fast matrix-free operator evaluation and cheap preconditioners like the matrix diagonal. Relying on a data-dependency analysis and appropriate enumeration of degrees of freedom, we interleave the vector updates and inner products in a CG iteration with the matrix-vector product with only minor organizational overhead. As a result, around 90% of the vector entries of the three active vectors of the CG method are transferred from slow RAM memory exactly once per iteration, with all additional access hitting fast cache memory. Node-level performance analyses and scaling studies on up to 147k cores show that the CG method with the proposed performance optimizations is around two times faster than a standard CG solver as well as optimized pipelined CG and s-step CG methods for large sizes that exceed processor caches, and provides similar performance near the strong scaling limit. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of High Performance Computing Applications is the property of Sage Publications, Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
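
Illustrative sketch (not part of the record): the abstract describes a CG iteration with a matrix-free operator, a diagonal preconditioner, and vector updates merged with inner products. The Python below is a minimal sketch of that general idea under stated assumptions, not the authors' implementation. It uses a 1D Laplacian stencil as a stand-in operator, and it only fuses the vector updates and inner products into a single sweep; the paper's interleaving of those updates with the matrix-vector product itself is not reproduced. The names `apply_laplacian` and `jacobi_pcg` are hypothetical.

```python
def apply_laplacian(x):
    """Matrix-free action of the 1D stencil [-1, 2, -1] with zero boundary values.

    Assumption: a toy stand-in for a high-order finite-element operator.
    """
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y[i] = 2.0 * x[i] - left - right
    return y


def jacobi_pcg(apply_op, diag, b, tol=1e-10, max_iter=1000):
    """CG preconditioned with the matrix diagonal; returns (solution, iterations)."""
    n = len(b)
    x = [0.0] * n
    b_norm = sum(v * v for v in b) ** 0.5
    if b_norm == 0.0:
        return x, 0
    r = list(b)                                  # residual for the zero initial guess
    p = [r[i] / diag[i] for i in range(n)]       # first search direction z = D^{-1} r
    rz = sum(r[i] * p[i] for i in range(n))
    for it in range(1, max_iter + 1):
        q = apply_op(p)                          # the only operator evaluation per iteration
        alpha = rz / sum(p[i] * q[i] for i in range(n))
        # Fused sweep: update x and r and accumulate both inner products
        # while each entry is still "hot", instead of four separate passes.
        rz_new = 0.0
        rr = 0.0
        for i in range(n):
            x[i] += alpha * p[i]
            r[i] -= alpha * q[i]
            rz_new += r[i] * r[i] / diag[i]
            rr += r[i] * r[i]
        if rr ** 0.5 <= tol * b_norm:
            return x, it
        beta = rz_new / rz
        for i in range(n):
            p[i] = r[i] / diag[i] + beta * p[i]  # z is recomputed on the fly, never stored
        rz = rz_new
    return x, max_iter


# Usage: recover a known vector from its image under the operator.
n = 50
x_true = [float(i % 3) for i in range(n)]
b = apply_laplacian(x_true)
x, iters = jacobi_pcg(apply_laplacian, [2.0] * n, b)
```

In this sketch the preconditioned residual is never stored as a separate vector, so only `x`, `r`, and `p` persist across iterations, which loosely mirrors the "three active vectors" mentioned in the abstract. In pure Python the fusion brings no measurable cache benefit; it only shows the loop structure.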