Academic Journal

Error-Bounded Learned Scientific Data Compression with Preservation of Derived Quantities

Bibliographic Details
Title: Error-Bounded Learned Scientific Data Compression with Preservation of Derived Quantities
Authors: Jaemoon Lee, Qian Gong, Jong Choi, Tania Banerjee, Scott Klasky, Sanjay Ranka, Anand Rangarajan
Source: Applied Sciences, Vol 12, Iss 13, p 6718 (2022)
Publication Information: MDPI AG, 2022.
Publication Year: 2022
Collection: LCC:Technology
LCC:Engineering (General). Civil engineering (General)
LCC:Biology (General)
LCC:Physics
LCC:Chemistry
Subject Terms: data compression, autoencoders, error guarantees, moment preservation, constraint satisfaction, quantization, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Description: Scientific applications continue to grow and produce extremely large amounts of data, which require efficient compression algorithms for long-term storage. Compression errors in scientific applications can have a deleterious impact on downstream processing. Thus, it is crucial to preserve all the “known” Quantities of Interest (QoI) during compression. To address this issue, most existing approaches guarantee the reconstruction error of the original data or primary data (PD), but cannot directly control the problem of preserving the QoI. In this work, we propose a physics-informed compression technique that is composed of two parts: (i) reduction of the PD with bounded errors and (ii) preservation of the QoI. In the first step, we combine tensor decompositions, autoencoders, product quantizers, and error-bounded lossy compressors to bound the reconstruction error at high levels of compression. In the second step, we use constraint satisfaction post-processing followed by quantization to preserve the QoI. To illustrate the challenges of reducing the reconstruction errors of the PD and QoI, we focus on simulation data generated by a large-scale fusion code, XGC, which can produce tens of petabytes in a single day. The results show that our approach can achieve a high compression amount while accurately preserving the QoI within scientifically acceptable bounds.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2076-3417
Relation: https://www.mdpi.com/2076-3417/12/13/6718; https://doaj.org/toc/2076-3417
DOI: 10.3390/app12136718
Access URL: https://doaj.org/article/0d896854ba8e4812a9c2b54fdacca252
Accession Number: edsdoj.0d896854ba8e4812a9c2b54fdacca252
Database: Directory of Open Access Journals