LEA: A Learned Encoding Advisor for Column Stores

Bibliographic Details
Title: LEA: A Learned Encoding Advisor for Column Stores
Authors: Cen, Lujing; Kipf, Andreas; Marcus, Ryan; Kraska, Tim
Publication Year: 2021
Collection: Computer Science
Subject Terms: Computer Science - Databases
Description: Data warehouses organize data in a columnar format to enable faster scans and better compression. Modern systems offer a variety of column encodings that can reduce storage footprint and improve query performance. Selecting a good encoding scheme for a particular column is an optimization problem that depends on the data, the query workload, and the underlying hardware. We introduce Learned Encoding Advisor (LEA), a learned approach to column encoding selection. LEA is trained on synthetic datasets with various distributions on the target system. Once trained, LEA uses sample data and statistics (such as cardinality) from the user's database to predict the optimal column encodings. LEA can optimize for encoded size, query performance, or a combination of the two. Compared to the heuristic-based encoding advisor of a commercial column store on TPC-H, LEA achieves 19% lower query latency while using 26% less space.
Document Type: Working Paper
DOI: 10.1145/3464509.3464885
Access URL: http://arxiv.org/abs/2105.08830
Accession Number: edsarx.2105.08830
Database: arXiv
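
Illustrative sketch: The abstract states that LEA can optimize for encoded size, query performance, or a combination of the two. The snippet below is a minimal sketch of what such a combined objective could look like. It is not the authors' method: the function name `choose_encoding`, the `size_weight` parameter, the min-normalized weighted sum, and the numbers in `example_predictions` are all assumptions made up for illustration. In LEA itself the per-encoding estimates would come from the trained model; here they are simply given as input.

```python
from typing import Dict, Tuple

# Hypothetical per-encoding predictions for a single column:
# encoding name -> (predicted encoded size in bytes, predicted scan latency in ms).
Predictions = Dict[str, Tuple[float, float]]


def choose_encoding(preds: Predictions, size_weight: float) -> str:
    """Pick the encoding with the lowest weighted cost.

    size_weight = 1.0 considers size only, 0.0 considers latency only,
    and values in between trade the two off. Each metric is divided by
    its minimum across candidates so that bytes and milliseconds are
    on a comparable scale (an assumed normalization, not from the paper).
    """
    if not preds:
        raise ValueError("no candidate encodings")
    if not 0.0 <= size_weight <= 1.0:
        raise ValueError("size_weight must be in [0, 1]")

    min_size = min(size for size, _ in preds.values())
    min_latency = min(latency for _, latency in preds.values())
    if min_size <= 0 or min_latency <= 0:
        raise ValueError("predictions must be positive")

    def cost(name: str) -> float:
        size, latency = preds[name]
        return (size_weight * size / min_size
                + (1.0 - size_weight) * latency / min_latency)

    return min(preds, key=cost)


# Made-up numbers, for illustration only.
example_predictions: Predictions = {
    "raw": (4000.0, 2.0),
    "dictionary": (1200.0, 2.5),
    "rle": (900.0, 6.0),
}

for weight in (0.0, 0.5, 1.0):
    print(weight, choose_encoding(example_predictions, weight))
```

With these made-up inputs, the choice shifts from the fastest candidate to the smallest one as `size_weight` moves from 0 to 1, which is the kind of trade-off the abstract describes.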