The Data Calculator

Bibliographic Details
Title: The Data Calculator
Authors: Stratos Idreos, Demi Guo, Michael S. Kester, Kostas Zoumpatianos, Brian Hentschel
Source: SIGMOD Conference
Publisher Information: ACM, 2018.
Publication Year: 2018
Subject Terms: Computer science, Computation, 020207 software engineering, Workload, 02 engineering and technology, Data structure, law.invention, Set (abstract data type), Data access, Computer engineering, Calculator, Software deployment, law, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering
Description: Data structures are critical in any data-driven scenario, but they are notoriously hard to design due to a massive design space and the dependence of performance on workload and hardware which evolve continuously. We present a design engine, the Data Calculator, which enables interactive and semi-automated design of data structures. It brings two innovations. First, it offers a set of fine-grained design primitives that capture the first principles of data layout design: how data structure nodes lay data out, and how they are positioned relative to each other. This allows for a structured description of the universe of possible data structure designs that can be synthesized as combinations of those primitives. The second innovation is computation of performance using learned cost models. These models are trained on diverse hardware and data profiles and capture the cost properties of fundamental data access primitives (e.g., random access). With these models, we synthesize the performance cost of complex operations on arbitrary data structure designs without having to: 1) implement the data structure, 2) run the workload, or even 3) access the target hardware. We demonstrate that the Data Calculator can assist data structure designers and researchers by accurately answering rich what-if design questions on the order of a few seconds or minutes, i.e., computing how the performance (response time) of a given data structure design is impacted by variations in the: 1) design, 2) hardware, 3) data, and 4) query workloads. This makes it effortless to test numerous designs and ideas before embarking on lengthy implementation, deployment, and hardware acquisition steps. We also demonstrate that the Data Calculator can synthesize entirely new designs, auto-complete partial designs, and detect suboptimal design choices.
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::ff39f8040dc43c35ee72e80752ecccd9
https://doi.org/10.1145/3183713.3199671
Rights: OPEN
Accession Number: edsair.doi...........ff39f8040dc43c35ee72e80752ecccd9
Database: OpenAIRE
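
Illustrative note: the description above says that the cost of a complex operation is synthesized from cost models of fundamental data access primitives. The following is a minimal sketch of that idea, not the Data Calculator's actual implementation. Every name in it (`TreeDesign`, `point_lookup_cost`, `internal_levels`, the three primitive functions) is hypothetical, the constants are placeholder values rather than learned models or measurements, and a fanout of at least 2 is assumed.

```python
import math
from dataclasses import dataclass

# Placeholder per-primitive costs in nanoseconds. These are illustrative
# assumptions, NOT learned models or measurements from the paper.
RANDOM_ACCESS_NS = 100.0   # one jump to a node
SCAN_NS_PER_KEY = 1.0      # sequential scan, per key visited
SEARCH_NS_PER_STEP = 5.0   # binary search, per halving step


def random_access() -> float:
    """Cost of one random access (hypothetical constant model)."""
    return RANDOM_ACCESS_NS


def sequential_scan(n_keys: int) -> float:
    """Cost of scanning n_keys contiguous keys (hypothetical linear model)."""
    return SCAN_NS_PER_KEY * n_keys


def sorted_search(n_keys: int) -> float:
    """Cost of binary search over n_keys sorted keys (hypothetical log model)."""
    return SEARCH_NS_PER_STEP * math.log2(max(n_keys, 2))


@dataclass
class TreeDesign:
    """A tiny design description: how nodes fan out and how leaves store keys."""
    fanout: int          # children per internal node (assumed >= 2)
    leaf_capacity: int   # keys per leaf
    sorted_leaves: bool  # whether keys inside a leaf are kept sorted


def internal_levels(n_leaves: int, fanout: int) -> int:
    """Number of internal levels needed to address n_leaves leaves."""
    levels, reach = 0, 1
    while reach < n_leaves:
        reach *= fanout
        levels += 1
    return levels


def point_lookup_cost(design: TreeDesign, n_keys: int) -> float:
    """Estimate a point lookup by summing primitive costs along the path."""
    n_leaves = math.ceil(n_keys / design.leaf_capacity)
    levels = internal_levels(n_leaves, design.fanout)
    # Each internal level: jump to the node, then search its sorted fences.
    cost = levels * (random_access() + sorted_search(design.fanout))
    # Leaf: jump to it, then search or scan depending on the layout choice.
    keys_in_leaf = min(design.leaf_capacity, n_keys)
    cost += random_access()
    if design.sorted_leaves:
        cost += sorted_search(keys_in_leaf)
    else:
        cost += sequential_scan(keys_in_leaf)
    return cost
```

A what-if question in this sketch amounts to calling `point_lookup_cost` with two designs that differ in one choice, for example `TreeDesign(20, 250, True)` versus `TreeDesign(20, 250, False)`, and comparing the estimates without implementing either structure.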