HELIX

Bibliographic Details
Title: HELIX
Authors: Jialin Liu, Doris Xin, Stephen Macke, Aditya Parameswaran, Shuchen Song, Litian Ma
Source: Proceedings of the VLDB Endowment. 12:446-460
Publication Info: Association for Computing Machinery (ACM), 2018.
Publication Year: 2018
Subject Terms: FOS: Computer and information sciences, Computer Science - Machine Learning, Process (engineering), Scala, Computer science, 02 engineering and technology, Machine learning, computer.software_genre, Machine Learning (cs.LG), Computer Science - Databases, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Use case, computer.programming_language, Iterative and incremental development, Syntax (programming languages), business.industry, General Engineering, Databases (cs.DB), Workflow, 020201 artificial intelligence & image processing, Data pre-processing, Artificial intelligence, Heuristics, business, computer
Description: Machine learning workflow development is a process of trial and error: developers iterate on workflows by testing out small modifications until the desired accuracy is achieved. Unfortunately, existing machine learning systems focus narrowly on model training (a small fraction of the overall development time) and neglect to address iterative development. We propose Helix, a machine learning system that optimizes execution across iterations by intelligently caching and reusing, or recomputing, intermediates as appropriate. Helix captures a wide variety of application needs within its Scala DSL, with succinct syntax defining unified processes for data preprocessing, model specification, and learning. We demonstrate that the reuse problem can be cast as a Max-Flow problem, while the caching problem is NP-hard. We develop effective lightweight heuristics for the latter. Empirical evaluation shows that Helix is not only able to handle a wide variety of use cases in one unified workflow but is also much faster, providing run time reductions of up to 19x over state-of-the-art systems, such as DeepDive or KeystoneML, on four real-world applications in natural language processing, computer vision, and the social and natural sciences.
ISSN: 2150-8097
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::50dc24312cf561bfd88e45c77a4b7f66
https://doi.org/10.14778/3297753.3297763
Rights: OPEN
Accession Number: edsair.doi.dedup.....50dc24312cf561bfd88e45c77a4b7f66
Database: OpenAIRE