BD2K ERuDIte

Bibliographic Details
Title: BD2K ERuDIte
Authors: José Luis Ambite, Lily Fierro, Kristina Lerman, John D. Van Horn, Gully A. P. C. Burns, Florian Geigl, Jonathan Gordon
Source: WWW (Companion Volume)
Publication Information: ACM Press, 2017.
Publication Year: 2017
Subject Terms: 0301 basic medicine, Computer science, business.industry, 05 social sciences, Big data, computer.software_genre, Data science, Field (computer science), World Wide Web, 03 medical and health sciences, 030104 developmental biology, Index (publishing), Data extraction, Web page, Social media, 0509 other social sciences, 050904 information & library sciences, business, computer, Information integration, Data integration
Description: The field of data science has developed over the years to enable the efficient integration and analysis of the increasingly large amounts of data being generated across many domains, ranging from social media, to sensor networks, to scientific experiments. Numerous subfields of biology and medicine, such as genetics, neuroimaging, and mobile health, are witnessing a data explosion that promises to revolutionize biomedical science by yielding novel insights and discoveries. To address the challenges posed by biomedical big data, the National Institutes of Health (NIH) launched the Big Data to Knowledge (BD2K) initiative (datascience.nih.gov). An important component of this effort is the training of biomedical researchers. To this end, the NIH has funded the BD2K Training Coordinating Center (TCC). A core activity of the BD2K TCC is to develop a web portal (bigdatau.org) to provide personalized training in data science to biomedical researchers. In this paper, we describe our approach and initial efforts in constructing ERuDIte, the Educational Resource Discovery Index for Data Science, which powers the BD2K TCC web portal. ERuDIte harvests a wealth of resources available online for learning data science, both for beginners and experts, including massive open online courses (MOOCs), videos of tutorials and research talks presented at conferences, textbooks, blog posts, and standalone web pages. Though the potential volume of resources is exciting, these online learning materials are highly heterogeneous in quality, difficulty, format, and topic. As a result, this mix of content makes the field intimidating to enter and difficult to navigate. Moreover, data science is a rapidly evolving field, so there is a constant influx of new materials and concepts. ERuDIte leverages data science techniques to build the data science index. This paper describes how ERuDIte uses data extraction, data integration, machine learning, information retrieval, and natural language processing techniques to automatically collect, integrate, describe and organize existing online resources for learning data science.
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::24cc7b54a5e6dca6de31c691fe7cdf31
https://doi.org/10.1145/3041021.3053060
Rights: CLOSED
Accession Number: edsair.doi...........24cc7b54a5e6dca6de31c691fe7cdf31
Database: OpenAIRE