نتائج البحث - "Papka, Michael E"

1

تقرير

Science in a Blink: Supporting Ensemble Perception in Scalar Fields

المؤلفون: Mateevitsi, Victor A., Papka, Michael E., Reda, Khairi

مصطلحات موضوعية: Computer Science - Human-Computer Interaction

الوصف: Visualizations support rapid analysis of scientific datasets, allowing viewers to glean aggregate information (e.g., the mean) within split-seconds. While prior research has explored this ability in conventional charts, it is unclear if spatial visualizations used by computational scientists afford a similar ensemble perception capacity. We investigate people's ability to estimate two summary statistics, mean and variance, from pseudocolor scalar fields. In a crowdsourced experiment, we find that participants can reliably characterize both statistics, although variance discrimination requires a much stronger signal. Multi-hue and diverging colormaps outperformed monochromatic, luminance ramps in aiding this extraction. Analysis of qualitative responses suggests that participants often estimate the distribution of hotspots and valleys as visual proxies for data statistics. These findings suggest that people's summary interpretation of spatial datasets is likely driven by the appearance of discrete color segments, rather than assessments of overall luminance. Implicit color segmentation in quantitative displays could thus prove more useful than previously assumed by facilitating quick, gist-level judgments about color-coded visualizations.
Comment: To appear in Proceedings of the 2024 IEEE Visualization Conference (VIS'24)

URL الوصول: http://arxiv.org/abs/2406.14452

2

تقرير

VisAnywhere: Developing Multi-platform Scientific Visualization Applications

المؤلفون: Marrinan, Thomas, Moeller, Madeleine, Kanayinkal, Alina, Mateevitsi, Victor A., Papka, Michael E.

مصطلحات موضوعية: Computer Science - Human-Computer Interaction, Computer Science - Graphics

الوصف: Scientists often explore and analyze large-scale scientific simulation data by leveraging two- and three-dimensional visualizations. The data and tasks can be complex and therefore best supported using myriad display technologies, from mobile devices to large high-resolution display walls to virtual reality headsets. Using a simulation of neuron connections in the human brain, we present our work leveraging various web technologies to create a multi-platform scientific visualization application. Users can spread visualization and interaction across multiple devices to support flexible user interfaces and both co-located and remote collaboration. Drawing inspiration from responsive web design principles, this work demonstrates that a single codebase can be adapted to develop scientific visualization applications that operate everywhere.

URL الوصول: http://arxiv.org/abs/2404.17619

3

تقرير

MalleTrain: Deep Neural Network Training on Unfillable Supercomputer Nodes

المؤلفون: Ma, Xiaolong, Yan, Feng, Yang, Lei, Foster, Ian, Papka, Michael E., Liu, Zhengchun, Kettimuthu, Rajkumar

مصطلحات موضوعية: Computer Science - Distributed, Parallel, and Cluster Computing

الوصف: First-come first-serve scheduling can result in substantial (up to 10%) of transiently idle nodes on supercomputers. Recognizing that such unfilled nodes are well-suited for deep neural network (DNN) training, due to the flexible nature of DNN training tasks, Liu et al. proposed that the re-scaling DNN training tasks to fit gaps in schedules be formulated as a mixed-integer linear programming (MILP) problem, and demonstrated via simulation the potential benefits of the approach. Here, we introduce MalleTrain, a system that provides the first practical implementation of this approach and that furthermore generalizes it by allowing it use even for DNN training applications for which model information is unknown before runtime. Key to this latter innovation is the use of a lightweight online job profiling advisor (JPA) to collect critical scalability information for DNN jobs -- information that it then employs to optimize resource allocations dynamically, in real time. We describe the MalleTrain architecture and present the results of a detailed experimental evaluation on a supercomputer GPU cluster and several representative DNN training workloads, including neural architecture search and hyperparameter optimization. Our results not only confirm the practical feasibility of leveraging idle supercomputer nodes for DNN training but improve significantly on prior results, improving training throughput by up to 22.3\% without requiring users to provide job scalability information.

URL الوصول: http://arxiv.org/abs/2404.15668

4

تقرير

Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

المؤلفون: Li, Boyang, Lan, Zhiling, Papka, Michael E.

مصطلحات موضوعية: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Distributed, Parallel, and Cluster Computing

الوصف: In the field of high-performance computing (HPC), there has been recent exploration into the use of deep reinforcement learning for cluster scheduling (DRL scheduling), which has demonstrated promising outcomes. However, a significant challenge arises from the lack of interpretability in deep neural networks (DNN), rendering them as black-box models to system managers. This lack of model interpretability hinders the practical deployment of DRL scheduling. In this work, we present a framework called IRL (Interpretable Reinforcement Learning) to address the issue of interpretability of DRL scheduling. The core idea is to interpret DNN (i.e., the DRL policy) as a decision tree by utilizing imitation learning. Unlike DNN, decision tree models are non-parametric and easily comprehensible to humans. To extract an effective and efficient decision tree, IRL incorporates the Dataset Aggregation (DAgger) algorithm and introduces the notion of critical state to prune the derived decision tree. Through trace-based experiments, we demonstrate that IRL is capable of converting a black-box DNN policy into an interpretable rulebased decision tree while maintaining comparable scheduling performance. Additionally, IRL can contribute to the setting of rewards in DRL scheduling.

URL الوصول: http://arxiv.org/abs/2403.16293

5

مؤتمر

Interpretable Modeling of Deep Reinforcement Learning Driven Scheduling

المؤلفون: Li, Boyang, Lan, Zhiling, Papka, Michael E.

المصدر: 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS), 2023 31st International Symposium on. :1-8 Oct, 2023

Relation: 2023 31st International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)

Full Text Finder

6

مؤتمر

FreeTrain: A Framework to Utilize Unused Supercomputer Nodes for Training Neural Networks

المؤلفون: Liu, Zhengchun, Kettimuthu, Rajkumar, Papka, Michael E., Foster, Ian

المصدر: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid) CCGRID Cluster, Cloud and Internet Computing (CCGrid), 2023 IEEE/ACM 23rd International Symposium on. :299-310 May, 2023

Relation: 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid)

7

تقرير

Color Maker: a Mixed-Initiative Approach to Creating Accessible Color Maps

المؤلفون: Salvi, Amey, Lu, Kecheng, Papka, Michael E., Wang, Yunhai, Reda, Khairi

مصطلحات موضوعية: Computer Science - Human-Computer Interaction

الوصف: Quantitative data is frequently represented using color, yet designing effective color mappings is a challenging task, requiring one to balance perceptual standards with personal color preference. Current design tools either overwhelm novices with complexity or offer limited customization options. We present ColorMaker, a mixed-initiative approach for creating colormaps. ColorMaker combines fluid user interaction with real-time optimization to generate smooth, continuous color ramps. Users specify their loose color preferences while leaving the algorithm to generate precise color sequences, meeting both designer needs and established guidelines. ColorMaker can create new colormaps, including designs accessible for people with color-vision deficiencies, starting from scratch or with only partial input, thus supporting ideation and iterative refinement. We show that our approach can generate designs with similar or superior perceptual characteristics to standard colormaps. A user study demonstrates how designers of varying skill levels can use this tool to create custom, high-quality colormaps. ColorMaker is available at https://colormaker.org
Comment: To appear at the ACM CHI '24 Conference on Human Factors in Computing Systems

URL الوصول: http://arxiv.org/abs/2401.15032

8

تقرير

Scaling Computational Fluid Dynamics: In Situ Visualization of NekRS using SENSEI

المؤلفون: Mateevitsi, Victor A., Bode, Mathis, Ferrier, Nicola, Fischer, Paul, Göbbert, Jens Henrik, Insley, Joseph A., Lan, Yu-Hsiang, Min, Misun, Papka, Michael E., Patel, Saumil, Rizzi, Silvio, Windgassen, Jonathan

مصطلحات موضوعية: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Performance

الوصف: In the realm of Computational Fluid Dynamics (CFD), the demand for memory and computation resources is extreme, necessitating the use of leadership-scale computing platforms for practical domain sizes. This intensive requirement renders traditional checkpointing methods ineffective due to the significant slowdown in simulations while saving state data to disk. As we progress towards exascale and GPU-driven High-Performance Computing (HPC) and confront larger problem sizes, the choice becomes increasingly stark: to compromise data fidelity or to reduce resolution. To navigate this challenge, this study advocates for the use of in situ analysis and visualization techniques. These allow more frequent data "snapshots" to be taken directly from memory, thus avoiding the need for disruptive checkpointing. We detail our approach of instrumenting NekRS, a GPU-focused thermal-fluid simulation code employing the spectral element method (SEM), and describe varied in situ and in transit strategies for data rendering. Additionally, we provide concrete scientific use-cases and report on runs performed on Polaris, Argonne Leadership Computing Facility's (ALCF) 44 Petaflop supercomputer and J\"ulich Wizard for European Leadership Science (JUWELS) Booster, J\"ulich Supercomputing Centre's (JSC) 71 Petaflop High Performance Computing (HPC) system, offering practical insight into the implications of our methodology.

URL الوصول: http://arxiv.org/abs/2312.09888

9

تقرير

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies

المؤلفون: Song, Shuaiwen Leon, Kruft, Bonnie, Zhang, Minjia, Li, Conglong, Chen, Shiyang, Zhang, Chengming, Tanaka, Masahiro, Wu, Xiaoxia, Rasley, Jeff, Awan, Ammar Ahmad, Holmes, Connor, Cai, Martin, Ghanem, Adam, Zhou, Zhongzhu, He, Yuxiong, Luferenko, Pete, Kumar, Divya, Weyn, Jonathan, Zhang, Ruixiong, Klocek, Sylwester, Vragov, Volodymyr, AlQuraishi, Mohammed, Ahdritz, Gustaf, Floristean, Christina, Negri, Cristina, Kotamarthi, Rao, Vishwanath, Venkatram, Ramanathan, Arvind, Foreman, Sam, Hippe, Kyle, Arcomano, Troy, Maulik, Romit, Zvyagin, Maxim, Brace, Alexander, Zhang, Bin, Bohorquez, Cindy Orozco, Clyde, Austin, Kale, Bharat, Perez-Rivera, Danilo, Ma, Heng, Mann, Carla M., Irvin, Michael, Pauloski, J. Gregory, Ward, Logan, Hayot, Valerie, Emani, Murali, Xie, Zhen, Lin, Diangen, Shukla, Maulik, Foster, Ian, Davis, James J., Papka, Michael E., Brettin, Thomas, Balaprakash, Prasanna, Tourassi, Gina, Gounley, John, Hanson, Heidi, Potok, Thomas E, Pasini, Massimiliano Lupo, Evans, Kate, Lu, Dan, Lunga, Dalton, Yin, Junqi, Dash, Sajal, Wang, Feiyi, Shankar, Mallikarjun, Lyngaas, Isaac, Wang, Xiao, Cong, Guojing, Zhang, Pei, Fan, Ming, Liu, Siyan, Hoisie, Adolfy, Yoo, Shinjae, Ren, Yihui, Tang, William, Felker, Kyle, Svyatkovskiy, Alexey, Liu, Hang, Aji, Ashwin, Dalton, Angela, Schulte, Michael, Schulz, Karl, Deng, Yuntian, Nie, Weili, Romero, Josh, Dallago, Christian, Vahdat, Arash, Xiao, Chaowei, Gibbs, Thomas, Anandkumar, Anima, Stevens, Rick

مصطلحات موضوعية: Computer Science - Artificial Intelligence, Computer Science - Machine Learning

الوصف: In the upcoming decade, deep learning may revolutionize the natural sciences, enhancing our capacity to model and predict natural occurrences. This could herald a new era of scientific exploration, bringing significant advancements across sectors from drug development to renewable energy. To answer this call, we present DeepSpeed4Science initiative (deepspeed4science.ai) which aims to build unique capabilities through AI system technology innovations to help domain experts to unlock today's biggest science mysteries. By leveraging DeepSpeed's current technology pillars (training, inference and compression) as base technology enablers, DeepSpeed4Science will create a new set of AI system technologies tailored for accelerating scientific discoveries by addressing their unique complexity beyond the common technical approaches used for accelerating generic large language models (LLMs). In this paper, we showcase the early progress we made with DeepSpeed4Science in addressing two of the critical system challenges in structural biology research.

URL الوصول: http://arxiv.org/abs/2310.04610

10

تقرير

A Comprehensive Performance Study of Large Language Models on Novel AI Accelerators

المؤلفون: Emani, Murali, Foreman, Sam, Sastry, Varuni, Xie, Zhen, Raskar, Siddhisanket, Arnold, William, Thakur, Rajeev, Vishwanath, Venkatram, Papka, Michael E.

مصطلحات موضوعية: Computer Science - Performance, Computer Science - Artificial Intelligence, Computer Science - Hardware Architecture, Computer Science - Machine Learning

الوصف: Artificial intelligence (AI) methods have become critical in scientific applications to help accelerate scientific discovery. Large language models (LLMs) are being considered as a promising approach to address some of the challenging problems because of their superior generalization capabilities across domains. The effectiveness of the models and the accuracy of the applications is contingent upon their efficient execution on the underlying hardware infrastructure. Specialized AI accelerator hardware systems have recently become available for accelerating AI applications. However, the comparative performance of these AI accelerators on large language models has not been previously studied. In this paper, we systematically study LLMs on multiple AI accelerators and GPUs and evaluate their performance characteristics for these models. We evaluate these systems with (i) a micro-benchmark using a core transformer block, (ii) a GPT- 2 model, and (iii) an LLM-driven science use case, GenSLM. We present our findings and analyses of the models' performance to better understand the intrinsic capabilities of AI accelerators. Furthermore, our analysis takes into account key factors such as sequence lengths, scaling behavior, sparsity, and sensitivity to gradient accumulation steps.

URL الوصول: http://arxiv.org/abs/2310.04607

تنقيح النتائج