Introducing Delta

June 8, 2021 | Presented by NCSA Staff - Greg Bauer, Brett Bode, Tim Boerner

This webinar provides information on the Delta system's architecture (CPU, GPU, network, and storage), Delta allocations, and the Delta early access program.



Illinois Campus Cluster: Basics of Access and Usage Workshop

September 7, 2023 | Presented by Bruno Abreu, NCSA

This tutorial introduces how to use the Illinois Campus Cluster to run scientific application code. It begins with a lesson on the fundamental concepts of scientific computing on a High-Performance Computing (HPC) cluster. The tutorial then progresses through how to access the Illinois Campus Cluster, manage and edit files, set up your software environment, and transfer files. Finally, you will complete a hands-on exercise in which you run an example application. Even if you are not a Campus Cluster user, this tutorial will still teach you the general skills needed to use an HPC cluster for scientific computing.

Sponsored by NCSA and Research IT



Introduction to Parallel Programming with OpenMP

February 9, 2022 | Presented by Bruno Abreu

This is a 2-hour workshop in which participants will work through exercises that parallelize model scientific applications using OpenMP, a shared-memory application programming interface. Building on the previous Introduction to Parallel Computing on High-Performance Systems workshop, it will explore concepts and tools such as parallel loop scheduling, explicit data declarations, reduction clauses, and OpenMP library functions. The session will be entirely hands-on, with access to a supercomputing cluster granted via a supporting XSEDE allocation.

You will learn how to:

  • Balance workloads to threads in parallel loops using different schedules (static, dynamic, guided) and tune your loop parallelization for optimal performance
  • Explicitly declare data contexts (private, shared) to avoid race conditions and improve the quality of your code, facilitating collaborations
  • Use reduction clauses (sums, maximum and minimum values) to calculate properties across threads without race conditions (illustrated in the sketch after this list)
  • Use OpenMP library functions to span teams of threads, mix data and task parallelism, insert barriers, and much more
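
The workshop's exercises use OpenMP from C/C++ or Fortran. Purely as a rough Python-flavored analogue of the reduction pattern above, and not as workshop material, the sketch below uses Numba's prange, whose parallel loop and automatic scalar reduction play roles similar to an OpenMP parallel-for with a reduction clause. It assumes the third-party numba and numpy packages are installed, and the function name is illustrative.

    import numpy as np
    from numba import njit, prange

    @njit(parallel=True)
    def sum_of_squares(x):
        # prange splits iterations across threads, much as an OpenMP
        # parallel-for pragma would; the += on a scalar is recognized by
        # Numba as a reduction, so no race condition occurs.
        total = 0.0
        for i in prange(x.shape[0]):
            total += x[i] * x[i]
        return total

    print(sum_of_squares(np.random.rand(1_000_000)))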

Prerequisites:

  • Basic C/C++ or Fortran programming skills
  • Basic parallel programming knowledge
  • Basic Linux skills (e.g., compiling code, navigating the file system)
  • Familiarity with a remote Linux server text editor: vi, nano, or emacs



Illinois Campus Cluster: Basics of Access and Usage Workshop

October 10, 2021 | Presented by Bruno Abreu

This 90-minute workshop addresses the basics of accessing and using the Illinois Campus Cluster (ICC). The content is suitable both for researchers and students who already have experience working with clusters and want to know more about the ICC, and for those who have little to no experience with clusters and want to learn how to use one.

The workshop consists of three sections:

  1. A presentation introducing the ICC program and the ICC's technical and physical resources, capabilities, and organization (20 min)
  2. Hands-on exercises teaching how to access the ICC, manage the environment and run jobs/software (55 min)
  3. Q&A session (15 min)



Introduction to Parallel Computing on High-Performance Systems

November 17, 2021 | Presented by Bruno Abreu, NCSA

This workshop is a 2-hour introduction to parallel computing on high-performance systems. Core concepts to be presented include:

  • terminology
  • programming models
  • system architecture
  • data and task parallelism
  • performance measurement

Hands-on exercises using OpenMP will explore how to build new parallel applications and transform serial applications into parallel ones incrementally in a shared memory environment. (OpenMP is a standardized API for parallelizing Fortran, C, and C++ programs on shared-memory architectures.)
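
The exercises themselves use OpenMP from C, C++, or Fortran. As a loose Python-flavored illustration of the incremental serial-to-parallel idea only (not workshop material), the sketch below shows a serial loop and a parallel variant that differ by a single decorator argument and loop construct, roughly as an OpenMP loop differs from its serial form by one pragma. It assumes the third-party numba and numpy packages, and the function names are illustrative.

    import numpy as np
    from numba import njit, prange

    @njit
    def transform_serial(x, out):
        for i in range(x.shape[0]):           # one thread does all iterations
            out[i] = np.sin(x[i]) * np.cos(x[i])

    @njit(parallel=True)
    def transform_parallel(x, out):
        for i in prange(x.shape[0]):          # iterations shared among threads
            out[i] = np.sin(x[i]) * np.cos(x[i])

    x = np.random.rand(1_000_000)
    out = np.empty_like(x)
    transform_parallel(x, out)                # same result as the serial version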

On completion of this training, students will be prepared for more advanced or different parallel computing tools and techniques that build on these core concepts.



Introduction to Parallel Programming with MPI

April 14, 2022 | Presented by Bruno Abreu

Learn how to use the Message Passing Interface (MPI), the standard framework for parallel computing in distributed-memory systems, to parallelize your scientific applications. You will learn the basic concepts of message passing, including domain decomposition, collective communications, and several MPI library functions. A hands-on exercise that parallelizes a machine learning model application on an XSEDE supercomputing cluster will let participants practice these concepts. Follow-up sessions will be offered to further help participants with the exercise.

You will learn:

  • The message-passing interface paradigm
  • Core components of an MPI message: body, envelope
  • MPI processes and communicators
  • Collective communications: broadcast messages and reductions (see the sketch after this list)
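
The workshop targets MPI from C/C++ or Fortran. The sketch below is not workshop material, but it shows the same broadcast, domain-decomposition, and reduction ideas from Python using the third-party mpi4py and numpy packages; launch it with something like mpirun -n 4 python example.py (the file name is illustrative).

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD            # default communicator containing all processes
    rank = comm.Get_rank()           # this process's id within the communicator
    size = comm.Get_size()           # total number of processes

    # Broadcast: rank 0 sends the same data to every process.
    params = np.array([0.1, 0.9]) if rank == 0 else np.empty(2)
    comm.Bcast(params, root=0)

    # Domain decomposition: each rank works on its own slice of the problem.
    local = np.arange(rank * 10, (rank + 1) * 10, dtype=np.float64)
    local_sum = np.array([local.sum()])

    # Reduction: combine partial results from all ranks onto rank 0.
    global_sum = np.zeros(1)
    comm.Reduce(local_sum, global_sum, op=MPI.SUM, root=0)

    if rank == 0:
        print("parameters:", params, "global sum:", global_sum[0])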

Prerequisites:

  • Basic C/C++ or Fortran programming skills
  • Basic knowledge of parallel computing concepts
  • Familiarity with a remote Linux server text editor: vi, nano, or emacs



Illinois Campus Cluster: Basics of Access and Usage Workshop

September 7, 2022 | Presented by Bruno Abreu

This 90-minute workshop addresses the basics of accessing and using the Illinois Campus Cluster (ICC). The content is suitable both for researchers and students who already have experience working with clusters and want to know more about the ICC, and for those who have little to no experience with clusters and want to learn how to use one.

The workshop consists of three sections:

  1. A presentation introducing the ICC program and the ICC's technical and physical resources, capabilities, and organization (20 min)
  2. Hands-on exercises teaching how to access the ICC, manage the environment and run jobs/software (55 min)
  3. Q&A session (15 min)



Using GPUs with Python

September 14, 2022 | Presented by Kris Keipert (NVIDIA)

The NCSA Delta team and NVIDIA are hosting a workshop that will give you hands-on experience accelerating Python codes with NVIDIA GPUs. We will use code samples in three main categories to introduce you to GPU-accelerated computing in Python: drop-in replacements for SciPy and NumPy code through the CuPy library, GPU acceleration for end-to-end data science workloads using NVIDIA RAPIDS, and custom accelerated code, written without leaving Python, using Numba.
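
As a taste of the first and third categories (a sketch, not taken from the workshop materials), the snippet below uses CuPy as a drop-in replacement for a NumPy computation and a small Numba CUDA kernel for custom device code. It assumes an NVIDIA GPU with the cupy and numba packages installed; the kernel name and values are illustrative.

    import numpy as np
    import cupy as cp
    from numba import cuda

    # CuPy: same call signatures as NumPy, executed on the GPU.
    x = cp.asarray(np.random.rand(1_000_000))     # copy data to the device
    gpu_result = float(cp.sqrt(x).sum())          # bring the scalar back to the host

    # Numba: a custom CUDA kernel written in Python.
    @cuda.jit
    def scale(out, arr, factor):
        i = cuda.grid(1)                          # global thread index
        if i < arr.shape[0]:
            out[i] = arr[i] * factor

    arr = np.arange(1024, dtype=np.float32)
    d_arr = cuda.to_device(arr)                   # explicit host-to-device copy
    d_out = cuda.device_array_like(arr)
    threads = 128
    blocks = (arr.size + threads - 1) // threads
    scale[blocks, threads](d_out, d_arr, 2.0)
    print(gpu_result, d_out.copy_to_host()[:4])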



Profiling Python Applications

September 22, 2022 | Presented by Bruno Abreu, NCSA

NCSA and Research IT will offer this highly interactive, 2-hour workshop introducing registrants to Python profiling tools, one of the pillars of performance tuning and optimization. Hands-on exercises will explore time-based and memory-based profiling of simple scripts using cProfile, pstats, snakeviz, line_profiler, and memory_profiler, highlighting the typical routes programmers can take to make their applications more efficient.
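
As a small taste of the built-in tooling (a sketch, not the workshop's actual exercise), the snippet below profiles a toy function with cProfile and prints the most expensive entries with pstats; line_profiler, memory_profiler, and snakeviz support the same measure-first workflow with line-level, memory, and visual views.

    import cProfile
    import pstats

    def slow_sum_of_squares(n):
        # toy workload chosen purely for illustration
        return sum(i * i for i in range(n))

    profiler = cProfile.Profile()
    profiler.enable()
    slow_sum_of_squares(2_000_000)
    profiler.disable()

    # Sort by cumulative time and show the ten most expensive entries.
    pstats.Stats(profiler).sort_stats("cumulative").print_stats(10)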



Distributed Deep Learning: NVIDIA Workshop

October 18, 2022 | Presented by Sri Koundinyan (NVIDIA)

The NCSA Delta team, the Center for Artificial Intelligence Innovation (CAII), and NVIDIA are hosting a workshop that will give you hands-on experience with distributed deep learning techniques using NVIDIA GPUs. The workshop consists of four two-hour sessions held on four consecutive Tuesdays.

Overview

Modern deep learning challenges leverage increasingly larger datasets and more complex models. As a result, significant computational power is required to train models effectively and efficiently. Learning to distribute data across multiple GPUs during deep learning model training makes possible an incredible wealth of new applications utilizing deep learning. Additionally, the effective use of systems with multiple GPUs reduces training time, allowing for faster application development and much faster iteration cycles. Teams who are able to perform training using multiple GPUs will have an edge, building models trained on more data in shorter periods of time and with greater engineer productivity.

This workshop teaches you techniques for data-parallel deep learning training on multiple GPUs to shorten the training time required for data-intensive applications. Working with deep learning tools, frameworks, and workflows to perform neural network training, you’ll learn how to decrease model training time by distributing data to multiple GPUs, while retaining the accuracy of training on a single GPU.

Learning Objectives

By participating in this workshop, you’ll learn how to:

  • Understand how data-parallel deep learning training is performed using multiple GPUs
  • Achieve maximum training throughput to make the best use of multiple GPUs
  • Distribute training to multiple GPUs using PyTorch DistributedDataParallel (see the sketch after this list)
  • Understand and apply algorithmic considerations specific to multi-GPU training performance and accuracy
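
For orientation only (a minimal sketch, not the workshop's code), the outline below shows the shape of a PyTorch DistributedDataParallel training loop on a single multi-GPU node. It assumes NVIDIA GPUs with the NCCL backend and a launch via torchrun, e.g. torchrun --nproc_per_node=4 script.py; the model, data, and hyperparameters are illustrative placeholders.

    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    dist.init_process_group(backend="nccl")         # torchrun sets rank/world-size env vars
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(32, 1).cuda(local_rank)  # placeholder model
    ddp_model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
    loss_fn = torch.nn.MSELoss()

    for step in range(100):
        x = torch.randn(64, 32, device=local_rank)   # each rank draws its own shard of data
        y = torch.randn(64, 1, device=local_rank)
        optimizer.zero_grad()
        loss_fn(ddp_model(x), y).backward()          # gradients are all-reduced across GPUs here
        optimizer.step()

    dist.destroy_process_group()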

Prerequisites

To get the most out of this workshop, participants are expected to have experience with deep learning using Python. It will be helpful if you are comfortable with some or all of the concepts covered in the Fundamentals of Deep Learning course.