This course is aimed at scientists, engineers, scholars, and anyone else who wants to solve problems efficiently in high-performance computing environments or in the cloud. Students who complete it will have a basic understanding of how to find bottlenecks in their programs and how to address them. The course also provides a high-level introduction to the compute node architectures of modern high-performance and cloud computing instances.

What you'll learn
Describe the computing and memory architecture of a supercomputing node or cloud computing instance
Use compilers and libraries to increase the performance of your program
Understand how to use the vector operations of a modern microprocessor to maximize performance
Use OpenMP directives to improve vectorization of your programs
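As a rough illustration of the last objective, the sketch below marks a simple loop with the OpenMP `simd` directive. The function name `saxpy` and its parameters are illustrative assumptions, not course material. With GCC or Clang the directive typically takes effect when compiling with `-fopenmp` or `-fopenmp-simd`; without such a flag the pragma is ignored and the loop still computes the same result.

```c
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * 'restrict' promises the compiler that x and y do not overlap. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Ask the compiler to vectorize this loop (OpenMP 4.0 or later). */
    #pragma omp simd
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The directive is a request rather than a guarantee, so checking the compiler's vectorization report is still worthwhile.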
Details to know

5 assignments
There are 5 modules in this course
In this module, we cover an approach to analyzing and optimizing program performance, including profiling, using optimized libraries, and choosing compiler options that increase efficiency.
What's included
5 videos, 3 readings, 1 assignment, 1 programming assignment
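Before reaching for a full profiler, a coarse timing of a suspect routine is often enough to confirm where the time goes. The following is a minimal sketch using the standard C `clock()` function, which reports CPU time. The names `workload` and `time_workload` are hypothetical and chosen only for illustration.

```c
#include <time.h>

/* Hypothetical workload: sum of the integers 1..n as doubles. */
double workload(long n)
{
    double sum = 0.0;
    for (long i = 1; i <= n; ++i) {
        sum += (double)i;
    }
    return sum;
}

/* Run workload(n), store its result in *result, and return the
 * CPU time it consumed in seconds. */
double time_workload(long n, double *result)
{
    clock_t start = clock();
    *result = workload(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Timing the same routine after rebuilding with a different optimization level (for example `-O0` versus `-O2` in GCC or Clang) gives a first impression of how much the compiler options matter.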
In this module, we examine simple techniques that improve program performance. We look at scalar and loop optimization methods that can have a large impact on a program's floating-point performance.
What's included
5 videos, 1 assignment, 1 programming assignment
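One common scalar optimization of this kind is to move loop-invariant work out of the loop. The sketch below, with the hypothetical names `scale_naive` and `scale_hoisted`, replaces a division in every iteration with a single reciprocal computed once. It is an assumed example of the general idea, not the course's own code.

```c
#include <stddef.h>

/* Straightforward version: one floating-point division per element. */
void scale_naive(size_t n, double *a, double norm)
{
    for (size_t i = 0; i < n; ++i) {
        a[i] = a[i] / norm;
    }
}

/* Hoisted version: the loop-invariant reciprocal is computed once,
 * so the loop body uses a cheaper multiplication instead.
 * Results can differ from scale_naive in the last bit due to rounding. */
void scale_hoisted(size_t n, double *a, double norm)
{
    const double inv = 1.0 / norm;
    for (size_t i = 0; i < n; ++i) {
        a[i] = a[i] * inv;
    }
}
```

Compilers generally do not make this substitution on their own under strict floating-point rules, because the rounding can change, which is why it is often done by hand.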
In this module, we introduce the basic architecture of modern computers, focusing on how the architecture influences program performance. We look at processor-level data parallelism and how code optimized for parallelism can achieve much higher floating-point performance.
What's included
4 videos, 1 assignment, 1 programming assignment
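Whether a loop can benefit from processor-level data parallelism depends largely on whether its iterations are independent. The sketch below contrasts the two cases; the function names `add_constant` and `prefix_sum` are illustrative assumptions.

```c
#include <stddef.h>

/* Independent iterations: each out[i] depends only on in[i], so a
 * vectorizing compiler can process several elements per instruction. */
void add_constant(size_t n, const float *restrict in,
                  float *restrict out, float c)
{
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] + c;
    }
}

/* Loop-carried dependence: a[i] needs the freshly written a[i - 1],
 * which prevents straightforward vectorization of this loop. */
void prefix_sum(size_t n, float *a)
{
    for (size_t i = 1; i < n; ++i) {
        a[i] = a[i] + a[i - 1];
    }
}
```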
Memory performance is generally the main performance bottleneck because the speed of main memory has not kept pace with processors' ability to process floating-point numbers. We introduce how layers of fast memory, called cache memory, can speed up computations, and we provide an example of how to optimize algorithms for better memory performance.
What's included
4 videos, 1 assignment, 1 programming assignment
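A classic illustration of cache-aware code is the order in which a matrix is traversed. The sketch below assumes an n-by-n matrix stored in row-major order in a flat array; the names `sum_row_major` and `sum_col_major` are hypothetical, and this is not necessarily the example used in the course.

```c
#include <stddef.h>

/* Row-by-row traversal: the inner loop walks consecutive addresses,
 * so each cache line that is fetched gets fully used. */
double sum_row_major(size_t n, const double *m)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < n; ++j) {
            sum += m[i * n + j];
        }
    }
    return sum;
}

/* Column-by-column traversal of the same array: the inner loop jumps
 * n elements at a time, which tends to cause more cache misses
 * once the matrix no longer fits in cache. */
double sum_col_major(size_t n, const double *m)
{
    double sum = 0.0;
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < n; ++i) {
            sum += m[i * n + j];
        }
    }
    return sum;
}
```

Both functions visit the same elements, so any difference in run time on large matrices comes from the memory access pattern rather than the arithmetic.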
This module provides an introduction to parallel and high-throughput computing. It also demonstrates Slurm job arrays, a mechanism for working with many similar jobs quickly and easily. Finally, the module looks at running many jobs concurrently with GNU Parallel.
What's included
4 videos, 1 assignment, 1 programming assignment
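In a Slurm job array, each job receives its index in the `SLURM_ARRAY_TASK_ID` environment variable, and a program can use that index to select its share of the work. The sketch below shows one possible way to do that; the helper names `parse_task_id`, `array_task_id`, and `chunk_begin` are hypothetical, and it assumes the array indices start at 0.

```c
#include <stdlib.h>

/* Parse a task ID string. Returns fallback if value is NULL, empty,
 * not a plain decimal integer, or negative. */
long parse_task_id(const char *value, long fallback)
{
    if (value == NULL || *value == '\0') {
        return fallback;
    }
    char *end = NULL;
    long id = strtol(value, &end, 10);
    if (*end != '\0' || id < 0) {
        return fallback;
    }
    return id;
}

/* Read the index Slurm assigns to this job within its array;
 * fallback is used when the program runs outside a job array. */
long array_task_id(long fallback)
{
    return parse_task_id(getenv("SLURM_ARRAY_TASK_ID"), fallback);
}

/* First item handled by task `id` when n_items are divided among
 * n_tasks tasks as evenly as possible. Task `id` processes items
 * chunk_begin(.., id) up to, but not including, chunk_begin(.., id + 1). */
long chunk_begin(long n_items, long n_tasks, long id)
{
    return (n_items * id) / n_tasks;
}
```

The same pattern works under GNU Parallel if the index is passed as a command-line argument instead of being read from the environment.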




