This course is aimed at scientists, engineers, scholars, or anyone seeking to solve problems efficiently in high-performance computing environments or in the cloud. Students who complete this course will have a basic understanding of how to find bottlenecks in their programs and how to address them. The course provides a high-level introduction to the compute node architectures of modern high-performance computing systems and cloud computing instances.


What you'll learn
- Describe the computing and memory architecture of a supercomputing node or cloud computing instance
- Use compilers and libraries to increase the performance of your program
- Understand how to use the vector operations of a modern microprocessor to maximize performance
- Use OpenMP directives to improve vectorization of your programs
Details to know

August 2025
5 assignments

There are 5 modules in this course
In this module, we cover an approach to analyzing and optimizing program performance, including profiling, using optimized libraries, and choosing compiler options that increase efficiency.
What's included
5 videos, 2 readings, 1 assignment, 1 programming assignment
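As an illustration of the profiling step mentioned in this module, here is a minimal sketch using Python's standard `cProfile` and `pstats` modules. The course itself may use different tools and languages; the functions `slow_sum_of_squares`, `cheap_setup`, `workload`, and `profile_report` are hypothetical names invented for this example.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n):
    """Deliberate hot spot: builds a full list before summing it."""
    squares = [i * i for i in range(n)]
    return sum(squares)


def cheap_setup():
    """Cheap work that should not dominate the profile."""
    return list(range(10))


def workload():
    """Hypothetical program entry point made of a cheap and an expensive part."""
    cheap_setup()
    return slow_sum_of_squares(200_000)


def profile_report(func):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


if __name__ == "__main__":
    result, report = profile_report(workload)
    print(report)
```

Reading the report from the top shows where the time goes, which is the information needed before deciding what to optimize.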
In this module, we examine simple techniques that help with program performance. We look at scalar and loop optimization methods that can have a large impact on a program’s floating-point performance.
What's included
5 videos, 1 assignment, 1 programming assignment
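The page does not say which scalar and loop optimizations this module covers. As an assumed example of the general idea, the sketch below moves a loop-invariant computation out of a loop and replaces a repeated division with a multiplication by a precomputed reciprocal. The function names are hypothetical.

```python
import math


def normalize_naive(values):
    """Scale values to unit length, recomputing the norm on every iteration."""
    out = []
    for v in values:
        # Loop-invariant work: the norm never changes inside the loop.
        norm = math.sqrt(sum(x * x for x in values))
        out.append(v / norm)
    return out


def normalize_hoisted(values):
    """Same computation with the invariant hoisted out of the loop."""
    norm = math.sqrt(sum(x * x for x in values))
    # One division up front, then a cheaper multiplication per element.
    inv_norm = 1.0 / norm
    return [v * inv_norm for v in values]
```

Hoisting reduces the work from quadratic to linear in the number of elements. Multiplying by the reciprocal can change the last bits of the result compared with dividing, so the two versions should be compared with a tolerance rather than for exact equality.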
In this module, we introduce the basic architecture of modern computers, focusing on how the architecture influences program performance. We look at processor-level data parallelism and how code optimized for parallelism achieves much higher floating-point performance.
What's included
4 videos, 1 assignment, 1 programming assignment
Memory performance is generally the main performance bottleneck because the speed of main memory has not kept up with the rate at which processors can process floating-point numbers. We introduce how layers of fast memory, called cache memory, can speed up computations and provide an example of how to optimize algorithms for better memory performance.
What's included
4 videos, 1 assignment, 1 programming assignment
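The page does not name the algorithm this module optimizes. One common technique for improving cache use is loop blocking (tiling), sketched below for matrix multiplication as an assumed example. Pure Python will not show the speedup that a compiled language would; the sketch only shows the loop structure. The function names and the `block` parameter are hypothetical.

```python
def matmul_naive(a, b):
    """Textbook triple loop: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_blocked(a, b, block=2):
    """Same product computed tile by tile.

    Each tile of a, b, and c is reused while it is still likely to be in
    cache, instead of streaming whole rows and columns from main memory.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for ii in range(0, n, block):
        for kk in range(0, m, block):
            for jj in range(0, p, block):
                # Work on one block-sized tile; min() handles ragged edges.
                for i in range(ii, min(ii + block, n)):
                    for k in range(kk, min(kk + block, m)):
                        a_ik = a[i][k]
                        for j in range(jj, min(jj + block, p)):
                            c[i][j] += a_ik * b[k][j]
    return c
```

In a compiled implementation, the block size would be chosen so that the tiles in use fit in a given cache level; the value 2 here is only for illustration.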
This module provides an introduction to parallel and high-throughput computing. It also demonstrates Slurm job arrays, a mechanism for working with many similar jobs quickly and easily. Finally, this module looks at running many jobs concurrently with GNU Parallel.
What's included
4 videos, 1 assignment, 1 programming assignment
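To make the job array idea concrete, here is a minimal sketch of how a script run as one task of a Slurm job array could pick its share of the work. Slurm sets the environment variables `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` for array tasks. The sketch assumes the array was submitted with zero-based contiguous indices (for example `--array=0-3`); the function `select_work` and the work items are hypothetical.

```python
import os


def select_work(items, environ=None):
    """Return the items this array task should process.

    Assumes task IDs run from 0 to task_count - 1. Outside of Slurm the
    variables are missing, so the function falls back to a single task
    that processes everything.
    """
    env = os.environ if environ is None else environ
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    task_count = int(env.get("SLURM_ARRAY_TASK_COUNT", "1"))
    # Strided split: task 0 gets items 0, N, 2N, ...; task 1 gets 1, N+1, ...
    return items[task_id::task_count]


if __name__ == "__main__":
    inputs = [f"sample_{i}.dat" for i in range(10)]  # hypothetical file names
    for name in select_work(inputs):
        print("processing", name)
```

Passing `environ` explicitly makes the split easy to check on a laptop without a scheduler.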
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.






Frequently asked questions
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.