The course "Multicore and GPGPU Programming" provides a foundational understanding of parallel programming, focusing on developing high-performance, multi-threaded applications in both CPU and GPU environments. Beginning with a review of multicore processor architectures, caching mechanisms, and Non-Uniform Memory Access (NUMA) systems, students will learn the essentials of shared memory programming, synchronisation techniques, and the use of locks to ensure data integrity across threads.

Multicore and GPGPU Programming


What you'll learn
Understand the fundamentals of multi-threaded programming and its applications in multicore systems.
Develop shared memory programs in OpenMP and distributed programming using MPI.
Gain a foundational understanding of GPGPU architecture and the CUDA programming model.
Details to know

124 assignments

There are 12 modules in this course
In this module, the learners will be introduced to the course and its syllabus, setting the foundation for their learning journey. The course's introductory video will provide them with insights into the valuable skills and knowledge they can expect to gain throughout the duration of this course. Additionally, the syllabus reading will comprehensively outline essential course components, including course values, assessment criteria, grading system, schedule, details of live sessions, and a recommended reading list that will enhance the learner’s understanding of the course concepts. Moreover, this module offers the learners the opportunity to connect with fellow learners as they participate in a discussion prompt designed to facilitate introductions and exchanges within the course community.
What's included
4 videos, 1 reading, 1 discussion prompt
4 videos • Total 51 minutes
- Course Introductory Video • 2 minutes
- Meet Your Instructor - Dr. Gargi Prabhu • 1 minute
- Meet Your Instructor - Dr. Kunal Korgaonkar • 1 minute
- Recording of Multicore and GPGPU Programming: Week 1 - Live Session on 25-05-23 18:32:50 [47:25] • 47 minutes
1 reading • Total 10 minutes
- Course Overview • 10 minutes
1 discussion prompt • Total 10 minutes
- Meet Your Peers • 10 minutes
In this module, students will gain foundational knowledge of parallel and multi-threaded programming, exploring the core principles that underlie the efficient utilisation of modern multi-core and many-core processors. Beginning with an overview of parallel programming concepts, this module covers different types of parallelism, including data parallelism, task parallelism, and pipeline parallelism. Students will also examine critical performance metrics like speedup, efficiency, and scalability, which help in evaluating the benefits and trade-offs of parallel approaches.
What's included
12 videos, 2 readings, 12 assignments, 1 discussion prompt
12 videos • Total 73 minutes
- Need for Ever-Increasing Performance • 8 minutes
- Parallel Systems and Parallel Programs • 8 minutes
- Concurrent, Parallel, Distributed Systems • 5 minutes
- Types of Parallelism: Data, Task and Pipeline Parallelism • 8 minutes
- Speedup and Efficiency • 5 minutes
- Amdahl’s Law • 5 minutes
- Gustafson’s Law • 5 minutes
- Scalability in Parallel Systems • 5 minutes
- Cost of Parallelisation • 7 minutes
- Sources of Overhead in Parallel Programs • 5 minutes
- Timing Parallel Programs: Methods and Best Practices • 7 minutes
- GPU Performance • 5 minutes
2 readings • Total 120 minutes
- Recommended Reading: Fundamentals of Parallel Computing • 60 minutes
- Recommended Reading: Introduction to Performance Metrics in Parallel Computing • 60 minutes
12 assignments • Total 36 minutes
- Need for Ever-Increasing Performance • 3 minutes
- Parallel Systems and Parallel Programs • 3 minutes
- Concurrent, Parallel, Distributed Systems • 3 minutes
- Types of Parallelism: Data, Task and Pipeline Parallelism • 3 minutes
- Speedup and Efficiency • 3 minutes
- Amdahl’s Law • 3 minutes
- Gustafson’s Law • 3 minutes
- Scalability in MIMD Systems • 3 minutes
- Cost of Parallelisation • 3 minutes
- Sources of Overhead in Parallel Programs • 3 minutes
- Taking Timings of Parallel Programs • 3 minutes
- GPU Performance • 3 minutes
1 discussion prompt • Total 30 minutes
- Why Parallelism? Revisiting the Roots of Multicore Programming • 30 minutes
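The performance metrics listed in this module (speedup, efficiency, Amdahl's law, Gustafson's law) can be written down in a few lines. The sketch below is only an illustration in Python, not course material: the function names are hypothetical, and `parallel_fraction` is assumed to be the share of the work that parallelises perfectly.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S = T_serial / T_parallel."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Efficiency E = S / p, the speedup obtained per processing element."""
    return speedup(t_serial, t_parallel) / p


def amdahl_speedup(parallel_fraction: float, p: int) -> float:
    """Amdahl's law: speedup bound for a fixed problem size on p processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / p)


def gustafson_speedup(parallel_fraction: float, p: int) -> float:
    """Gustafson's law: scaled speedup when the problem grows with p."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * p


if __name__ == "__main__":
    # Compare both laws for a program that is assumed to be 90% parallel.
    for p in (2, 8, 64):
        print(p, round(amdahl_speedup(0.9, p), 2), round(gustafson_speedup(0.9, p), 2))
```

With a 90% parallel fraction, the Amdahl bound can never reach 10 however many processors are added, whereas the Gustafson figure keeps growing because the problem size is assumed to scale with the machine.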
This module provides an in-depth exploration of multicore processor architectures, examining the design principles, performance considerations, and challenges involved in building efficient multicore systems. Students will study how multiple cores interact within a processor, focusing on memory hierarchies, caching mechanisms, and the role of parallelism in improving computational performance.
What's included
15 videos, 2 readings, 15 assignments, 1 discussion prompt
15 videos • Total 160 minutes
- The Von Neumann Architecture • 7 minutes
- Processes, Multitasking, and Threads • 5 minutes
- The Basics of Caching • 7 minutes
- Virtual Memory • 7 minutes
- Instruction-Level Parallelism • 9 minutes
- Hardware Multithreading • 6 minutes
- Classifications of Parallel Computers • 6 minutes
- SIMD and MIMD Systems • 7 minutes
- Interconnection Networks: Shared Memory Systems • 6 minutes
- Interconnection Networks: Distributed Memory Systems • 8 minutes
- Cache Coherence • 8 minutes
- Shared-Memory vs. Distributed-Memory • 4 minutes
- Parallel Software: Coordinating Process and Threads • 11 minutes
- Distributed Memory Software • 7 minutes
- Recording of Multicore and GPGPU Programming: Week 2 - Live Session on 25-05-30 18:35:08 [02:05] • 62 minutes
2 readings • Total 100 minutes
- Recommended Reading: Architecture Background • 40 minutes
- Recommended Reading: Parallel Hardware and Software • 60 minutes
15 assignments • Total 114 minutes
- The Von Neumann Architecture • 3 minutes
- Processes, Multitasking, and Threads • 3 minutes
- The Basics of Caching • 3 minutes
- Virtual Memory • 3 minutes
- Instruction-Level Parallelism • 3 minutes
- Hardware Multithreading • 3 minutes
- Classifications of Parallel Computer • 3 minutes
- SIMD and MIMD Systems • 3 minutes
- Interconnection Networks: Shared Memory Systems • 3 minutes
- Interconnection Networks: Distributed Memory Systems • 6 minutes
- Cache Coherence • 3 minutes
- Shared-Memory vs. Distributed-Memory • 3 minutes
- Parallel Software: Coordinating Process and Threads • 12 minutes
- Distributed Memory Software • 3 minutes
- Graded Quiz - Modules 1 and 2 • 60 minutes
1 discussion prompt • Total 30 minutes
- From Von Neumann to Multicore: Evolving Architectures and Memory Realities • 30 minutes
This module introduces students to the architectural principles of General-Purpose GPU (GPGPU) systems and the CUDA programming model. It explores the hardware components, including Streaming Multiprocessors (SMs), CUDA cores, and memory hierarchy, which form the foundation of GPU computing. The module also provides an overview of the CUDA programming model, emphasising its thread hierarchy, grid, and block organisation. By understanding these fundamental concepts, students will develop the ability to harness GPU architecture for high-performance parallel computing.
What's included
15 videos, 2 readings, 14 assignments, 1 discussion prompt
15 videos • Total 127 minutes
- GPUs and GPGPU • 5 minutes
- GPU Architecture • 5 minutes
- Heterogeneous Computing • 4 minutes
- Paradigm of Heterogeneous Computing • 5 minutes
- Introduction to CUDA • 5 minutes
- Structure of a CUDA Program • 8 minutes
- Threads, Blocks, and Grid • 9 minutes
- Managing Memory • 7 minutes
- Writing and Verifying Your Kernel • 6 minutes
- Compiling and Running CUDA Program • 4 minutes
- Nvidia Compute Capabilities and Device Architecture • 6 minutes
- Timing Your Kernel • 7 minutes
- Organising Parallel Threads • 5 minutes
- Managing Devices • 4 minutes
- Recording of Multicore and GPGPU Programming: Week 3 - Live Session on 25-06-06 18:31:21 [44:50] • 45 minutes
2 readings • Total 75 minutes
- Recommended Reading: GPGPU Architecture and CUDA • 15 minutes
- Recommended Reading: Programming Model Overview • 60 minutes
14 assignments • Total 48 minutes
- GPUs and GPGPU • 6 minutes
- GPU Architecture • 3 minutes
- Heterogeneous Computing • 3 minutes
- Paradigm of Heterogeneous Computing • 3 minutes
- Introduction to CUDA • 3 minutes
- Structure of a CUDA Program • 3 minutes
- Threads, Blocks, and Grid • 6 minutes
- Managing Memory • 3 minutes
- Writing and Verifying Your Kernel • 3 minutes
- Compiling and Running CUDA Program • 3 minutes
- Nvidia Compute Capabilities and Device Architecture • 3 minutes
- Timing Your Kernel • 3 minutes
- Organising Parallel Threads • 3 minutes
- Managing Devices • 3 minutes
1 discussion prompt • Total 30 minutes
- Harnessing GPU Power: Exploring CUDA and the Architecture of Parallelism • 30 minutes
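The thread, block, and grid organisation described in this module comes down to one indexing rule: in a 1D launch each thread derives a global index from its block index, the block size, and its index within the block. The following Python emulation is a rough sketch of that rule for vector addition. It runs sequentially on the CPU, and `launch_1d` and `vec_add_kernel` are hypothetical names, not CUDA API calls.

```python
def launch_1d(kernel, grid_dim, block_dim, *args):
    """Emulate a 1D kernel launch: call the kernel once per (block, thread) pair."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def vec_add_kernel(block_idx, block_dim, thread_idx, a, b, c, n):
    # Same arithmetic as blockIdx.x * blockDim.x + threadIdx.x in a CUDA kernel.
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain threads that fall past the end of the data.
    if i < n:
        c[i] = a[i] + b[i]


n = 1000
block_dim = 256
# Ceiling division so that grid_dim * block_dim covers all n elements.
grid_dim = (n + block_dim - 1) // block_dim

a = list(range(n))
b = [2 * x for x in range(n)]
c = [0] * n
launch_1d(vec_add_kernel, grid_dim, block_dim, a, b, c, n)
```

Here 1000 elements with 256 threads per block need 4 blocks, so 24 emulated threads in the last block do nothing because of the bounds check.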
This module provides a comprehensive understanding of how CUDA executes programs on GPUs. It covers key concepts such as warps, warp scheduling, and resource partitioning, which are critical for understanding GPU hardware behaviour. The module delves into branch divergence and its impact on performance, offering strategies to minimise its effects. It also emphasises exposing parallelism effectively by leveraging CUDA’s hierarchical execution model. Students will learn how to design and optimise GPU programs by aligning with the underlying execution model to maximise efficiency and throughput.
What's included
15 videos, 2 readings, 15 assignments, 1 discussion prompt
15 videos • Total 135 minutes
- Introduction to CUDA Execution Model • 7 minutes
- Warps and Thread Blocks • 4 minutes
- Warp Divergence • 9 minutes
- Resource Partitioning • 6 minutes
- Latency Hiding • 10 minutes
- Occupancy • 5 minutes
- Synchronization • 4 minutes
- Scalability • 5 minutes
- Exposing Parallelism • 10 minutes
- Checking Active Warps with Nvprof • 6 minutes
- Checking Memory Operations with Nvprof • 7 minutes
- Avoiding Branch Divergence • 3 minutes
- The Parallel Reduction Problem and Thread Divergence • 7 minutes
- Improving Divergence in Parallel Reduction • 6 minutes
- Recording of Multicore and GPGPU Programming: Week 4 - Live Session on 25-06-13 18:32:39 [49:37] • 45 minutes
2 readings • Total 120 minutes
- Recommended Reading: Structure of a CUDA Program • 60 minutes
- Recommended Reading: Exposing Parallelism and Avoiding Branch Divergence • 60 minutes
15 assignments • Total 105 minutes
- Introduction to CUDA Execution Model • 3 minutes
- Warps and Thread Blocks • 3 minutes
- Warp Divergence • 3 minutes
- Resource Partitioning • 6 minutes
- Latency Hiding • 3 minutes
- Occupancy • 3 minutes
- Synchronization • 3 minutes
- Scalability • 3 minutes
- Exposing Parallelism • 3 minutes
- Checking Active Warps with Nvprof • 3 minutes
- Checking Memory Operations with Nvprof • 3 minutes
- Avoiding Branch Divergence • 3 minutes
- The Parallel Reduction Problem and Thread Divergence • 3 minutes
- Improving Divergence in Parallel Reduction • 3 minutes
- Graded Quiz - Modules 3 and 4 • 60 minutes
1 discussion prompt • Total 30 minutes
- Under the Hood: Warps, Divergence, and CUDA Execution Dynamics • 30 minutes
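To make the link between parallel reduction and warp divergence concrete, the sketch below models two ways of choosing which threads are active in each reduction step and counts the warps that contain both active and inactive threads. It is a simplified CPU model in Python, not the course's code. It assumes one thread per element, a power-of-two input length, and a warp size of 32, and all function names are hypothetical.

```python
WARP_SIZE = 32  # assumption: 32 threads per warp


def divergent_warps(active, warp_size=WARP_SIZE):
    """Count warps that contain both active and inactive threads."""
    count = 0
    for start in range(0, len(active), warp_size):
        warp = active[start:start + warp_size]
        if any(warp) and not all(warp):
            count += 1
    return count


def reduce_neighbored(data):
    """Neighbored pairs: thread tid is active when tid % (2 * stride) == 0."""
    vals = list(data)
    n = len(vals)
    total_divergent = 0
    stride = 1
    while stride < n:
        active = [tid % (2 * stride) == 0 for tid in range(n)]
        for tid in range(n):
            if active[tid] and tid + stride < n:
                vals[tid] += vals[tid + stride]
        total_divergent += divergent_warps(active)
        stride *= 2
    return vals[0], total_divergent


def reduce_interleaved(data):
    """Interleaved pairs: thread tid is active when tid < stride."""
    vals = list(data)
    n = len(vals)
    total_divergent = 0
    stride = n // 2
    while stride > 0:
        active = [tid < stride for tid in range(n)]
        for tid in range(stride):
            vals[tid] += vals[tid + stride]
        total_divergent += divergent_warps(active)
        stride //= 2
    return vals[0], total_divergent


if __name__ == "__main__":
    print(reduce_neighbored(range(128)))
    print(reduce_interleaved(range(128)))
```

For 128 elements both variants return the same sum, but under this model the neighbored scheme accumulates 23 divergent warp-steps against 5 for the interleaved scheme, because active threads stay packed into whole warps until fewer than 32 remain.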
The CUDA Memory Model & Streams and Concurrency module introduces students to the intricacies of memory hierarchy in CUDA, including global, shared, and local memory. It emphasises the importance of memory coalescing and efficient memory access patterns to optimise performance on GPUs. The module also covers CUDA streams, explaining how concurrent kernel execution and memory operations can be managed to enhance parallelism. By understanding these concepts, students will gain the ability to design GPU programs that maximise throughput and minimise latency.
What's included
14 videos, 2 readings, 14 assignments, 1 discussion prompt, 1 ungraded lab
14 videos • Total 126 minutes
- Introduction to CUDA Memory Model • 8 minutes
- Memory Allocation and Deallocation • 6 minutes
- Zero Copy Memory • 4 minutes
- Unified Virtual Addressing and Unified Memory • 3 minutes
- Aligned and Coalesced Access • 6 minutes
- CUDA Shared Memory • 6 minutes
- Shared Memory Banks and Access Mode • 7 minutes
- Configuring the Amount of Shared Memory • 5 minutes
- Synchronisation • 9 minutes
- CUDA Streams • 7 minutes
- Stream Scheduling and Priorities • 6 minutes
- CUDA Events • 6 minutes
- Concurrent Kernel Execution • 6 minutes
- Recording of Multicore and GPGPU Programming: Week 5 - Live Session on 25-06-20 18:31:59 [47:36] • 48 minutes
2 readings • Total 120 minutes
- Recommended Reading: CUDA Memory Model • 60 minutes
- Recommended Reading: Streams and Concurrency • 60 minutes
14 assignments • Total 342 minutes
- Introduction to CUDA Memory Model • 3 minutes
- Memory Allocation and Deallocation • 3 minutes
- Zero Copy Memory • 3 minutes
- Unified Virtual Addressing and Unified Memory • 3 minutes
- Aligned and Coalesced Access • 3 minutes
- CUDA Shared Memory • 6 minutes
- Shared Memory Banks and Access Mode • 3 minutes
- Configuring the Amount of Shared Memory • 3 minutes
- Synchronisation • 3 minutes
- CUDA Streams • 3 minutes
- Stream Scheduling and Priorities • 3 minutes
- CUDA Events • 3 minutes
- Concurrent Kernel Execution • 3 minutes
- SGA-1: CUDA Programming and Performance Optimisation • 300 minutes
1 discussion prompt • Total 30 minutes
- Smart Memory and Seamless Concurrency: CUDA Memory and Streams • 30 minutes
1 ungraded lab • Total 60 minutes
- Hands-on lab: Parallel Matrix Addition Using CUDA • 60 minutes
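The shared memory bank topic in this module can be illustrated with a small model of how a strided access pattern maps one warp's requests onto banks. The Python sketch below assumes 32 banks of 4-byte words and a warp of 32 threads, a common configuration that should be checked against the target device. The function name is hypothetical and the model ignores everything except the bank mapping.

```python
from collections import Counter

NUM_BANKS = 32  # assumption: 32 banks, each one 4-byte word wide
WARP_SIZE = 32  # assumption: 32 threads per warp


def bank_conflict_degree(stride_words, warp_size=WARP_SIZE, num_banks=NUM_BANKS):
    """Largest number of distinct words that one warp requests from a single bank.

    Thread tid is assumed to read the word at index tid * stride_words.
    A result of 1 means conflict-free; k means a k-way bank conflict.
    """
    words = {tid * stride_words for tid in range(warp_size)}
    # Threads reading the same word are served by a broadcast, so only
    # distinct words are counted per bank.
    per_bank = Counter(word % num_banks for word in words)
    return max(per_bank.values())


if __name__ == "__main__":
    for stride in (1, 2, 3, 16, 32):
        print(stride, bank_conflict_degree(stride))
```

Under these assumptions unit and odd strides are conflict-free, a stride of 2 produces a 2-way conflict, and a stride of 32 sends every thread to the same bank, which is why padding a shared array by one word is a frequently used workaround.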
This module explains in depth the difference between processes and threads and introduces multithreaded programming using the Pthreads library. Students are expected to learn the various functions in the Pthreads library and apply them to solve real-world problems with a multithreaded approach. The module also discusses precautions to take when developing an algorithm that uses multithreading.
What's included
10 videos, 11 readings, 10 assignments, 1 discussion prompt
10 videos • Total 116 minutes
- Processes, Threads and Pthreads • 4 minutes
- Hello World!! • 9 minutes
- Matrix-Vector Multiplication • 13 minutes
- Critical Sections • 5 minutes
- Busy Waiting • 6 minutes
- Mutexes • 5 minutes
- Semaphores • 7 minutes
- Barriers and Condition Variables • 13 minutes
- Caches, Cache-Coherence and False Sharing • 9 minutes
- Recording of Multicore and GPGPU Programming: Week 6 - Live Session on 25-06-27 18:38:36 [43:53] • 44 minutes
11 readings • Total 295 minutes
- Recommended Reading: Processes, Threads and Pthreads • 10 minutes
- Recommended Reading: Hello World!! • 60 minutes
- Recommended Reading: Matrix-Vector Multiplication • 15 minutes
- Recommended Reading: Critical Sections • 30 minutes
- Recommended Reading: Busy Waiting • 20 minutes
- Recommended Reading: Mutexes • 15 minutes
- Recommended Reading: Semaphores • 30 minutes
- Recommended Reading: Barriers and Condition Variables • 30 minutes
- Recommended Reading: Read-Write Locks • 60 minutes
- Recommended Reading: Caches, Cache-Coherence and False Sharing • 15 minutes
- Lab Instruction Document • 10 minutes
10 assignments • Total 135 minutes
- Processes, Threads and Pthreads • 9 minutes
- Hello World!! • 9 minutes
- Matrix-Vector Multiplication • 9 minutes
- Critical Sections • 9 minutes
- Busy Waiting • 9 minutes
- Mutexes • 9 minutes
- Semaphores • 6 minutes
- Barriers and Condition Variables • 6 minutes
- Caches, Cache-Coherence and False Sharing • 9 minutes
- Graded Quiz - Modules 5 and 6 • 60 minutes
1 discussion prompt • Total 10 minutes
- Thread Synchronization and Shared Memory: Building Reliable Parallel Programs with Pthreads • 10 minutes
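The critical section and mutex topics above follow a pattern that is easy to show in miniature: several threads update a shared counter, and a lock ensures that only one of them is inside the update at a time. The sketch uses Python's `threading` module as a stand-in for Pthreads, which is an assumption made for brevity. The rough correspondence is `Thread.start` to `pthread_create`, `Thread.join` to `pthread_join`, and `Lock` to a `pthread_mutex_t`. The function name `run_counter` is hypothetical.

```python
import threading


def run_counter(num_threads, increments):
    """Increment a shared counter from several threads under a mutex."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Critical section: the read-modify-write of the shared counter
            # is executed by at most one thread at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter


if __name__ == "__main__":
    print(run_counter(4, 10000))
```

Holding the lock only around the single update keeps the critical section short, which matters because every thread waiting on the mutex is effectively serialised.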
This module introduces students to distributed-memory programming using the Message Passing Interface (MPI). Students will learn about the functions provided by the MPI library and what they do. This will enable them to develop parallel programs and to convert serial code into parallel code with the help of MPI functions.
What's included
7 videos, 9 readings, 7 assignments, 1 discussion prompt
7 videos • Total 70 minutes
- Introduction to MPI • 4 minutes
- MPI Setup and Communicator Functions • 6 minutes
- SPMD and Communication • 10 minutes
- Potential Pitfalls • 4 minutes
- Simple Serial Sorting Algorithm • 20 minutes
- Parallel Odd-Even Transposition Sort • 19 minutes
- Safety in MPI Programs • 7 minutes
9 readings • Total 125 minutes
- Recommended Reading: Introduction to MPI • 15 minutes
- Recommended Reading: MPI Setup and Communicator Functions • 15 minutes
- Recommended Reading: SPMD and Communication • 15 minutes
- Recommended Reading: Potential Pitfalls • 15 minutes
- Recommended Reading: Simple Serial Sorting Algorithm • 15 minutes
- Recommended Reading: Parallel Odd-Even Transposition Sort • 15 minutes
- Recommended Reading: Safety in MPI Programs • 15 minutes
- Lab: Practice Code • 10 minutes
- Lab: Practice Solution • 10 minutes
7 assignments • Total 63 minutes
- Introduction to MPI • 9 minutes
- MPI Setup and Communicator Functions • 9 minutes
- SPMD and Communication • 9 minutes
- Potential Pitfalls • 9 minutes
- Simple Serial Sorting Algorithm • 9 minutes
- Parallel Odd-Even Transposition Sort • 9 minutes
- Safety in MPI Programs • 9 minutes
1 discussion prompt • Total 30 minutes
- MPI in Action: Understanding Setup, Communication, and Parallel Sorting • 30 minutes
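Odd-even transposition sort, which this module covers in both serial and parallel form, alternates between two kinds of phases: even phases compare-swap the pairs (0,1), (2,3), and so on, and odd phases compare-swap (1,2), (3,4), and so on. The Python sketch below shows the serial algorithm together with one possible rule for picking a process's exchange partner in a message-passing version. It contains no MPI calls, and `partner` is a hypothetical helper rather than part of any library.

```python
def odd_even_transposition_sort(values):
    """Serial odd-even transposition sort; n phases suffice for n elements."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Even phases start at index 0, odd phases at index 1.
        start = 0 if phase % 2 == 0 else 1
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


def partner(rank, phase, comm_size):
    """Rank to exchange with in a given phase, or None if the process is idle."""
    if phase % 2 == 0:
        p = rank + 1 if rank % 2 == 0 else rank - 1
    else:
        p = rank - 1 if rank % 2 == 0 else rank + 1
    return p if 0 <= p < comm_size else None


if __name__ == "__main__":
    print(odd_even_transposition_sort([9, 7, 8, 6]))
    print([partner(r, 1, 4) for r in range(4)])
```

Within a phase the compare-swaps touch disjoint pairs, so they are independent of one another. That independence is what makes the algorithm a natural fit for one pair of communicating processes per comparison.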
This module introduces the shared-memory programming model with the help of the OpenMP library. Students will gain exposure to the functions in the OpenMP library and learn how to use them in code to achieve shared-memory parallelism. Students will explore the foundational concepts of OpenMP through videos and readings, starting with the basics of the library and progressing to more advanced topics such as reduction clauses, variable scoping, and mutual exclusion. Through worked examples like the Trapezoidal Rule and sorting functions, learners will understand how to parallelise loops, manage scheduling, and apply critical sections and locks for safe concurrent execution. The module also covers tasking in OpenMP and classic concurrency problems like producers and consumers.
What's included
12 videos, 12 readings, 13 assignments, 1 discussion prompt
12 videos • Total 94 minutes
- Introduction to OpenMP • 5 minutes
- Programming in OpenMP • 10 minutes
- Trapezoidal Rule • 10 minutes
- Scope of Variables • 4 minutes
- Reduction Clause • 7 minutes
- Parallel-For Directive and Caveats in Them • 8 minutes
- Sorting Functions • 20 minutes
- Scheduling • 6 minutes
- Producers and Consumers • 6 minutes
- Termination, Startup and Atomic Directive • 7 minutes
- Critical Sections and Locks • 6 minutes
- Tasking • 5 minutes
12 readings • Total 152 minutes
- Recommended Reading: Introduction to OpenMP • 15 minutes
- Recommended Reading: Programming in OpenMP • 15 minutes
- Recommended Reading: Trapezoidal Rule • 15 minutes
- Recommended Reading: Scope of Variables • 15 minutes
- Recommended Reading: Reduction Clause • 15 minutes
- Recommended Reading: Parallel-For Directive and Caveats in Them • 15 minutes
- Recommended Reading: Sorting Functions • 15 minutes
- Recommended Reading: Scheduling • 15 minutes
- Recommended Reading: Producers and Consumers • 15 minutes
- Recommended Reading: Termination, Startup and Atomic Directive • 1 minute
- Recommended Reading: Critical Sections and Locks • 1 minute
- Recommended Reading: Tasking • 15 minutes
13 assignments • Total 168 minutes
- Introduction to OpenMP • 9 minutes
- Programming in OpenMP • 9 minutes
- Trapezoidal Rule • 9 minutes
- Scope of Variables • 9 minutes
- Reduction Clause • 9 minutes
- Parallel-For Directive and Caveats in Them • 9 minutes
- Sorting Functions • 9 minutes
- Scheduling • 9 minutes
- Producers and Consumers • 9 minutes
- Termination, Startup and Atomic Directive • 9 minutes
- Critical Sections and Locks • 9 minutes
- Tasking • 9 minutes
- Graded Quiz - Modules 7 and 8 • 60 minutes
1 discussion prompt • Total 30 minutes
- Mastering OpenMP: From Parallel Patterns to Synchronisation • 30 minutes
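The Trapezoidal Rule example and the reduction clause mentioned in this module share one idea: each thread integrates its own sub-interval into a private partial result, and the partial results are combined once at the end instead of updating a shared total inside the loop. The Python sketch below mimics that structure with a thread pool. It is an illustration of the work division only, since Python threads do not run this arithmetic in parallel, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def local_trap(f, a, b, n):
    """Trapezoidal rule for f on [a, b] with n equal-width trapezoids."""
    h = (b - a) / n
    total = (f(a) + f(b)) / 2.0
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def parallel_trap(f, a, b, n, num_threads):
    """Split [a, b] into one block of trapezoids per thread, then reduce."""
    if n % num_threads != 0:
        raise ValueError("n must be divisible by num_threads in this sketch")
    h = (b - a) / n
    local_n = n // num_threads

    def work(rank):
        # Each thread owns a contiguous sub-interval, as in a block partition.
        local_a = a + rank * local_n * h
        local_b = local_a + local_n * h
        return local_trap(f, local_a, local_b, local_n)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(work, range(num_threads)))
    # Combining the private partial sums plays the role of a sum reduction.
    return sum(partials)


if __name__ == "__main__":
    print(parallel_trap(lambda x: x * x, 0.0, 3.0, 1024, 4))
```

Keeping the accumulation private until the final sum avoids a critical section on every loop iteration. A reduction clause provides the same benefit without requiring the programmer to manage the partial results by hand.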
This module will introduce the n-body problem in physics, examining its significance in simulating gravitational interactions among multiple particles. It will explore classical and modern algorithmic approaches to solving the n-body problem, followed by a discussion on their computational complexity. Emphasis will be placed on identifying opportunities for parallelisation, and students will analyse and implement efficient parallel solutions using the programming languages and parallel computing directives covered in the course.
What's included
13 videos, 13 readings, 13 assignments, 1 discussion prompt
13 videos • Total 107 minutes
- Introduction to N-body Problem • 8 minutes
- Serial Solutions to the N-body Problem • 16 minutes
- Parallelising Strategy • 13 minutes
- Parallelising Basic Solver Using OpenMP • 9 minutes
- Parallelising Reduced Solver Using OpenMP • 11 minutes
- Evaluating OpenMP Performance • 5 minutes
- Parallelising Basic Solver Using Pthreads • 4 minutes
- Parallelising Basic Solver Using MPI • 9 minutes
- Parallelising Reduced Solver Using MPI • 9 minutes
- Evaluating MPI Performance • 6 minutes
- Parallelising Basic Solver Using CUDA • 7 minutes
- Evaluating CUDA Solver and Improving Performance • 4 minutes
- Using Shared Memory for Solvers • 7 minutes
13 readings • Total 195 minutes
- Recommended Reading: Introduction to N-body Problem • 15 minutes
- Recommended Reading: Serial Solutions to the N-body Problem • 15 minutes
- Recommended Reading: Parallelising Strategy • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using OpenMP • 15 minutes
- Recommended Reading: Parallelising Reduced Solver Using OpenMP • 15 minutes
- Recommended Reading: Evaluating OpenMP performance • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using Pthreads • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using MPI • 15 minutes
- Recommended Reading: Parallelising Reduced Solver Using MPI • 15 minutes
- Recommended Reading: Evaluating MPI Performance • 15 minutes
- Recommended Reading: Parallelising Basic Solver Using CUDA • 15 minutes
- Recommended Reading: Evaluating CUDA Solver and Improving Performance • 15 minutes
- Recommended Reading: Using Shared Memory for Solvers • 15 minutes
13 assignments • Total 138 minutes
- Introduction to N-body Problem • 9 minutes
- Serial Solutions to the N-body Problem • 9 minutes
- Parallelising Strategy • 9 minutes
- Parallelising Basic Solver Using OpenMP • 9 minutes
- Parallelising Reduced Solver Using OpenMP • 9 minutes
- Evaluating OpenMP Performance • 9 minutes
- Parallelising Basic Solver Using Pthreads • 9 minutes
- Parallelising Basic Solver Using MPI • 30 minutes
- Parallelising Reduced Solver Using MPI • 9 minutes
- Evaluating MPI Performance • 9 minutes
- Parallelising Basic Solver Using CUDA • 9 minutes
- Evaluating CUDA Solver and Improving Performance • 9 minutes
- Using Shared Memory for Solvers • 9 minutes
1 discussion prompt • Total 30 minutes
- The N-Body Solver: Exploring Parallelism Across Models • 30 minutes
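The "basic" and "reduced" solvers named in this module's lessons are commonly distinguished by how they compute pairwise forces, and the sketch below follows that common reading as an assumption about the course content. The basic version computes the force on each particle from every other particle. The reduced version visits each pair once and applies Newton's third law, which halves the force evaluations but makes two particles' totals depend on one loop iteration. The code is serial Python in two dimensions with hypothetical function names.

```python
import math

G = 6.674e-11  # gravitational constant in SI units


def force_on(q, k, pos, mass):
    """Gravitational force exerted on particle q by particle k (2D)."""
    dx = pos[q][0] - pos[k][0]
    dy = pos[q][1] - pos[k][1]
    dist = math.hypot(dx, dy)
    # Negative sign: the force on q points from q towards k (attraction).
    scale = -G * mass[q] * mass[k] / dist ** 3
    return scale * dx, scale * dy


def basic_forces(pos, mass):
    """Basic solver: n * (n - 1) force evaluations, each row independent."""
    n = len(pos)
    forces = [[0.0, 0.0] for _ in range(n)]
    for q in range(n):
        for k in range(n):
            if k != q:
                fx, fy = force_on(q, k, pos, mass)
                forces[q][0] += fx
                forces[q][1] += fy
    return forces


def reduced_forces(pos, mass):
    """Reduced solver: each pair once, force on k is minus the force on q."""
    n = len(pos)
    forces = [[0.0, 0.0] for _ in range(n)]
    for q in range(n):
        for k in range(q + 1, n):
            fx, fy = force_on(q, k, pos, mass)
            forces[q][0] += fx
            forces[q][1] += fy
            forces[k][0] -= fx
            forces[k][1] -= fy
    return forces


if __name__ == "__main__":
    positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
    masses = [1.0, 2.0, 3.0]
    print(basic_forces(positions, masses))
    print(reduced_forces(positions, masses))
```

The trade-off is visible in the loop bodies: the basic solver writes only to `forces[q]`, so its outer loop can be split across threads without synchronisation, whereas the reduced solver also writes to `forces[k]` and therefore needs per-thread accumulation or locking when parallelised.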
This module focuses on hands-on implementations of the Sample Sort algorithm using OpenMP, Pthreads, MPI, and CUDA. Students will explore the strengths and limitations of each parallel programming model through practical coding exercises. The module includes performance benchmarking and comparative analysis of the implementations to highlight trade-offs in scalability, efficiency, and suitability for different architectures. By the end of the module, students will have a strong grasp of each API and be equipped to make informed decisions about the most appropriate tool for a given parallel computing task.
What's included
8 videos, 9 readings, 10 assignments, 1 discussion prompt
8 videos • Total 61 minutes
- Sample Sort and Bucket Sort • 10 minutes
- Map • 17 minutes
- Implementing Sample Sort Using OpenMP: First Implementation • 5 minutes
- Implementing Sample Sort Using OpenMP: Second Implementation • 7 minutes
- Implementing Sample Sort Using Pthreads • 4 minutes
- Implementing Sample Sort Using MPI • 6 minutes
- Implementing Sample Sort Using MPI: Example • 5 minutes
- Implementing Sample Sort Using CUDA • 7 minutes
9 readings • Total 115 minutes
- Recommended Reading: Sample Sort and Bucket Sort • 15 minutes
- Recommended Reading: Map • 10 minutes
- Recommended Reading: Implementing Sample Sort Using OpenMP: First Implementation • 15 minutes
- Recommended Reading: Implementing Sample Sort Using OpenMP: Second Implementation • 15 minutes
- Recommended Reading: Implementing Sample Sort Using Pthreads • 10 minutes
- Recommended Reading: Implementing Sample Sort Using MPI • 15 minutes
- Recommended Reading: Implementing Sample Sort Using MPI: Example • 15 minutes
- Recommended Reading: Implementing Sample Sort Using CUDA • 10 minutes
- Recommended Reading: Which API? • 10 minutes
10 assignments • Total 432 minutes
- Sample Sort and Bucket Sort • 9 minutes
- Map (Quiz) • 9 minutes
- Implementing Sample Sort Using OpenMP: First Implementation • 9 minutes
- Implementing Sample Sort Using OpenMP: Second Implementation • 9 minutes
- Implementing Sample Sort Using Pthreads • 9 minutes
- Implementing Sample Sort Using MPI • 9 minutes
- Implementing Sample Sort Using MPI: Example • 9 minutes
- Implementing Sample Sort Using CUDA • 9 minutes
- Graded Quiz - Modules 9 and 10 • 60 minutes
- SGA-2: Odd-Even Transposition Sort Parallelisation • 300 minutes
1 discussion prompt • Total 30 minutes
- Parallel Sample Sort Across Platforms • 30 minutes
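Sample sort, the algorithm implemented across APIs in this module, can be outlined in four steps: draw a sample of the input, pick splitters from the sorted sample, scatter every element into the bucket delimited by those splitters, and sort each bucket independently. The serial Python sketch below follows those steps. The oversampling factor, the fixed seed, and the function name are assumptions chosen for illustration, not details taken from the course.

```python
import random
from bisect import bisect_right


def sample_sort(values, num_buckets, oversample=4, seed=0):
    """Serial sketch of sample sort with num_buckets buckets."""
    data = list(values)
    sample_size = num_buckets * oversample
    if num_buckets <= 1 or len(data) <= sample_size:
        return sorted(data)  # too little data to be worth splitting

    # Step 1: draw a random sample and sort it.
    rng = random.Random(seed)
    sample = sorted(rng.sample(data, sample_size))

    # Step 2: choose num_buckets - 1 evenly spaced splitters from the sample.
    splitters = [sample[i * oversample] for i in range(1, num_buckets)]

    # Step 3: bucket i receives x with splitters[i-1] <= x < splitters[i].
    buckets = [[] for _ in range(num_buckets)]
    for x in data:
        buckets[bisect_right(splitters, x)].append(x)

    # Step 4: sort each bucket; this is the part a parallel version
    # would hand to separate threads, processes, or thread blocks.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.randint(0, 999) for _ in range(200)]
    print(sample_sort(data, 4) == sorted(data))
```

Because the buckets cover disjoint, ordered key ranges, concatenating the sorted buckets yields a fully sorted result with no merge step. Choosing splitters from an oversampled set is one way to keep bucket sizes, and therefore per-worker load, roughly balanced.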
Final Comprehensive Examination
What's included
1 assignment
1 assignment • Total 30 minutes
- Final Comprehensive Examination • 30 minutes
Offered by

Birla Institute of Technology & Science, Pilani (BITS Pilani) is one of only ten private universities in India to be recognised as an Institute of Eminence by the Ministry of Human Resource Development, Government of India. It has been consistently ranked high by both governmental and private ranking agencies for its innovative processes and capabilities that have enabled it to impart quality education and emerge as the best private science and engineering institute in India. BITS Pilani has four campuses, in Pilani, Goa, Hyderabad, and Dubai, and has been offering bachelor's, master’s, and certificate programmes for over 58 years, helping to launch the careers of over 1,00,000 professionals.
Frequently asked questions
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

