Do you need any prerequisites before learning Spark performance tuning?

A basic understanding of Python and Spark DataFrames is helpful, and familiarity with JSON and SQL will make the material easier to follow. This is an intermediate course that assumes you can already work with Spark at a basic level and want to get better at diagnosing and tuning job execution.

What tools, platforms, or methods are used in this course?

The course centers on Apache Spark, especially the Spark UI for analyzing job behavior. The main methods are metrics-driven diagnosis and targeted tuning of data distribution and resource configuration.

Optimize Spark Performance & Throughput

This course is part of multiple programs.

Instructor: Merna Elzahaby

3 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

4 hours to complete

Flexible schedule

Learn at your own pace

What you'll learn

Inspect Spark UI and metrics (task duration, shuffle I/O, executor CPU/mem) to find bottlenecks and recommend actionable optimizations.
Apply partitioning and skew mitigation (salting/custom partitioner) & reduce shuffle (broadcast joins, avoid groupByKey, AQE) to improve parallelism.
Configure executors, cores, memory, dynamic allocation and parallelism/caching settings to maximize throughput while meeting defined SLA targets.
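One of the points above, avoiding groupByKey, comes down to how many records cross the shuffle. The following is a plain-Python sketch rather than Spark code: it counts the records that would be shuffled when every pair is sent as-is (groupByKey-style) versus when each partition pre-aggregates by key first (reduceByKey-style). The function names and the sample partitions are made up for illustration.

```python
from collections import Counter


def shuffled_without_combine(partitions):
    # groupByKey-style: every (key, value) pair crosses the shuffle.
    return sum(len(part) for part in partitions)


def shuffled_with_combine(partitions):
    # reduceByKey-style: each partition combines locally first,
    # so it sends at most one record per distinct key.
    return sum(len({key for key, _ in part}) for part in partitions)


def totals_by_key(partitions):
    # Final result is the same either way: local sums merged into global sums.
    totals = Counter()
    for part in partitions:
        local = Counter()
        for key, value in part:
            local[key] += value
        totals.update(local)
    return dict(totals)


# Hypothetical input: two partitions of (key, 1) pairs.
partitions = [
    [("a", 1), ("b", 1), ("a", 1), ("a", 1)],
    [("b", 1), ("b", 1), ("c", 1)],
]

print("records shuffled without combine:", shuffled_without_combine(partitions))
print("records shuffled with combine:   ", shuffled_with_combine(partitions))
print("totals:", totals_by_key(partitions))
```

The gap between the two counts grows with the number of repeated keys per partition, which is why the difference matters most on large, repetitive datasets.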

Details to know

Shareable certificate

Add to your LinkedIn profile

There are 3 modules in this course

In large-scale data engineering environments, performance issues such as slow transformations, excessive shuffle operations, and unbalanced workloads can impact analytics, reporting, and SLA commitments. This course teaches you how to analyze, diagnose, and optimize Apache Spark applications so they run faster, more efficiently, and more reliably. In this course, you’ll start by learning the fundamentals of Spark job execution, including how stages, tasks, shuffle operations, and execution plans reveal where bottlenecks occur. You’ll explore Spark’s built-in monitoring tools to interpret job behavior. From there, you’ll apply practical optimization techniques, including improving data partitioning, mitigating data skew, optimizing joins, configuring caching strategies, and choosing efficient file formats. You’ll also learn how to tune executors, memory, cores, and dynamic allocation to balance cost and performance across workloads.

Learners should have basic knowledge of Python and Spark DataFrames, along with some familiarity with JSON and SQL. This course is designed for data engineers and developers who need to diagnose and optimize Spark jobs running on large-scale distributed data pipelines. By the end, you’ll have the skills to confidently apply advanced tuning strategies, improve throughput, reduce shuffle overhead, and optimize resource usage.

This module introduces learners to Spark’s job execution model and key performance metrics. Learners will explore the Spark UI, interpret job stages, tasks, and shuffle metrics, and diagnose performance bottlenecks using real job logs.
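As a rough illustration of the kind of diagnosis this module describes, the sketch below compares the slowest task in a stage with the median task. It is not taken from the course materials: the stage names, the durations, and the threshold of 3.0 are all assumptions chosen for the example, standing in for values you might read off a stage's task table in the Spark UI.

```python
from statistics import median


def skew_ratio(task_durations_s):
    """Slowest task divided by the median task; values far above 1 hint at skew."""
    if not task_durations_s:
        raise ValueError("need at least one task duration")
    med = median(task_durations_s)
    slowest = max(task_durations_s)
    if med == 0:
        return float("inf") if slowest > 0 else 1.0
    return slowest / med


def flag_skewed_stages(stages, threshold=3.0):
    """stages maps a stage name to its task durations in seconds."""
    return sorted(
        name for name, durations in stages.items()
        if skew_ratio(durations) >= threshold
    )


# Hypothetical durations for two stages.
example_stages = {
    "stage_3_join": [12, 11, 13, 12, 95],
    "stage_4_write": [8, 9, 8, 10, 9],
}

print(flag_skewed_stages(example_stages))
```

A single straggler like the 95-second task holds the whole stage open, so a high ratio is usually a cue to look at shuffle read sizes per task next.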

What's included

4 videos, 2 readings, 1 peer review

4 videos, 29 minutes total

Welcome & What You Will Learn (3 minutes)
Understanding Spark Job Execution (7 minutes)
Key Metrics for Diagnosing Bottlenecks (7 minutes)
Case Demo: Using Spark UI to Spot Issues (11 minutes)

2 readings, 10 minutes total

Welcome to the Course: Course Overview (5 minutes)
Interpreting the Spark UI (5 minutes)

1 peer review, 20 minutes total

Hands-On-Learning: Analyze a Spark Job Using the Spark UI (20 minutes)

This module teaches learners how to solve the most common Spark bottlenecks: data skew, excessive shuffling, inefficient joins, and poor partitioning. Learners apply practical techniques such as salting, repartitioning, broadcast joins, and AQE.
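To make the idea of salting concrete, here is a minimal plain-Python simulation, not Spark code and not the course's own example. It assumes a single hot key (`big_customer`, a made-up name), 8 partitions, and 8 salt buckets, and it uses `zlib.crc32` as a deterministic stand-in for a hash partitioner, which is not the hash Spark actually uses.

```python
import zlib
from collections import Counter


def partition_of(key, num_partitions):
    # Deterministic stand-in for a hash partitioner (Spark uses its own hash).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_sizes(keys, num_partitions):
    counts = Counter(partition_of(key, num_partitions) for key in keys)
    return [counts[p] for p in range(num_partitions)]


def salt_keys(keys, hot_keys, salt_buckets):
    # Append a rotating suffix to hot keys only, so one logical key
    # becomes several physical keys that can land on different partitions.
    seen = Counter()
    salted = []
    for key in keys:
        if key in hot_keys:
            salted.append(f"{key}#{seen[key] % salt_buckets}")
            seen[key] += 1
        else:
            salted.append(key)
    return salted


# Hypothetical dataset: one key accounts for 90% of the records.
keys = ["big_customer"] * 9000 + [f"customer_{i}" for i in range(1000)]

before = partition_sizes(keys, num_partitions=8)
after = partition_sizes(
    salt_keys(keys, {"big_customer"}, salt_buckets=8), num_partitions=8
)

print("largest partition before salting:", max(before))
print("largest partition after salting: ", max(after))
```

In a real join, the other side has to be expanded so that each of its rows matches every salt value, and aggregations need a second pass to merge the salted partial results. On Spark 3.x, AQE's skew join handling (`spark.sql.adaptive.enabled` together with `spark.sql.adaptive.skewJoin.enabled`) can address some of these cases without manual salting.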

What's included

3 videos, 1 reading, 1 peer review

This module focuses on configuring Spark resources (executors, CPU, memory, dynamic allocation, and parallelism) and tuning job parameters to maximize throughput and meet strict performance SLAs.
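The arithmetic behind executor sizing can be sketched in a few lines. The helper below is a hypothetical starting point under common rules of thumb, not a recommendation from the course: it assumes 5 cores per executor, one core and 1 GB reserved per node for the OS and daemons, a 10% memory overhead on top of the heap, and two shuffle partitions per core. The configuration keys it returns are real Spark settings, but the values should be validated against measured job behavior.

```python
def size_executors(nodes, cores_per_node, mem_gb_per_node,
                   cores_per_executor=5, reserved_cores=1,
                   reserved_mem_gb=1, overhead_fraction=0.10,
                   partitions_per_core=2):
    """Return a first-guess Spark configuration as a dict of strings."""
    usable_cores = cores_per_node - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("not enough cores per node for a single executor")

    # Memory available to each executor container (heap plus overhead).
    container_gb = (mem_gb_per_node - reserved_mem_gb) / executors_per_node
    # Leave room for the overhead, then round the heap down to whole GB.
    heap_gb = int(container_gb / (1 + overhead_fraction))
    if heap_gb < 1:
        raise ValueError("not enough memory per node for a single executor")

    total_executors = executors_per_node * nodes
    total_cores = total_executors * cores_per_executor

    return {
        "spark.executor.instances": str(total_executors),
        "spark.executor.cores": str(cores_per_executor),
        "spark.executor.memory": f"{heap_gb}g",
        "spark.sql.shuffle.partitions": str(total_cores * partitions_per_core),
    }


# Hypothetical cluster: 10 worker nodes with 16 cores and 64 GB each.
conf = size_executors(nodes=10, cores_per_node=16, mem_gb_per_node=64)
for key, value in conf.items():
    print(f"{key}={value}")
```

With dynamic allocation (`spark.dynamicAllocation.enabled`), the executor count becomes a range bounded by `spark.dynamicAllocation.minExecutors` and `spark.dynamicAllocation.maxExecutors` rather than a fixed number, so the instance count here would serve as an upper bound at most.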

What's included

4 videos, 1 reading, 1 assignment, 2 peer reviews

4 videos, 31 minutes total

Understanding Executors, Cores & Memory (7 minutes)
Dynamic Allocation & Parallelism Tuning (8 minutes)
Case Demo: Tuning a Job to Meet SLA (12 minutes)
Course Wrap-Up & Next Steps (4 minutes)

1 reading, 5 minutes total

Best Practices for SLA-Focused Optimization (5 minutes)

1 assignment, 25 minutes total

Optimize Spark Performance & Throughput (25 minutes)

2 peer reviews, 80 minutes total

Hands-On-Learning: Tune a Spark Job to Meet a Given SLA (20 minutes)
Project: End-to-End Spark Job Optimization (60 minutes)

Earn a career certificate

Add this certificate to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Merna Elzahaby

Coursera

1 course, 91 learners

Offered by

Coursera

Frequently asked questions

Spark performance tuning in this course means analyzing how Apache Spark jobs actually run and making targeted changes so they execute more efficiently. The focus is on finding bottlenecks from execution behavior and then improving things like data distribution, shuffle handling, joins, caching, and resource settings.

You would use Spark performance tuning when a job is slower than expected, shows heavy shuffle activity, or has uneven task runtimes across the cluster. In this course, it is treated as a repeatable way to diagnose those patterns and choose changes that improve throughput and resource usage.

Spark performance tuning usually comes after a job or pipeline is already functionally correct and you need to understand how it behaves at runtime. It fits into the build-and-improve phase, where you inspect execution, adjust data layout or resources, and validate that the workload runs more efficiently.

General Spark development is about writing logic that produces the right result, while Spark performance tuning is about how that same logic is executed across jobs, stages, tasks, partitions, and executors. This course emphasizes runtime evidence and targeted optimization rather than stopping at code that is only functionally correct.

You’ll practice reading job, stage, task, and executor metrics, spotting bottlenecks such as data skew or expensive shuffle patterns, and deciding which optimizations to try. You’ll also work on balancing partitions, choosing join or caching strategies, tuning executors and parallelism settings, and checking whether those changes improve throughput and support SLA targets.