How is an orchestrated, recoverable pipeline different from running separate jobs manually?

Manual jobs mainly rely on separate reruns and human judgment, while an orchestrated, recoverable pipeline has defined dependencies, retries, and recovery paths. The course emphasizes coordinated execution and controlled recovery rather than ad hoc fixes after something breaks.

Do you need any prerequisites before learning pipeline orchestration and recovery?

A basic understanding of Python, SQL, the Linux command line, and Kafka fundamentals is helpful before starting this course. Because it is intermediate, it assumes you can follow how tasks, state, and data movement behave in a real pipeline.

What tools, platforms, or methods are used in this course?

The course uses modern workflow orchestrators such as Airflow and Prefect, along with recovery methods like checkpointing and dead-letter queues.

What specific tasks will you practice or complete in this course?

You practice building scheduled workflows with dependencies and retries, and using logs or alerts to investigate failures. You also work on recovery tasks such as restarting from checkpoints, handling bad records safely, and running controlled backfills or failover steps.

Orchestrate & Recover Real-Time Data Pipelines

This course is part of the Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization

Instructor: Starweaver

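3 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

4 hours to complete

Flexible schedule

Learn at your own pace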

What you'll learn

Build and schedule streaming and batch-adjacent workflows using a modern orchestrator, such as Airflow or Prefect.
Implement reliability patterns like idempotence, checkpointing, DLQs, and backfills for fault-tolerant and exactly-once-ish processing.
Design multi-region recovery strategies (mirroring/replication) and run playbooks to restore pipelines after partial or regional failures.

Details to know

Shareable certificate

Add to your LinkedIn profile

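Build your subject-matter expertise

When you enroll in this course, you'll also be enrolled in the Real-Time, Real Fast: Kafka & Spark for Data Engineers Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate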

There are 3 modules in this course

Building a data pipeline is easy. Building one that automatically recovers from failures, maintains data integrity during outages, and runs reliably in production—that's what separates junior engineers from platform architects.

This course teaches you to design self-healing pipelines with automated recovery, fault tolerance, and disaster recovery built in from day one. You'll learn to build and schedule streaming workflows using modern orchestrators like Airflow and Prefect, implement reliability patterns including idempotence, checkpointing, and dead-letter queues for exactly-once-ish processing, and design multi-region recovery strategies that keep data flowing during regional failures.

Through hands-on labs and real-world examples from Airbnb, LinkedIn, Netflix, and Uber, you'll master the orchestration and recovery techniques that turn fragile scripts into production-grade infrastructure. You'll learn to handle automated retries, run safe backfills, implement checkpoint-based recovery, and execute disaster recovery playbooks that restore pipelines after outages.

This course is for engineers who build or maintain real-time data pipelines and need stronger orchestration, reliability, and recovery skills. Prerequisites are the basics of Python and SQL, the Linux CLI, and Kafka fundamentals; a cloud account is helpful but optional.

By the end of the course, learners will be able to design, orchestrate, and recover real-time data pipelines that run reliably at production scale.

Learners set up a modern orchestrator and build a first DAG/flow that runs reliably. We cover scheduling, retries, task dependencies, and lightweight observability. By the end, learners will ship a minimal but production-aware pipeline.
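The ideas named above (task dependencies and retries) can be sketched without any orchestrator installed. The example below is a minimal standard-library illustration, not Airflow or Prefect code: `run_with_retries`, `run_pipeline`, and the three task names are hypothetical, and a real orchestrator would add scheduling, persistence, and alerting on top of this.

```python
import time
from graphlib import TopologicalSorter


def run_with_retries(task, retries=2, delay_seconds=0.0):
    """Call task(); on failure, retry up to `retries` more times."""
    for attempt in range(1, retries + 2):
        try:
            return task()
        except Exception as exc:
            if attempt > retries:
                raise  # retries exhausted: surface the failure
            print(f"{task.__name__}: attempt {attempt} failed ({exc}); retrying")
            time.sleep(delay_seconds)


def run_pipeline(tasks, dependencies, retries=2):
    """Run tasks in dependency order; return the names in execution order."""
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        run_with_retries(tasks[name], retries=retries)
        order.append(name)
    return order


# Hypothetical three-step pipeline; `transform` fails once, then succeeds.
calls = {"transform": 0}


def extract():
    return "raw"


def transform():
    calls["transform"] += 1
    if calls["transform"] == 1:
        raise RuntimeError("transient error")
    return "clean"


def load():
    return "loaded"


TASKS = {"extract": extract, "transform": transform, "load": load}
# Each key maps to the set of tasks it depends on.
DEPENDENCIES = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

if __name__ == "__main__":
    print(run_pipeline(TASKS, DEPENDENCIES))
```

Here the dependency map plays the role a DAG definition plays in an orchestrator: `load` never starts until `transform` has succeeded, and a transient failure in `transform` is absorbed by the retry loop instead of requiring a manual rerun.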

What's included

4 videos, 2 readings, 1 peer review

4 videos (31 minutes total)

Why Orchestration Matters: From Cron to DAGs (3 minutes)
Build Your First DAG (Airflow) (9 minutes)
Flows the Pythonic Way (Prefect) (9 minutes)
Demo: Scheduling, Retries, and Alerting End-to-End (10 minutes)

2 readings (10 minutes total)

Welcome to the Course: Course Overview (5 minutes)
Choosing an Orchestrator: Airflow vs. Prefect (5 minutes)

1 peer review (20 minutes total)

Hands-On-Learning: Ship a Minimal Reliable DAG/Flow (20 minutes)

We move from “works on my machine” to “recovers on its own.” Learners add exactly-once-ish processing, checkpointing, schema controls, and dead-letter queues. The module emphasizes designing for replay and safe backfills.
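How checkpointing, idempotence, and a dead-letter queue fit together can be shown with a small in-memory sketch. It assumes records arrive as `(offset, raw_json)` pairs in increasing offset order; `new_state`, `process_batch`, and the `key`/`amount` fields are hypothetical names chosen for illustration. In a real stream processor the checkpoint and the sink write would also need to be committed atomically, which this sketch does not attempt.

```python
import json


def new_state():
    """Fresh pipeline state: nothing applied yet."""
    return {"checkpoint": -1, "totals": {}, "dead_letter": []}


def process_batch(records, state):
    """Apply records to `state`, skipping anything at or below the checkpoint.

    `state` holds three hypothetical fields:
      checkpoint  - highest offset already handled
      totals      - the sink (running totals per key)
      dead_letter - records that could not be parsed, kept for triage
    """
    for offset, raw in records:
        if offset <= state["checkpoint"]:
            continue  # already handled; a replay must not double-count
        try:
            event = json.loads(raw)
            key, amount = event["key"], float(event["amount"])
        except (ValueError, KeyError, TypeError) as exc:
            # Bad record: park it with context instead of failing the batch.
            state["dead_letter"].append(
                {"offset": offset, "raw": raw, "error": repr(exc)}
            )
        else:
            state["totals"][key] = state["totals"].get(key, 0.0) + amount
        state["checkpoint"] = offset  # advance only after the record is handled
    return state


if __name__ == "__main__":
    batch = [
        (0, '{"key": "a", "amount": 5}'),
        (1, "not json"),
        (2, '{"key": "a", "amount": 2.5}'),
    ]
    state = new_state()
    process_batch(batch, state)
    process_batch(batch, state)  # replaying the same batch changes nothing
    print(state["totals"], len(state["dead_letter"]))
```

Replaying the batch leaves the totals and the dead-letter list unchanged because every offset is already at or below the checkpoint, which is the property that makes replay and backfill safe.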

What's included

3 videos, 1 reading, 1 peer review

3 videos (32 minutes total)

Exactly-Once with Kafka: What You Really Get (14 minutes)
Checkpointing & State: Replaying Without Duplicates (8 minutes)
DLQs in Practice: From Error Handling to Triaging (10 minutes)

1 reading (5 minutes total)

Checkpoints & WAL in Structured Streaming (5 minutes)

1 peer review (20 minutes total)

Hands-On-Learning: Make a Stream Bulletproof: Checkpoints, DLQ, Idempotence (20 minutes)

Learners design for failure domains—task, job, cluster, and region. We cover backfills vs. reprocessing, Delta time travel for safe fixes, and Kafka replication patterns (MirrorMaker 2, uReplicator) for DR.
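One way to picture the difference between a targeted backfill and full reprocessing is a loop over daily partitions that overwrites by partition key. The sketch below assumes that framing, which may differ from how the course defines the terms; `partitions_between`, `backfill`, and the dictionary used as a sink are hypothetical stand-ins for a real partitioned table.

```python
from datetime import date, timedelta


def partitions_between(start, end):
    """Yield each daily partition date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


def backfill(start, end, compute, sink, only_missing=True):
    """Recompute daily partitions in [start, end] and overwrite them in `sink`.

    only_missing=True fills gaps only (a targeted backfill);
    only_missing=False recomputes every partition in the range (reprocessing).
    Returns the list of partition dates that were written.
    """
    written = []
    for day in partitions_between(start, end):
        if only_missing and day in sink:
            continue
        sink[day] = compute(day)  # overwrite by partition key, so reruns are safe
        written.append(day)
    return written


if __name__ == "__main__":
    # Hypothetical sink with two partitions already present and two missing.
    sink = {date(2024, 1, 1): 10, date(2024, 1, 3): 30}
    filled = backfill(date(2024, 1, 1), date(2024, 1, 4), lambda d: d.day * 10, sink)
    print(filled)
```

Because each write replaces a whole partition rather than appending to it, running the same backfill twice produces the same table, and the second run of a targeted backfill writes nothing.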

What's included

4 videos, 2 readings, 1 assignment, 2 peer reviews

4 videos (34 minutes total)

Backfills & Reprocessing Without Breaking SLAs (10 minutes)
Time Travel & Audits with Delta Tables (8 minutes)
Cross-Region Kafka Replication (MM2/uReplicator) (11 minutes)
Your Recovery Posture, Summarized (4 minutes)

2 readings (10 minutes total)

Choosing a Replication Strategy: MM2 vs. uReplicator (5 minutes)
Additional Resource (5 minutes)

1 assignment (20 minutes total)

Orchestrate & Recover Real-Time Data Pipelines (20 minutes)

2 peer reviews (80 minutes total)

Hands-On-Learning: DR Fire Drill: Cross-Region Failover & Targeted Backfill (20 minutes)
Project: Orchestrate & Recover a Real-Time Pipeline (60 minutes)

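Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Starweaver

Coursera

561 courses, 1,118,027 learners

Offered by

Coursera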

Frequently asked questions

It means designing a real-time data pipeline as a coordinated workflow that can schedule work, manage dependencies, and recover cleanly when something fails. The course focuses on making pipelines reliable over time, not just getting a script or job to run once.

You would use it when a pipeline needs to run repeatedly, stay observable, and keep data moving even when tasks fail, records are bad, or a dependency becomes unstable. In this course, it is used for real-time and batch-adjacent workflows that need safe retries, replays, and recovery paths.

It sits between writing the logic for individual pipeline steps and running the whole system reliably over time. In this course, that layer turns separate tasks into a repeatable process you can schedule, monitor, backfill, and restore.