Packt

Ultimate AWS Data Engineering Bootcamp - 15 Real-World Labs

Access provided by New York State Department of Labor

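Gain insight into a topic and learn the fundamentals.
Intermediate level

Recommended experience

1 week to complete
at 10 hours a week
Flexible schedule
Learn at your own pace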

What you'll learn

  • Process and visualize real-time data using Kinesis, Spark Streaming, and Streamlit.

  • Automate workflow execution using ECS, Lambda, Step Functions, and GitHub Actions.

  • Build and manage lakehouses using Glue, S3, Athena, and Delta Lake architecture.

  • Design, deploy, and orchestrate AWS-native batch and real-time data pipelines.

Details to know

Shareable certificate

Add to your LinkedIn profile

Assignments

16 assignments

Taught in English
Recently updated!

February 2026


There are 16 modules in this course

In this module, we will set the foundation for your journey through AWS data engineering. You'll gain clarity on the course structure, explore the tech stack—including Docker, AWS CLI, and more—and ensure your local environment is ready for executing the real-world labs. This introduction is critical to align expectations and configure the tools required for success.

What's included

3 videos, 1 reading

In this module, we will implement a batch data processing project for music streaming data. You'll learn to use Airflow for orchestration and Redshift Serverless for storage and querying, culminating in a full pipeline execution. The focus is on understanding the interaction between orchestration tools and AWS services.

What's included

9 videos, 1 assignment

In this module, we will process music stream data using a distributed system that combines PySpark and DynamoDB. You'll use Airflow to orchestrate the workflow and execute jobs using the AWS Glue Docker image locally. This project introduces scalable and parallel data processing techniques.

What's included

5 videos, 1 assignment

In this module, we will build a robust ETL pipeline for rental apartment data. You will set up MySQL in Amazon Aurora, use Glue for data transformation, and orchestrate the workflow using Step Functions and EventBridge. This lab emphasizes automation and modular pipeline execution.

What's included

9 videos, 1 assignment
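
As a rough illustration of the orchestration this module describes, a Step Functions state machine that runs a single Glue job could be defined along the following lines. This is a minimal sketch, not the course's own definition: the job name `rental-apartments-etl`, the state names, and the `dangling_targets` helper are all hypothetical.

```python
import json

# Hypothetical Amazon States Language definition: run one Glue job, then stop.
# The job name and state names are illustrative assumptions.
definition = {
    "Comment": "Sketch of a Glue ETL workflow",
    "StartAt": "RunGlueJob",
    "States": {
        "RunGlueJob": {
            "Type": "Task",
            # Service integration that starts a Glue job and waits for it.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": "rental-apartments-etl"},
            "Next": "Done",
        },
        "Done": {"Type": "Succeed"},
    },
}


def dangling_targets(defn):
    """Return StartAt/Next targets that do not name a defined state."""
    states = defn["States"]
    targets = [defn["StartAt"]]
    targets += [s["Next"] for s in states.values() if "Next" in s]
    return sorted(t for t in set(targets) if t not in states)


if __name__ == "__main__":
    # Print the JSON that would be supplied as the state machine definition.
    print(json.dumps(definition, indent=2))
```

The helper only checks that every transition points at an existing state; it does not validate the definition against the full Amazon States Language. An EventBridge rule could then start the state machine on a schedule or in response to an event.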

In this module, we will create a data lake for a rental vehicle store using scalable services like EMR and Athena. You'll execute PySpark both locally and on the cloud, integrate metadata using Glue crawlers, and automate the pipeline using Step Functions.

What's included

8 videos, 1 assignment

In this module, we will develop an event-driven data pipeline tailored for an e-commerce application. You'll containerize Python apps, deploy them using ECS, and automate workflows using Step Functions and EventBridge. This lab blends DevOps and data pipeline principles.

What's included

7 videos, 1 assignment

In this module, we will build a lakehouse architecture combining the flexibility of data lakes and the performance of data warehouses. You will use PySpark with Delta Lake, manage metadata with Glue Catalog, and query data through Athena and Redshift.

What's included

5 videos, 1 assignment

In this module, we will implement real-time processing of taxi trip data using a serverless approach. You'll set up Kinesis streams, deploy Lambda functions, and execute a complete pipeline. This lab reinforces serverless computing and event-driven design.

What's included

5 videos, 1 assignment
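
To make the Kinesis-to-Lambda flow in this module more concrete, here is a minimal sketch of a Lambda handler that consumes a batch of Kinesis records. Kinesis delivers each payload base64-encoded under `Records[].kinesis.data`; the `fare` field, the summary returned, and the `make_event` helper are assumptions made for illustration and are not taken from the course.

```python
import base64
import json


def handler(event, context):
    """Decode each Kinesis record and summarise the batch of taxi trips."""
    trips = []
    for record in event.get("Records", []):
        # Kinesis payloads arrive base64-encoded inside the Lambda event.
        payload = base64.b64decode(record["kinesis"]["data"])
        trips.append(json.loads(payload))
    # "fare" is a hypothetical field name used only for this sketch.
    total_fare = sum(trip.get("fare", 0.0) for trip in trips)
    return {"trip_count": len(trips), "total_fare": total_fare}


def make_event(trips):
    """Build a fake Kinesis event for local testing (hypothetical helper)."""
    return {
        "Records": [
            {"kinesis": {"data": base64.b64encode(json.dumps(t).encode()).decode()}}
            for t in trips
        ]
    }
```

Calling `handler(make_event([...]), None)` locally exercises the decoding logic without any AWS resources.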

In this module, we will process mobile network logs using real-time technologies and deliver interactive insights via Streamlit. You'll build and deploy dashboards to ECS, leveraging Spark for streaming data and Glue Catalog for metadata management.

What's included

6 videos, 1 assignment

In this module, we will set up CI/CD pipelines to automate deployment of AWS Glue jobs, ECS tasks, and Lambda functions using GitHub Actions. You'll learn how to build and manage version-controlled workflows for repeatable deployments.

What's included

5 videos, 1 assignment

In this module, we will ingest real-time clickstream data using Kinesis Firehose and enrich it using Lambda before storing it in Redshift. You'll build a robust pipeline suitable for web analytics or behavioral tracking applications.

What's included

4 videos, 1 assignment
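
The enrichment step in this module can be pictured as a Firehose data-transformation Lambda. Firehose passes records with a `recordId` and base64-encoded `data`, and expects each one back with the same `recordId`, a `result` of `Ok`, `Dropped`, or `ProcessingFailed`, and re-encoded `data`. The sketch below assumes JSON click events with a `user_agent` field and a made-up `is_mobile` enrichment rule; neither comes from the course.

```python
import base64
import json


def handler(event, context):
    """Firehose transformation sketch: tag each click as mobile or not."""
    output = []
    for record in event["records"]:
        try:
            click = json.loads(base64.b64decode(record["data"]))
            if not isinstance(click, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable input is returned unchanged and flagged as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue
        # Hypothetical enrichment rule based on an assumed "user_agent" field.
        click["is_mobile"] = "Mobile" in click.get("user_agent", "")
        # Newline-delimited JSON keeps records separable downstream.
        enriched = (json.dumps(click) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(enriched).decode("utf-8"),
        })
    return {"records": output}
```

Returning failed records with `ProcessingFailed` rather than raising lets the rest of the batch continue to the destination.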

In this module, we will challenge you to independently set up a MySQL database on Amazon Aurora. This assignment reinforces database fundamentals and Amazon RDS deployment skills.

What's included

2 videos, 1 assignment

In this module, you will independently implement a lakehouse architecture for a commercial flights dataset. This assignment consolidates your understanding of data lakes, delta tables, and metadata integration with Glue.

What's included

4 videos, 1 assignment

In this module, you'll build a real-time system that dynamically adjusts pricing for e-commerce users based on events. This assignment emphasizes practical business applications of event-driven data processing.

What's included

2 videos, 1 assignment

In this module, you'll build a real-time streaming job to process Spotify metrics. This assignment helps you apply PySpark and AWS Glue in real-world streaming scenarios.

What's included

2 videos, 1 assignment

In this final module, you'll implement CI/CD automation for Lambda functions using GitHub Actions. This assignment solidifies your DevOps knowledge and prepares you for real-world deployment automation.

What's included

2 videos, 2 assignments

Instructor

Packt - Course Instructors
Packt
1,534 courses, 407,061 learners

Offered by

Packt
