When will I receive my Course Certificate?

If you complete the course successfully, your electronic Course Certificate will be added to your Accomplishments page. From there, you can print your Course Certificate or add it to your LinkedIn profile.

Why can’t I audit this course?

This course is currently available only to learners who have paid for it or received financial aid, where available.

Is financial aid available?

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your chosen learning program, you’ll find a link to apply on the description page.

Data Engineering with Scala and Spark

Instructor: Packt - Course Instructors

13 modules
Gain insight into a topic and learn the fundamentals.

Intermediate level
Some prior experience recommended

2 weeks to complete
at 10 hours a week

Flexible schedule
Learn at your own pace

What you'll learn

Set up a development environment for building data pipelines in Scala
Use Spark DataFrames, Datasets, and SQL with Scala for data processing
Profile and clean data using Deequ for improved data quality

Skills you'll gain

Data Quality
Performance Tuning
Test Driven Development (TDD)
CI/CD
Data Store
Unit Testing
Continuous Deployment
Maintainability
Data Architecture
Continuous Integration
Data Transformation
Data Integrity
Data Pipelines
Data Processing
Data Validation

Tools you'll learn

Apache Airflow
Data Lakes
Apache Kafka
Scala Programming
Apache Spark

Details to know

Shareable certificate
Add to your LinkedIn profile

Recently updated!
March 2026

Assessments
13 assignments

Taught in English

There are 13 modules in this course

This course is designed to equip data engineers with the skills to build scalable and efficient data pipelines using Scala and Spark. Data engineers will learn best practices for development, testing, and deployment in cloud environments, with a focus on optimizing performance and ensuring data quality. The course provides the necessary tools to transform raw data into actionable insights, making it highly relevant in today’s data-driven world.

Throughout the course, learners will improve their data engineering skills by mastering techniques for building both streaming and batch data pipelines. The content emphasizes practical outcomes such as performance tuning and data profiling. With hands-on examples and step-by-step guidance, learners will gain a solid understanding of real-time and batch processing pipelines.

What makes this course unique is its combination of foundational theory and real-world applications. By the end, you will be able to use Scala and Spark to process large datasets and optimize pipelines in cloud environments effectively.

This course is ideal for data engineers with some experience in data processing. While it assumes familiarity with data engineering concepts and cloud technologies, anyone eager to improve their skills in Scala and Spark will benefit from the practical, step-by-step approach.

Module 1: Scala Essentials for Data Engineers

In this section, we explore functional programming, higher-order functions, polymorphic functions, and pattern matching in Scala for data engineering applications.

Included: 2 videos, 6 readings, 1 assignment

2 videos (2 minutes total)

Course Overview (1 minute)
Scala Essentials for Data Engineers - Overview Video (1 minute)

6 readings (120 minutes total)

Introduction (10 minutes)
Understanding Objects, Classes, and Traits (10 minutes)
Trait (10 minutes)
Examples of HOFs from the Scala Collection Library (30 minutes)
Understanding Polymorphic Functions (30 minutes)
Understanding Pattern Matching (30 minutes)

1 assignment (10 minutes total)

Scala Essentials for Data Engineers (10 minutes)
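
As a rough illustration of the topics this module names (higher-order functions, polymorphic functions, and pattern matching), here is a minimal sketch in plain Scala. It is not taken from the course materials: the names `ScalaEssentials`, `Reading`, `Valid`, `Invalid`, `parse`, `applyTwice`, and `averageValid` are hypothetical, the "sensor,value" input format is an assumption, and `toDoubleOption` requires Scala 2.13 or later.

```scala
// Illustrative sketch only; all names here are hypothetical.
object ScalaEssentials {
  // A small sealed hierarchy to pattern match on.
  sealed trait Reading
  final case class Valid(sensor: String, value: Double) extends Reading
  final case class Invalid(raw: String) extends Reading

  // Pattern matching: parse an assumed "sensor,value" line into a Reading.
  def parse(line: String): Reading = line.split(",") match {
    case Array(sensor, value) =>
      value.trim.toDoubleOption match { // Scala 2.13+
        case Some(v) => Valid(sensor.trim, v)
        case None    => Invalid(line)
      }
    case _ => Invalid(line)
  }

  // Polymorphic higher-order function: works for any type A.
  def applyTwice[A](f: A => A, a: A): A = f(f(a))

  // HOFs from the collections library: map and collect.
  def averageValid(lines: List[String]): Option[Double] = {
    val values = lines.map(parse).collect { case Valid(_, v) => v }
    if (values.isEmpty) None else Some(values.sum / values.size)
  }

  def main(args: Array[String]): Unit = {
    val lines = List("s1, 10.0", "s2, oops", "s3, 20.0", "garbage")
    println(averageValid(lines))       // prints Some(15.0)
    println(applyTwice[Int](_ * 2, 3)) // prints 12
  }
}
```

Modelling bad input as an `Invalid` case rather than throwing keeps the parsing step total, so the later `collect` can simply skip the rows it cannot use.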

Module 2: Environment Setup

In this section, we explore cloud-based and local environments for data engineering pipelines, focusing on setup processes, trade-offs, and practical applications.

Included: 1 video, 5 readings, 1 assignment

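Module 3: An Introduction to Apache Spark and Its APIs: DataFrame, Dataset, and Spark SQL

In this section, we explore Apache Spark's APIs, focusing on DataFrame and Dataset for distributed data processing.

Included: 1 video, 3 readings, 1 assignment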

Module 4: Working with Databases

In this section, we explore using the Spark JDBC API for database access, designing database interfaces, and performing operations with configuration loading.

Included: 1 video, 3 readings, 1 assignment

Module 5: Object Stores and Data Lakes

In this section, we explore object stores, data lakes, and lakehouses, focusing on their roles in managing large-scale data workflows efficiently.

Included: 1 video, 6 readings, 1 assignment

Module 6: Understanding Data Transformation

In this section, we explore Spark transformations, aggregations, joins, and window functions to enhance data processing for BI and analytics. Key concepts include efficient data manipulation and pipeline development.

Included: 1 video, 4 readings, 1 assignment

Module 7: Data Profiling and Data Quality

In this section, we explore Deequ for implementing data quality checks, analyzing completeness and accuracy, and defining constraints to ensure reliable data pipelines.

Included: 1 video, 3 readings, 1 assignment
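
To make the idea of a completeness check concrete, the sketch below computes the fraction of rows in which a column is present and compares it against a threshold. It deliberately uses plain Scala collections and is not the Deequ API. The names `CompletenessSketch`, `Customer`, `completeness`, and `meetsThreshold` are hypothetical, and treating an empty dataset as fully complete is an assumption made for this example.

```scala
// Hypothetical sketch in plain Scala; this is NOT the Deequ API.
object CompletenessSketch {
  // None models a missing (null) value in the email column.
  final case class Customer(id: Int, email: Option[String])

  // Completeness = fraction of rows in which the chosen column is present.
  // Assumption: an empty dataset counts as fully complete (1.0).
  def completeness[A](rows: Seq[A])(column: A => Option[Any]): Double =
    if (rows.isEmpty) 1.0
    else rows.count(r => column(r).isDefined).toDouble / rows.size

  // A constraint passes when the measured value meets the threshold.
  def meetsThreshold(value: Double, threshold: Double): Boolean =
    value >= threshold

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      Customer(1, Some("a@example.com")),
      Customer(2, None),
      Customer(3, Some("c@example.com")),
      Customer(4, Some("d@example.com"))
    )
    val emailCompleteness = completeness(rows)(_.email)
    println(emailCompleteness)                      // prints 0.75
    println(meetsThreshold(emailCompleteness, 0.9)) // prints false
  }
}
```

In a real pipeline the same measurement would run as a distributed computation over a Spark DataFrame; the local version only shows what the metric and the constraint mean.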

Module 8: Test-Driven Development, Code Health, and Maintainability

In this section, we explore test-driven development, static code analysis, and linting to improve code quality, maintainability, and consistency in data engineering projects.

Included: 1 video, 4 readings, 1 assignment

Module 9: CI/CD with GitHub

In this section, we explore CI/CD practices with GitHub to automate Scala data pipeline workflows, focusing on GitHub Actions, version control, and reliable deployment processes.

Included: 1 video, 4 readings, 1 assignment

Module 10: Data Pipeline Orchestration

In this section, we explore data pipeline orchestration using tools like Airflow, Argo, Databricks, and Azure Data Factory. We focus on workflow design, task management, and real-world implementation strategies.

Included: 1 video, 6 readings, 1 assignment

1 video (1 minute total)

Data Pipeline Orchestration - Overview Video (1 minute)

6 readings (80 minutes total)

Introduction (10 minutes)
Monitoring and UI (10 minutes)
Working with Argo Workflows (20 minutes)
Creating an Argo Workflow (10 minutes)
Using Databricks Workflows (20 minutes)
Leveraging Azure Data Factory (10 minutes)

1 assignment (10 minutes total)

Data Pipeline Orchestration Fundamentals (10 minutes)

Module 11: Performance Tuning

In this section, we analyze Spark UI metrics to identify performance issues, optimize data shuffling, and right-size compute resources for efficient data processing.

Included: 1 video, 4 readings, 1 assignment

Module 12: Building Batch Pipelines Using Spark and Scala

In this section, we explore building batch pipelines using Spark and Scala, focusing on medallion architecture, data ingestion, transformation, and orchestration for scalable data processing.

Included: 1 video, 5 readings, 1 assignment

Module 13: Building Streaming Pipelines Using Spark and Scala

In this section, we explore building real-time data pipelines using Spark, Scala, and Kafka for IoT applications. Key concepts include data ingestion, transformation, and serving layer design.

Included: 1 video, 4 readings, 1 assignment

Instructor

Packt - Course Instructors
Packt
1,763 courses, 500,581 learners

Offered by

Packt

Learn more about Data Management

Apache Spark with Scala – Hands-On with Big Data! (Packt, course, free trial)
Spark, Hadoop, and Snowflake for Data Engineering (Duke University, course, free trial)
Apache Spark with Scala: Master Data Building & Analysis (EDUCBA, course, free trial)
Real-Time, Real Fast: Kafka & Spark for Data Engineers (Coursera, specialization, free trial)

Frequently asked questions

Can I preview a course before enrolling?

Yes, you can preview the first video and view the syllabus before you enroll. You must purchase the course to access content not included in the preview.

When will I have access to the lectures and assignments?

If you decide to enroll in the course before the session start date, you will have access to all of the lecture videos and readings for the course. You’ll be able to submit assignments once the session starts.

What will I get when I enroll?

Once you enroll and your session begins, you will have access to all videos and other resources, including reading items and the course discussion forum. You’ll be able to view and submit practice assessments, and complete required graded assignments to earn a grade and a Course Certificate.

Module outline

1. Scala Essentials for Data Engineers
2. Environment Setup
3. An Introduction to Apache Spark and Its APIs: DataFrame, Dataset, and Spark SQL
4. Working with Databases
5. Object Stores and Data Lakes
6. Understanding Data Transformation
7. Data Profiling and Data Quality
8. Test-Driven Development, Code Health, and Maintainability
9. CI/CD with GitHub
10. Data Pipeline Orchestration
11. Performance Tuning
12. Building Batch Pipelines Using Spark and Scala
13. Building Streaming Pipelines Using Spark and Scala
