The Azure Databricks Cookbook shows you how to work with the latest as well as older versions of Apache Spark and integrate with various Azure resources for orchestrating, deploying, and monitoring big data solutions. You'll use Azure Databricks to build end-to-end solutions and address challenges in securing, productionizing, and monitoring them.

Recommended experience
Intermediate level
Ideal for data engineers, data scientists, and big data professionals with prior experience in Apache Spark and Azure.
What you'll learn
Integrate Azure Databricks with Azure Synapse Analytics, Cosmos DB, and Kafka clusters.
Build and optimize data pipelines using Delta tables and Databricks SQL.
Deploy and productionize data solutions using CI/CD pipelines.
Details to know

10 assignments
April 2026
There are 10 modules in this course
This module guides learners through the process of setting up and managing an Azure Databricks workspace using the Azure CLI. You will explore how to add users and groups, configure permissions, and work with notebooks and jobs to streamline data application development. By the end, you'll be equipped to automate deployments and foster collaboration within Azure Databricks.
What's included
1 video, 4 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
4 readings • Total 24 minutes
- Introduction • 5 minutes
- Creating a Databricks Service Using the Azure CLI • 6 minutes
- Adding Users and Groups to the Workspace • 6 minutes
- Getting Started with Notebooks and Jobs in Azure Databricks • 7 minutes
1 assignment • Total 16 minutes
- Azure Databricks Service Fundamentals • 16 minutes
This module explores how to efficiently read and write data between Azure Databricks and a variety of Azure services and file formats. Learners will gain hands-on experience connecting to Azure Blob Storage, ADLS Gen2, Azure SQL Database, Azure Synapse Analytics, and Azure Cosmos DB, as well as working with CSV, Parquet, and JSON files. By the end, you will be able to manage data ingestion and storage across multiple Azure platforms.
What's included
1 video, 8 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
8 readings • Total 56 minutes
- Introduction • 13 minutes
- Reading and Writing Data from and to Azure Blob Storage • 6 minutes
- Reading and Writing Data from and to ADLS Gen2 • 5 minutes
- Reading and Writing Data from and to an Azure SQL Database Using Native Connectors • 8 minutes
- Reading and Writing Data from and to Azure Synapse SQL (Dedicated SQL Pool) Using Native Connectors • 9 minutes
- Reading and Writing Data from and to Azure Cosmos DB • 6 minutes
- Reading and Writing Data from and to CSV and Parquet • 5 minutes
- Reading and Writing Data from and to JSON, Including Nested JSON • 4 minutes
1 assignment • Total 16 minutes
- Data Handling in Azure Ecosystem • 16 minutes
This module explores how Spark executes queries, including how to inspect execution details, understand schema inference, and interpret query execution plans. Learners will also examine how joins and partitions impact performance and how to optimize query execution using Spark's tools and parameters. By the end, you'll be equipped to analyze and improve the efficiency of Spark applications.
What's included
1 video, 7 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
7 readings • Total 40 minutes
- Introduction • 7 minutes
- Checking the Execution Details of All the Executed Spark Queries via the Spark UI • 4 minutes
- Deep Diving into Schema Inference • 5 minutes
- Looking into the Query Execution Plan • 4 minutes
- How Joins Work in Spark • 6 minutes
- Learning About Input Partitions • 7 minutes
- Learning About Shuffle Partitions • 7 minutes
1 assignment • Total 16 minutes
- Spark Query Execution Fundamentals • 16 minutes
This module introduces the fundamentals of processing streaming data using tools like Azure Event Hubs, Apache Kafka, and Spark Structured Streaming. Learners will explore how to connect to real-time data sources, manage log file streams, and implement advanced features such as window aggregation, trigger options, and checkpointing for fault tolerance.
What's included
1 video, 7 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
7 readings • Total 41 minutes
- Introduction • 7 minutes
- Reading Streaming Data from Azure Event Hubs • 6 minutes
- Reading Data from Event Hubs for Kafka • 6 minutes
- Streaming Data from Log Files • 4 minutes
- Understanding Trigger Options • 6 minutes
- Understanding Window Aggregation on Streaming Data • 7 minutes
- Understanding Offsets and Checkpoints • 5 minutes
1 assignment • Total 16 minutes
- Streaming Data Fundamentals • 16 minutes
This module guides learners through securely managing credentials and configuration settings in Azure by leveraging Key Vault, App Configuration, and Log Analytics. You will gain hands-on experience creating and deploying these resources using ARM templates and the Azure CLI, as well as learn how to centralize monitoring for Azure services.
What's included
1 video, 4 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
4 readings • Total 23 minutes
- Introduction • 5 minutes
- Creating an Azure Key Vault to Store Secrets Using ARM Templates • 6 minutes
- Creating an App Configuration Resource • 6 minutes
- Creating a Log Analytics Workspace • 6 minutes
1 assignment • Total 16 minutes
- Secure Cloud Configuration and Monitoring • 16 minutes
This module introduces the core concepts and practical applications of Delta Lake on Azure Databricks, including streaming data integration, data versioning, and ACID transaction support. Learners will explore how to optimize Delta tables, enforce data integrity with constraints, and manage concurrent operations for reliable big data solutions.
What's included
1 video, 7 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
7 readings • Total 42 minutes
- Introduction • 7 minutes
- Streaming Reads and Writes to Delta Tables • 4 minutes
- Delta Table Data Format • 7 minutes
- Handling Concurrency • 9 minutes
- Delta Table Performance Optimization • 6 minutes
- Constraints in Delta Tables • 5 minutes
- Versioning in Delta Tables • 4 minutes
1 assignment • Total 16 minutes
- Exploring Delta Lake Capabilities • 16 minutes
This module guides learners through building a modern data warehouse solution on Azure, integrating both batch and real-time analytics. You will create and configure Azure resources, simulate streaming data, process and transform data using Structured Streaming, and visualize near-real-time analytics with Databricks and Power BI.
What's included
1 video, 8 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
8 readings • Total 40 minutes
- Introduction • 6 minutes
- Creating Required Azure Resources for the E2E Demonstration • 4 minutes
- Simulating a Workload for Streaming Data • 4 minutes
- Processing Streaming and Batch Data Using Structured Streaming • 4 minutes
- Understanding the Various Stages of Transforming Data • 4 minutes
- Loading the Transformed Data into Azure Cosmos DB and a Synapse Dedicated Pool • 9 minutes
- Creating a Visualization and Dashboard in a Notebook for Near-Real-Time Analytics • 4 minutes
- Creating a Visualization in Power BI for Near-Real-Time Analytics • 5 minutes
1 assignment • Total 16 minutes
- Data Processing and Cloud Integration Fundamentals • 16 minutes
This module introduces learners to the core features of Databricks SQL, including managing user access, utilizing query parameters and filters, and creating visualizations. Participants will gain practical skills for querying large datasets and presenting insights effectively within the Databricks environment.
What's included
1 video, 4 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
4 readings • Total 25 minutes
- Introduction • 7 minutes
- Granting Access to Objects to the User • 6 minutes
- Using Query Parameters and Filters • 5 minutes
- Introduction to Visualizations in Databricks SQL • 7 minutes
1 assignment • Total 16 minutes
- Databricks SQL Fundamentals • 16 minutes
This module explores how to integrate DevOps practices with Azure Databricks, focusing on version control, CI/CD pipelines, and automated deployment strategies. Learners will gain hands-on experience connecting GitHub and Azure DevOps for notebook management and deploying resources across multiple environments. By the end, you'll be able to implement and manage CI/CD workflows tailored for Azure Databricks projects.
What's included
1 video, 5 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
5 readings • Total 30 minutes
- Introduction • 6 minutes
- Using GitHub for Azure Databricks Notebook Version Control • 4 minutes
- Understanding the CI/CD Process for Azure Databricks • 8 minutes
- Deploying Notebooks to Multiple Environments • 6 minutes
- Enabling CI/CD in an Azure DevOps Build and Release Pipeline • 6 minutes
1 assignment • Total 16 minutes
- CI/CD and DevOps Integration in Azure Databricks • 16 minutes
This module explores essential security and monitoring practices in Azure Databricks, including configuring access controls, credential passthrough, and network security. Learners will also discover how to monitor cluster health using Ganglia and implement granular data access restrictions. By the end, you will be able to secure data and resources effectively within Azure Databricks environments.
What's included
1 video, 7 readings, 1 assignment
1 video • Total 1 minute
- Overview • 1 minute
7 readings • Total 43 minutes
- Introduction • 7 minutes
- Creating ACLs Using Storage Explorer and PowerShell • 9 minutes
- How to Configure Credential Passthrough • 5 minutes
- How to Restrict Data Access to Users Using RBAC • 6 minutes
- How to Restrict Data Access to Users Using ACLs • 5 minutes
- Deploying Azure Databricks in a VNet and Accessing a Secure Storage Account • 5 minutes
- Using Ganglia Reports for Cluster Health • 6 minutes
1 assignment • Total 16 minutes
- Securing Data and User Access in Azure Databricks • 16 minutes
Offered by

Packt helps tech professionals put software to work by distilling and sharing the working knowledge of their peers. Packt is an established global technical learning content provider, founded in Birmingham, UK, with over twenty years of experience delivering premium, rich content from groundbreaking authors on a wide range of emerging and popular technologies.
