Coursera

Performance Engineering for Data Systems Specialization

Optimize SQL, Spark, and Data Warehouses. Learn to diagnose bottlenecks and optimize performance in databases, warehouses, and Spark systems.

Hurix Digital
Merna Elzahaby

Instructor: Hurix Digital

Access provided by New York State Department of Labor

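Get in-depth knowledge of a subject
Intermediate level

Recommended experience

4 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace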

What you'll learn

  • Analyze SQL execution plans and Spark UI metrics to diagnose performance bottlenecks and implement targeted optimizations.

  • Design scalable database schemas, partitioning strategies, and storage architectures that balance performance with cost.

  • Engineer resilient cloud data infrastructure using IaC, disaster recovery planning, and systematic resource management.

Details to know

Shareable certificate

Add to your LinkedIn profile

Taught in English
Recently updated!

February 2026

See how employees at top companies are mastering in-demand skills

Logos of Petrobras, TATA, Danone, Capgemini, P&G, and L'Oreal

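Advance your subject-matter expertise

  • Learn in-demand skills from university and industry experts
  • Master a subject or tool with hands-on projects
  • Develop a deep understanding of key concepts
  • Earn a career certificate from Coursera

Specialization - 7 course series

What you'll learn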

  • Performance optimization requires methodical analysis of execution plans to identify root causes, not just symptoms.

  • Query restructuring with CTEs, optimized joins, and window functions can dramatically improve execution efficiency.

  • Index design needs ongoing analysis of query patterns and data access requirements for sustainable performance.

  • Scalable systems depend on proactive monitoring and optimization cycles that prevent production bottlenecks.
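
The points above about execution plans and index design can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module rather than any database engine named by the course, and the `orders` table, its columns, and the index name are hypothetical. It compares the plan for one query before and after an index is added.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT id, total FROM orders WHERE customer_id = 42"

# Without an index on customer_id, SQLite has to scan the whole table.
plan_before = plan(query)

# After adding the index, the same query can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, so the useful signal is the change from a full scan to an index search rather than the literal text.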

SQL Infrastructure: Secure and Optimize

Course 2, 4 hours

What you'll learn

  • Proactive resource management prevents performance degradation and ensures consistent query execution across diverse workloads and user groups.

  • Security through least-privilege access requires continuous monitoring and systematic auditing of permissions against actual business requirements.

  • Effective incident response depends on blameless post-mortem processes that focus on systemic improvements rather than individual accountability.

  • Operational excellence in data infrastructure requires balancing performance, security, and reliability engineering principles.
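
One way to picture the least-privilege audit described above is as a set difference between what each role has been granted and what it has actually used. The sketch below is plain Python under stated assumptions: the role names, the privileges, and both dictionaries are hypothetical, and a real audit would pull them from the database's grant catalog and query logs.

```python
# Hypothetical grants and observed usage per role (illustration only).
granted = {
    "analyst": {"SELECT", "INSERT", "DELETE"},
    "etl_service": {"SELECT", "INSERT", "UPDATE"},
}
used = {
    "analyst": {"SELECT"},
    "etl_service": {"SELECT", "INSERT", "UPDATE"},
}


def unused_privileges(granted, used):
    """Return, per role, the granted privileges that were never exercised."""
    report = {}
    for role, privileges in granted.items():
        extra = privileges - used.get(role, set())
        if extra:
            report[role] = sorted(extra)
    return report


# Roles that appear here are candidates for having privileges revoked.
audit = unused_privileges(granted, used)
print(audit)
```

Running a comparison like this on a schedule, rather than once, is what turns it into the continuous monitoring the bullet refers to.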

Skills you'll gain

Category: Capacity Management
Category: Data Security
Category: Compliance Auditing
Category: Identity and Access Management
Category: Resource Management
Category: Root Cause Analysis
Category: Problem Management
Category: Site Reliability Engineering
Category: Role-Based Access Control (RBAC)

Design & Optimize SQL Database Schemas

Course 3, 3 hours

What you'll learn

  • Denormalization boosts query speed but demands careful analysis of consistency risks and maintenance costs.

  • Partitioning and clustering strategies must align with actual query patterns and access methods to deliver meaningful performance gains.

  • ER diagrams serve as documentation and validation tools, enabling better communication and system understanding.

  • Schema optimization balances query performance, data integrity, storage efficiency, and maintenance complexity.
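
The claim that partitioning only pays off when it matches the query pattern can be sketched in a few lines. This is a pure-Python stand-in for a partitioned table, not a real storage engine; the row layout, the choice of `month` as the partition key, and the `scan` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order rows, partitioned by month (illustration only).
rows = [
    {"order_id": i, "month": f"2024-{(i % 12) + 1:02d}", "customer_id": i % 50}
    for i in range(1200)
]
partitions = defaultdict(list)
for row in rows:
    partitions[row["month"]].append(row)


def scan(month=None, customer_id=None):
    """Return (matching_rows, rows_scanned) for a simple filter."""
    # A filter on the partition key lets us skip every other partition.
    if month is not None:
        candidates = [partitions.get(month, [])]
    else:
        candidates = list(partitions.values())
    hits, scanned = [], 0
    for part in candidates:
        for row in part:
            scanned += 1
            if customer_id is None or row["customer_id"] == customer_id:
                hits.append(row)
    return hits, scanned


# Query aligned with the partition key: only one partition is read.
hits_month, scanned_month = scan(month="2024-03")

# Query on a different column: every partition has to be read.
hits_customer, scanned_customer = scan(customer_id=7)

print(scanned_month, scanned_customer)
```

If most queries filtered by customer instead, partitioning by month would add maintenance cost without reducing the rows read, which is the trade-off the bullets describe.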

Skills you'll gain

Category: Database Design
Category: Database Architecture and Administration
Category: Star Schema
Category: Technical Documentation
Category: Database Development
Category: Data Modeling
Category: Snowflake Schema
Category: SQL

What you'll learn

  • Infrastructure as Code automates data platform deployments, replacing manual processes with version-controlled, repeatable systems.

  • Cost optimization uses performance benchmarking and data analysis to identify efficient compute/storage configs for specific workloads.

  • Business continuity requires proactive disaster recovery with automated failover and continuous replication for strict recovery goals.

  • Successful cloud data engineering balances performance, cost, and reliability through strategic design and continuous monitoring.

Skills you'll gain

Category: Disaster Recovery
Category: Business Continuity
Category: Data Warehousing
Category: Data Architecture
Category: Benchmarking
Category: AWS CloudFormation
Category: Automation
Category: Terraform
Category: Data Infrastructure
Category: IT Infrastructure
Category: Cost Management
Category: Cloud Computing Architecture
Category: Performance Analysis
Category: Infrastructure as Code (IaC)
Category: Business Continuity Planning
Category: Cloud Deployment
Category: Capacity Management

What you'll learn

  • Performance optimization is a systematic process requiring analysis of data access patterns, not random configuration changes.

  • Strategic partitioning minimizes expensive network shuffles and is the foundation of scalable Spark applications.

  • Intelligent caching of reusable intermediate datasets can dramatically reduce computation costs and improve job reliability.

  • The Spark UI provides actionable insights that guide optimization decisions and enable data-driven performance improvements.

Skills you'll gain

Category: Performance Tuning
Category: Apache Spark
Category: Data Processing
Category: PySpark
Category: Systems Analysis
Category: Data Pipelines

What you'll learn

  • Performance bottlenecks in distributed systems often stem from uneven data distribution rather than insufficient computational resources.

  • Visual execution plan analysis is essential for identifying specific stages where data processing imbalances occur.

  • Proactive partition strategy selection prevents performance degradation more effectively than reactive optimization.

  • Spark's shuffle.partitions configuration and broadcast join patterns are fundamental tools for sustainable pipeline optimization.
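
The point about uneven data distribution can be made concrete with a small simulation. The sketch below is a pure-Python stand-in for a hash partitioner, not Spark code: the key names, the partition count, and the salting scheme are assumptions chosen for illustration. It measures partition sizes for a skewed key set, then again after a salt is appended to each key.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count
NUM_SALTS = 8       # hypothetical number of salt values


def partition_of(key):
    # crc32 is used because Python's built-in hash() for strings
    # is randomized per process, which would make runs differ.
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def partition_sizes(keys):
    counts = Counter(partition_of(k) for k in keys)
    return [counts.get(p, 0) for p in range(NUM_PARTITIONS)]


# Hypothetical skewed workload: one hot key dominates the data.
keys = ["hot_customer"] * 1000 + [f"customer_{i}" for i in range(200)]

# Every record for the hot key lands in the same partition.
before = partition_sizes(keys)

# Salting: append a rotating suffix so the hot key becomes several keys.
salted = [f"{k}#{i % NUM_SALTS}" for i, k in enumerate(keys)]
after = partition_sizes(salted)

print("largest partition before salting:", max(before))
print("largest partition after salting: ", max(after))
```

Salting changes the key, so an aggregation on the salted key has to be followed by a second aggregation on the original key to combine the partial results.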

Skills you'll gain

Category: Performance Tuning
Category: Apache Spark
Category: Scalability
Category: PySpark
Category: Performance Analysis
Category: Distributed Computing
Category: Debugging
Category: Data Processing
Category: Data Pipelines

Optimize Spark Performance & Throughput

Course 7, 4 hours

What you'll learn

  • Inspect Spark UI and metrics (task duration, shuffle I/O, executor CPU/mem) to find bottlenecks and recommend actionable optimizations.

  • Apply partitioning and skew mitigation (salting, custom partitioners) and reduce shuffle (broadcast joins, avoiding groupByKey, AQE) to improve parallelism.

  • Configure executors, cores, memory, dynamic allocation and parallelism/caching settings to maximize throughput while meeting defined SLA targets.
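
The executor, core, and memory settings mentioned above are often derived from node size with simple arithmetic. The sketch below shows one commonly cited rule of thumb, not a method taken from the course: about five cores per executor, one core and 1 GB reserved per node for the operating system, and roughly 10% of executor memory set aside as overhead. The helper names and the 16-core, 64 GB, four-node cluster are hypothetical; only the three `spark.executor.*` keys are real Spark settings.

```python
from dataclasses import dataclass


@dataclass
class ExecutorPlan:
    executors_per_node: int
    cores_per_executor: int
    heap_gb: int  # value intended for spark.executor.memory


def plan_executors(node_cores, node_mem_gb, cores_per_executor=5,
                   reserved_cores=1, reserved_mem_gb=1,
                   overhead_fraction=0.10):
    """Rule-of-thumb sizing; every default here is an assumption."""
    usable_cores = node_cores - reserved_cores
    executors_per_node = usable_cores // cores_per_executor
    if executors_per_node < 1:
        raise ValueError("node too small for the requested cores per executor")
    # Memory available to each executor container, heap plus overhead.
    container_gb = (node_mem_gb - reserved_mem_gb) / executors_per_node
    # Shrink the heap so that heap plus overhead fits in the container.
    heap_gb = int(container_gb / (1 + overhead_fraction))
    return ExecutorPlan(executors_per_node, cores_per_executor, heap_gb)


def to_spark_conf(plan, num_nodes):
    """Translate the plan into Spark configuration key/value pairs."""
    return {
        "spark.executor.instances": str(plan.executors_per_node * num_nodes),
        "spark.executor.cores": str(plan.cores_per_executor),
        "spark.executor.memory": f"{plan.heap_gb}g",
    }


plan = plan_executors(node_cores=16, node_mem_gb=64)
conf = to_spark_conf(plan, num_nodes=4)
print(conf)
```

Numbers like these are only a starting point; the Spark UI metrics named in the first bullet are what show whether a given workload needs fewer, larger executors or more, smaller ones.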

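Skills you'll gain

Category: Apache Spark
Category: Performance Tuning
Category: PySpark
Category: Database Management
Category: Debugging
Category: Process Optimization
Category: System Configuration
Category: Scalability
Category: Performance Analysis
Category: Resource Allocation
Category: Job Analysis

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.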

Instructors

Hurix Digital
Coursera
283 courses, 19,983 learners
Merna Elzahaby
Coursera
1 course, 22 learners

Offered by

Coursera

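Why people choose Coursera for their career

Felipe M.

Learner since 2018
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."

Jennifer J.

Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."

Larry W.

Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."

Chaitanya A.

"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."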