This course introduces distributed computing frameworks and big data visualization techniques. Learners will explore MapReduce, work with Apache Spark, implement transformations with PySpark, and use Spark SQL for large-scale analysis. The course concludes with building compelling dashboards and reports using Power BI for actionable business insights.


Data Processing, Exploratory Analysis and Visualization
This course is part of the Microsoft Big Data Management and Analytics Professional Certificate

Instructor: Microsoft
Skills you'll gain
- Big Data
- Apache Spark
- SQL
- Dashboard
- PySpark
- Distributed Computing
- Data Transformation
- Data Pipelines
- Databricks
- Self Service Technologies
- Power BI
- Business Intelligence
- Data Visualization Software
- Data Processing
- Performance Tuning

Build your expertise in Data Analysis
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate from Microsoft

There are 5 modules in this course
Distributed Computing and MapReduce Concepts explores the foundational principles that enable modern organizations to process massive datasets that have outgrown the limits of single-machine computing. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll examine how data is broken into parallel tasks and executed across clusters of machines, how the Map, shuffle, and Reduce phases work together, and how common MapReduce patterns—such as counting, filtering, joining, and aggregation—solve practical big data problems efficiently and at scale.
Included: 3 videos, 3 readings, 8 assignments
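As a rough illustration of how the Map, shuffle, and Reduce phases named in this module fit together, here is a minimal single-process sketch of the counting pattern in plain Python. It is not taken from the course materials; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are hypothetical, and a real framework would run each phase in parallel across a cluster rather than in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework does between Map and Reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, used only for illustration.
    sample = ["big data needs big clusters", "data moves to clusters"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both phases can in principle be spread over many machines; the shuffle is the step that moves data between them.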
Apache Spark Architecture and Fundamentals provides a comprehensive introduction to the distributed processing engine that revolutionized big data analytics by overcoming traditional MapReduce limitations. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll examine Spark's core components, including the driver, executors, and cluster manager, explore how in-memory processing delivers dramatic performance improvements, and learn to configure and manage Spark clusters and applications for efficient large-scale data processing.
Included: 2 videos, 3 readings, 9 assignments
Data Processing with PySpark RDDs and DataFrames focuses on practical data processing using PySpark's Python API for Apache Spark. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll implement data processing operations using both RDDs and DataFrames, develop transformation pipelines, apply common data cleaning and preparation techniques, and optimize PySpark code for better performance across enterprise-scale big data scenarios.
Included: 3 videos, 3 readings, 10 assignments
Advanced Data Processing with Spark SQL introduces Spark SQL as a powerful interface for structured data processing in distributed environments. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll master SQL operations at scale, from basic queries to complex analytical operations, learn to create and manage temporary views and tables, and optimize query performance for production workloads that would overwhelm traditional database systems.
Included: 3 videos, 3 readings, 10 assignments
Data Visualization for Big Data with Power BI introduces comprehensive visualization techniques specifically designed for big data environments using Microsoft Power BI. Through real-world examples, visual walkthroughs, hands-on labs, and guided design activities, you'll learn to connect Power BI to various big data sources, create effective visualizations for large datasets, build interactive dashboards that enable self-service analytics, and implement best practices for handling performance challenges when visualizing massive datasets.
Included: 3 videos, 3 readings, 10 assignments
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Frequently Asked Questions
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.








