Data Analysis Using Hadoop Tools

This course is part of the Big Data Processing Using Hadoop Specialization

Instructor: Karthik Shyamsunder

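5 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

2 weeks to complete at 10 hours a week

Flexible schedule

Learn at your own pace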

What you'll learn

Learn to set up and configure Hive, Pig, HBase, and Spark for efficient big data analysis and processing within the Hadoop ecosystem.
Master Hive’s SQL-like queries for data retrieval, management, and optimization using partitions and joins to enhance query performance.
Understand Pig Latin for scripting data transformations, including the use of operators like join and debug to process large datasets effectively.
Gain expertise in NoSQL databases with HBase for real-time read/write operations, and use Spark’s core programming model for fast data processing.

Details to know

Shareable certificate (can be added to your LinkedIn profile)

Assessments: 15 assignments

Taught in English

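Build your subject-matter expertise

This course is part of the Big Data Processing Using Hadoop Specialization.

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 5 modules in this course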

The course "Data Analysis Using Hadoop Tools" provides a thorough and hands-on introduction to key tools within the Hadoop ecosystem, such as Hive, Pig, HBase, and Apache Spark, for data processing, management, and analysis. Learners will gain practical experience with Hive's SQL-like interface for complex data querying, Pig Latin scripting for data transformation, and HBase's NoSQL capabilities for efficient big data management. The course also covers Apache Spark's powerful in-memory computation capabilities for high-performance data processing tasks. By the end, participants will be equipped with the skills to leverage these technologies within the Hadoop platform to address real-world big data challenges.

What makes this course unique is its comprehensive approach to integrating various Hadoop tools into a cohesive workflow. You'll not only learn how to use each tool individually but also understand how to effectively combine them to optimize data processing and analysis. Through hands-on exercises and examples, you'll gain the confidence and skills to tackle complex data challenges and extract valuable insights from big data. Whether you're looking to enhance your data analysis capabilities for work or want to deepen your knowledge of Hadoop and big data tools, this course offers valuable skills that will help you succeed.

This course provides a comprehensive overview of key tools within the Hadoop ecosystem, including Hive, Pig, HBase, and Apache Spark. You will learn how to set up and configure these technologies for data processing, management, and analysis. The course covers Hive's query execution, Pig's scripting language, and HBase's NoSQL capabilities. You'll also gain hands-on experience with Spark's core programming model for efficient big data processing. By the end, you'll be equipped to leverage these tools for optimized data analysis and management.

What's included

2 readings

In this module, we will cover MapReduce programming using Hive, a higher-level tool whose SQL-like queries are translated into MapReduce jobs.
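
To make the idea of a SQL-like query turning into MapReduce work more concrete, here is a minimal sketch in plain Python. It does not use Hive or Hadoop, and it is not how Hive is implemented; the `page_views` table, its rows, and the function names are all hypothetical. It only illustrates how a `GROUP BY` aggregation can be split into a map step, a shuffle step, and a reduce step.

```python
from collections import defaultdict

# Hypothetical rows of a table page_views(country, views).
rows = [("US", 3), ("IN", 5), ("US", 4), ("DE", 1), ("IN", 2)]


def map_phase(records):
    """Emit (key, value) pairs, as a mapper might for GROUP BY country."""
    for country, views in records:
        yield country, views


def shuffle(pairs):
    """Collect values by key, mimicking the shuffle-and-sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))


def reduce_phase(grouped):
    """Aggregate each key's values, as a reducer might for SUM(views)."""
    return {key: sum(values) for key, values in grouped.items()}


# Roughly the work behind:
#   SELECT country, SUM(views) FROM page_views GROUP BY country;
totals = reduce_phase(shuffle(map_phase(rows)))
print(totals)
```

In a real cluster the map and reduce steps would run in parallel across many machines; the sketch runs them in sequence on one small list.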

What's included

9 videos, 7 readings, 4 assignments

9 videos, 107 minutes total

Introduction - Hive 2 minutes
Hive Overview and Architecture 23 minutes
Setting up Hive 26 minutes
Simple Hive Example 20 minutes
Loading Data 9 minutes
Hive Statements 11 minutes
Partitions 6 minutes
Joins 8 minutes
Summary - Hive 2 minutes

7 readings, 105 minutes total

Hive Overview and Architecture 10 minutes
Setting up Hive 10 minutes
Simple Hive Example 10 minutes
Loading Data 10 minutes
Hive Statement 10 minutes
Partitions and Joins in Hive 15 minutes
Self-Reflective Reading: Balancing Hive, Java MapReduce, and Pig in Hadoop Architectures 40 minutes

4 assignments, 105 minutes total

Data Analysis using Hive 60 minutes
Introduction to Hive: Overview, Architecture, and Setup 15 minutes
Working with Hive: Basic Examples, Data Loading, and Hive Statements 15 minutes
Advanced Hive: Partitions, Joins, and Summary 15 minutes

In this module, we will cover MapReduce programming using Pig, a higher-level tool whose Pig Latin scripts are translated into MapReduce jobs.
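
Pig Latin scripts describe a step-by-step dataflow. The sketch below mimics that style in plain Python, with a comparable Pig Latin statement shown as a comment above each step. It does not run Pig; the `logs` relation, its fields, and the 50-byte threshold are hypothetical and chosen only for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Stand-in for: logs = LOAD 'logs' AS (user:chararray, bytes:int);
logs = [("ann", 120), ("bob", 30), ("ann", 80), ("cy", 500), ("bob", 70)]

# big = FILTER logs BY bytes >= 50;
big = [rec for rec in logs if rec[1] >= 50]

# by_user = GROUP big BY user;
# groupby() needs its input sorted by the grouping key.
by_user = {
    user: list(recs)
    for user, recs in groupby(sorted(big, key=itemgetter(0)), key=itemgetter(0))
}

# totals = FOREACH by_user GENERATE group, SUM(big.bytes);
totals = [(user, sum(size for _, size in recs)) for user, recs in by_user.items()]
print(totals)
```

Each named intermediate result (`big`, `by_user`, `totals`) plays the role of a Pig relation, which is what makes this scripting style easy to inspect one step at a time.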

What's included

9 videos, 7 readings, 4 assignments

9 videos, 132 minutes total

Introduction - Pig 2 minutes
Pig: Overview and Architecture 22 minutes
Setting up Pig 8 minutes
Grunt Interactive Shell 18 minutes
Pig Latin Language Basics 10 minutes
Pig Data Types and Schema 15 minutes
Core Relational Operators 14 minutes
Join Operators 26 minutes
Debug Operators 17 minutes

7 readings, 102 minutes total

Pig: Overview and Architecture 15 minutes
Grunt Interactive Shell 7 minutes
Exploring Pig Latin Basics: Data Structures, Syntax, and Commands 10 minutes
Understanding Schemas, Data Types, and Functions in Apache Pig 10 minutes
Core Relational Operators in Pig Latin: An Overview 10 minutes
Exploring Relational Join Operators in Apache Pig 10 minutes
Self-Reflective Reading: Hive vs. Pig for Your Big Data Strategy 40 minutes

4 assignments, 105 minutes total

Data Analysis using Pig 60 minutes
Introduction to Pig: Overview, Architecture, and Setup 15 minutes
Pig Fundamentals: Grunt Shell, Pig Latin Basics, and Data Types 15 minutes
Pig Advanced: Core Operators, Joins, Debugging, and Summary 15 minutes

In this module, we will start with a primer on NoSQL databases and then dive into HBase, a NoSQL database built on top of Hadoop that allows random, real-time read/write access to your big data.
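
As a rough illustration of random read/write access by row key, the sketch below models a table as nested dictionaries: row key, then a `family:qualifier` column, then timestamped versions of a value. This is a toy in-memory model, not the HBase API; the `ToyTable` class, its methods, and the sample keys are hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for an HBase-style table (not the HBase API)."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        """Write one version of a cell, addressed directly by row key."""
        self._rows[row][column][ts] = value

    def get(self, row, column):
        """Return the newest version of a cell, or None if absent."""
        versions = self._rows.get(row, {}).get(column, {})
        return versions[max(versions)] if versions else None

    def delete(self, row, column):
        """Remove every version of a cell; a no-op if it does not exist."""
        self._rows.get(row, {}).pop(column, None)


table = ToyTable()
table.put("user1", "info:email", "a@old.example", ts=1)
table.put("user1", "info:email", "a@new.example", ts=2)

latest = table.get("user1", "info:email")        # newest version wins
missing = table.get("user2", "info:email")       # unknown row key
table.delete("user1", "info:email")
after_delete = table.get("user1", "info:email")
print(latest, missing, after_delete)
```

The point of the sketch is the access pattern: every read and write goes straight to a single row key, rather than scanning a whole dataset as a batch job would.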

What's included

8 videos, 3 readings, 3 assignments

8 videos, 175 minutes total

Introduction - HBase 2 minutes
NoSQL Primer 35 minutes
HBase Overview and Architecture 31 minutes
Setting up HBase 25 minutes
HBase Data Model 16 minutes
HBase Shell 33 minutes
CRUD operations using Java API 31 minutes
Summary - HBase 3 minutes

3 readings, 60 minutes total

HBase Overview and Architecture 10 minutes
HBase Data Model 10 minutes
Self-Reflective Reading: Coexisting Databases: Balancing NoSQL and RDBMS in Modern Applications 40 minutes