Engineer, Validate, and Govern ML Data

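This course is part of multiple programs.

Instructors: ansrsource instructors

Access provided by New York State Department of Labor

1 module

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

2 hours to complete

Flexible schedule

Learn at your own pace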

Skills you'll gain

Data Governance

Tools you'll learn

Details to know

Shareable certificate

Add to your LinkedIn profile

Assignments

3 assignments¹

AI Graded, see disclaimer

Taught in English

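Build your subject-matter expertise

This course is available as part of multiple programs.

When you enroll in this course, you'll also be asked to select a specific program.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There is 1 module in this course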

This short course helps you build and validate ML-ready data pipelines with confidence. You’ll start by learning how to design ETL workflows that ingest, clean, and partition large datasets using tools like Airflow and Spark. You’ll see how real teams manage click-stream logs, handle nulls, and prepare partitioned training data at scale. Next, you’ll evaluate data quality, governance, and lineage so your pipelines remain trustworthy and reproducible. You’ll work with practical techniques like schema drift checks, expectations suites, and audit-ready lineage records. Through short videos, applied readings, hands-on practice, and a final graded assessment, you’ll walk away knowing how to engineer reliable pipelines and validate them for production use.
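
To make the ideas in the description concrete, here is a minimal sketch in plain Python of a schema drift check followed by a clean-and-partition step for click-stream records. It is not taken from the course materials: the field names (`user_id`, `url`, `event_date`), the `EXPECTED_SCHEMA` mapping, and both helper functions are hypothetical, and a production pipeline would more likely run this logic inside tools such as Spark or Airflow rather than over in-memory lists.

```python
from collections import defaultdict

# Hypothetical expected schema for one click-stream event (field name -> type).
EXPECTED_SCHEMA = {"user_id": str, "url": str, "event_date": str}


def schema_drift(record, expected=EXPECTED_SCHEMA):
    """Return a list of human-readable drift problems for one record."""
    problems = []
    # Fields the schema requires but the record lacks.
    for field in sorted(expected.keys() - record.keys()):
        problems.append(f"missing field: {field}")
    # Fields the record carries but the schema does not know about.
    for field in sorted(record.keys() - expected.keys()):
        problems.append(f"unexpected field: {field}")
    # Shared fields whose non-null value has the wrong type.
    for field in sorted(expected.keys() & record.keys()):
        value = record[field]
        if value is not None and not isinstance(value, expected[field]):
            problems.append(f"wrong type: {field}")
    return problems


def clean_and_partition(records, partition_key="event_date"):
    """Reject records with drift or nulls, then group the rest by partition key."""
    partitions = defaultdict(list)
    rejected = []
    for record in records:
        has_null = any(value is None for value in record.values())
        if schema_drift(record) or has_null:
            rejected.append(record)
        else:
            partitions[record[partition_key]].append(record)
    return dict(partitions), rejected


if __name__ == "__main__":
    events = [
        {"user_id": "u1", "url": "/home", "event_date": "2024-01-01"},
        {"user_id": None, "url": "/cart", "event_date": "2024-01-01"},
        {"user_id": "u2", "url": "/home", "event_date": "2024-01-02", "referrer": "ad"},
    ]
    parts, rejected = clean_and_partition(events)
    print(sorted(parts), len(rejected))
```

Keeping rejected records in a separate list, rather than dropping them silently, leaves a simple trail that an audit or lineage record could later refer to.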