In this one-hour, project-based course, you will learn to build a logistic regression model with PySpark MLlib that classifies patients as either diabetic or non-diabetic, using the popular Pima Indian Diabetes data set. The classifier is the simple logistic regression model from the PySpark machine learning library. We will carry out the entire project in the Google Colab environment, installing PySpark there. You will need a free Gmail account to complete this project. Please be aware that the data set and the model in this project cannot be used in real life. We are using this data for educational purposes only.

What you'll learn
Learn to build and train a logistic regression classifier using PySpark MLlib
Learn to set up PySpark in the Google Colab environment
Learn to work with PySpark DataFrames
Desktop only

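Learn, practice, and apply job-ready skills in less than 2 hours
- Receive training from industry experts
- Gain hands-on experience solving real-world job tasks
- Build confidence using the latest tools and technologies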

About this Guided Project
Learn step-by-step
In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:
- Introduction and install dependencies
- Clone and explore the data set
- Data cleaning and preparation
- Correlation analysis and feature selection
- Split the data set and build the logistic regression model
- Evaluate and save the model
- Model prediction on a new set of unlabelled data
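How you'll learn
Skill-based, hands-on learning
Practice new skills by completing job-related tasks.
Expert guidance
Follow along with pre-recorded videos from experts using a unique side-by-side interface.
No downloads or installation required
Access the tools and resources you need in a pre-configured cloud workspace.
Available only on desktop
This Guided Project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices.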
Learner reviews
- 5 stars: 72.72%
- 4 stars: 13.63%
- 3 stars: 13.63%
- 2 stars: 0%
- 1 star: 0%
Showing 3 of 22
Reviewed on Nov 2, 2022
Solid introduction to pyspark MLLib but left much would have liked to see more model evaluation and comparison to at least another model.
Reviewed on Oct 16, 2021
Thank You for making course so simple to learn how to develop prediction model
Reviewed on Aug 21, 2024
Understand the concept easily and practice it at the same time.







