This course addresses machine learning (ML) with small datasets, a growing concern as ML's data demands keep rising. Despite ML's success across many fields, many domains cannot supply large labeled datasets because of cost, privacy, or security constraints. As big data becomes the norm, learning efficiently from smaller datasets is crucial. The course, aimed at graduate students with some ML experience, focuses on modern deep learning techniques for small-data applications in healthcare, the military, and various industry sectors. Prerequisites include familiarity with ML and proficiency in Python; deep learning experience is helpful but not required.

Details to know
Add to your LinkedIn profile
8 assignments

There are 7 modules in this course
In this module, we will explore the pivotal role of data as the foundation for machine learning algorithms. We begin by discussing the significance of large datasets in training deep learning models as these datasets are crucial for the models’ successful application and effectiveness. We will also delve into the challenges associated with small datasets, particularly in sensitive fields such as healthcare and defense, where data acquisition is often difficult, costly, or subject to stringent privacy and security regulations. To address these challenges, the course will introduce various strategies for making the most of limited data, including data-efficient machine learning techniques and the use of synthetic data augmentation. Additionally, we will present the course structure and discuss a curated selection of research papers that align with and enrich our course topics.
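The synthetic-augmentation idea mentioned above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration in NumPy (not code from the course): it expands a tiny image batch with horizontal flips and additive Gaussian noise, tripling the effective sample count.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny "dataset": 4 grayscale images of shape 8x8 with values in [0, 1].
images = rng.random((4, 8, 8))

def augment(batch, rng):
    """Return the batch plus horizontally flipped and noisy copies."""
    flipped = batch[:, :, ::-1]                                   # horizontal flip
    noisy = np.clip(batch + rng.normal(0, 0.05, batch.shape), 0.0, 1.0)
    return np.concatenate([batch, flipped, noisy], axis=0)

augmented = augment(images, rng)
print(augmented.shape)  # 3x the original sample count: (12, 8, 8)
```

In practice the same idea is applied on the fly during training (e.g., with `torchvision.transforms`), so each epoch sees a slightly different version of the small dataset.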
What's included
2 videos, 13 readings, 1 assignment
2 videos • Total 16 minutes
- Data Matters • 8 minutes
- Setting Up Your Local Environment • 8 minutes
13 readings • Total 81 minutes
- Course Overview • 1 minute
- Syllabus - Machine Learning for Small Data • 10 minutes
- Academic Integrity • 1 minute
- Data Matters—Especially for Deep Learning • 2 minutes
- Data-Parameters-Power Scaling in AI Model • 5 minutes
- Exponential Growth of Training Data • 10 minutes
- Exponential Growth of Model Complexity • 5 minutes
- Exponential Growth in Computational Resources • 5 minutes
- The Scale Paradox: When Smaller ML Models Outperform Giants • 5 minutes
- Large Datasets for Deep Learning • 10 minutes
- What is Small Data? • 2 minutes
- Installing PyTorch • 5 minutes
- Large vs. Small Datasets in Machine Learning • 20 minutes
1 assignment • Total 10 minutes
- Module 1 Quiz • 10 minutes
In this module, we will delve into the core aspects of machine learning with a focus on the importance of data, particularly in deep learning applications. We start by emphasizing how large datasets are essential for training deep learning models effectively, as they enable the models to capture and learn from complex patterns, improving their overall performance. Additionally, we'll explore the intersection of data availability, computational power, and model capacity, highlighting how these elements interact to refine model accuracy and efficiency. Furthermore, the module will cover computing advancements beyond Moore's Law and their impact on machine learning, illustrating how modern hardware like CPUs, GPUs, and TPUs enhance computational capabilities critical for training sophisticated models. We'll also delve into scaling laws in deep learning, discussing empirical findings that show how model performance improves predictably with increases in dataset size and model complexity, although with diminishing returns. To provide a deeper theoretical foundation, we'll examine the Vapnik-Chervonenkis (VC) theory, which offers insights into how learning curves and model complexity relate to a model’s ability to generalize from training data. This discussion will extend to practical applications and theoretical limitations, helping to frame machine learning challenges in terms of data sufficiency, model fitting, and the balance between bias and variance. By the end of this module, students will have a thorough understanding of the dynamic interplay between these factors and their implications for machine learning practice and research.
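The interplay between model capacity, training error, and generalization described above can be seen in a small experiment. The sketch below (an illustrative example, not course code) fits polynomials of increasing degree to ten noisy samples of a sine curve: training error can only shrink as capacity grows, while test error typically bottoms out at a moderate degree, which is the classic bias-variance picture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Noisy samples from a smooth target function.
def target(x):
    return np.sin(2 * np.pi * x)

x_train = rng.random(10)
y_train = target(x_train) + rng.normal(0, 0.1, 10)
x_test = np.linspace(0, 1, 200)
y_test = target(x_test)

errors = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)     # least-squares fit
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    errors[degree] = (train_err, test_err)

for degree, (tr, te) in errors.items():
    print(f"degree {degree}: train MSE {tr:.4f}, test MSE {te:.4f}")
```

Degree 9 can interpolate all ten points (near-zero training error), yet its test error generally suffers: a low-bias, high-variance fit, exactly the regime where small datasets hurt most.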
What's included
1 video, 19 readings, 2 assignments, 1 app item
1 video • Total 9 minutes
- Machine Learning Model Performance • 9 minutes
19 readings • Total 144 minutes
- Ingredients Relationship • 10 minutes
- Computing Power: Growth Beyond Moore's Law • 10 minutes
- Scaling Laws • 5 minutes
- Learning Curves • 15 minutes
- Model Capacity Required to Fit Data • 3 minutes
- Model Performance and Dataset Size • 2 minutes
- Model Performance and Model Capacity • 2 minutes
- Bias-Variance Trade-Off • 15 minutes
- From a Linear Algebra Perspective • 2 minutes
- Underdetermined Problems and Overparameterized Models • 8 minutes
- Revisiting Bias-Variance with Double Descent • 8 minutes
- Comparison of Learning Paradigms • 15 minutes
- A Learning Machine • 2 minutes
- How Do We Characterize Model Complexity? • 1 minute
- Vapnik–Chervonenkis (VC) Dimension - Shattering • 10 minutes
- Notions of VC Dimension • 10 minutes
- Examples of Shattering and VC Dimension • 10 minutes
- VC Dimension in Neural Networks • 15 minutes
- Resources • 1 minute
2 assignments • Total 60 minutes
- Calculating the VC Dimension of SVM Models • 30 minutes
- Module 2 Quiz • 30 minutes
1 app item • Total 10 minutes
- Examples of Learning Machines • 10 minutes
In this module, we’ll explore transfer learning and its role in data-efficient machine learning, where models leverage knowledge from previous tasks to improve performance on new, related tasks. We’ll also cover various types of transfer learning, including transductive, inductive, and unsupervised methods, each addressing different challenges and applications. We’ll discuss some practical steps for implementing transfer learning, such as selecting and fine-tuning pre-trained models, to reduce reliance on large datasets. We’ll also examine data-driven and physics-based simulations for data augmentation, highlighting their use in enhancing training under constrained conditions. Finally, we’ll review key papers on transfer learning techniques to address data scarcity and improve model performance.
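A minimal sketch of the fine-tuning recipe described above: freeze a pretrained feature extractor and train only a new linear head on the small target dataset. Here a fixed random projection stands in for the pretrained backbone (a real pipeline would load, say, an ImageNet-trained CNN), and the head is fit in closed form with ridge regression; all names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for a frozen pretrained backbone: a fixed random projection
# followed by a nonlinearity. In practice this would be a network trained
# on a large source dataset, with its weights held fixed.
W_backbone = rng.normal(size=(20, 64))

def features(x):
    return np.tanh(x @ W_backbone)   # frozen: never updated

# Small labeled target dataset: 30 examples, 20 raw features, binary labels.
x_small = rng.normal(size=(30, 20))
y_small = (x_small[:, 0] + x_small[:, 1] > 0).astype(float)

# "Fine-tune" only the linear head, via ridge regression in closed form.
phi = features(x_small)
lam = 1e-2
head = np.linalg.solve(phi.T @ phi + lam * np.eye(64), phi.T @ y_small)

preds = (features(x_small) @ head > 0.5).astype(float)
accuracy = (preds == y_small).mean()
print(f"training accuracy of the linear head: {accuracy:.2f}")
```

Because only the head's 64 parameters are learned, 30 labeled examples suffice; training the whole backbone from scratch on so little data would badly overfit.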
What's included
1 video, 15 readings, 1 assignment
1 video • Total 6 minutes
- Transfer Learning • 6 minutes
15 readings • Total 72 minutes
- Data-efficient Machine Learning • 10 minutes
- Leveraging Pre-trained Models for Efficient Machine Learning • 2 minutes
- Vanilla Transfer Learning • 2 minutes
- Types of Transfer Learning • 2 minutes
- Transductive Transfer Learning Algorithms • 10 minutes
- Inductive Transfer Learning Algorithms • 10 minutes
- Transductive Examples I • 5 minutes
- Transductive Examples II • 5 minutes
- Transductive Examples III • 5 minutes
- Inductive Examples • 5 minutes
- Multi-Task Learning & Meta-Learning • 5 minutes
- Synthetic Data Augmentation • 2 minutes
- Data-Driven Simulation • 3 minutes
- Physics-Based Simulation • 2 minutes
- Physics-Based Simulation Examples • 4 minutes
1 assignment • Total 15 minutes
- Module 3 Quiz • 15 minutes
In this module, you'll explore the concept of domain adaptation, a key aspect of transductive transfer learning. Domain adaptation helps you train models that perform well on a target domain, even when its data distribution differs from the source domain. You'll learn about the challenges of domain shift and labeled data scarcity and how these can impact model performance. We'll cover different types of domain adaptation, including unsupervised, semi-supervised, and supervised approaches. You'll also dive into techniques like Deep Domain Confusion (DDC), which integrates domain confusion loss into neural networks to create domain-invariant features. Additionally, you'll discover advanced methods such as Domain-Adversarial Neural Networks (DANNs), Correlation Alignment (CORAL), and Deep Adaptation Networks (DANs) that build on DDC to enhance domain adaptation by aligning feature distributions and capturing complex dependencies across network layers.
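The domain confusion loss in DDC is a maximum mean discrepancy (MMD) penalty between source and target feature batches. The sketch below computes a biased RBF-kernel MMD estimate in NumPy on synthetic features; the distributions and batch sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

def mmd_rbf(x, y, gamma=1.0):
    """Biased estimate of squared MMD with an RBF kernel."""
    def kernel(a, b):
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

# Source features and two candidate target batches: one drawn from the
# same distribution, one mean-shifted (simulating domain shift).
source = rng.normal(0.0, 1.0, size=(100, 5))
target_same = rng.normal(0.0, 1.0, size=(100, 5))
target_shifted = rng.normal(2.0, 1.0, size=(100, 5))

print(f"MMD^2, same domain:    {mmd_rbf(source, target_same):.4f}")
print(f"MMD^2, shifted domain: {mmd_rbf(source, target_shifted):.4f}")
```

In DDC this quantity is added to the classification loss and minimized during training, pushing the network toward features on which source and target are indistinguishable.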
What's included
1 video, 10 readings, 1 assignment
1 video • Total 6 minutes
- Domain Adaptation • 6 minutes
10 readings • Total 143 minutes
- Domain Adaptation: Background • 1 minute
- Unsupervised, Semi-Supervised & Supervised • 10 minutes
- Deep Domain Confusion • 8 minutes
- Related Work Based on DDC • 2 minutes
- Deep Domain Confusion Architecture • 10 minutes
- Implementation & Architecture • 10 minutes
- Mathematical Formulation • 5 minutes
- An Example Dataset: Office-31 • 2 minutes
- An Example DDC Experiment • 5 minutes
- Transfer Learning Practice Activity • 90 minutes
1 assignment • Total 10 minutes
- Module 4 Quiz • 10 minutes
In this module, we’ll explore weak supervision, a technique for training machine learning models with limited, noisy, or imprecise labels. You'll learn about different types of weak supervision and why they are crucial in small data domains. We’ll cover techniques such as semi-supervised learning, self-supervised learning, and active learning, along with advanced methods such as Temporal Ensembling and the Mean Teacher approach. Additionally, you'll discover Bayesian deep learning and active learning strategies to improve training efficiency. Finally, you'll see real-world applications in fields like medical imaging, NLP, fraud detection, autonomous driving, and biology.
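As a concrete taste of the active learning strategy mentioned above, the uncertainty-sampling sketch below (illustrative only; the probabilities are invented) ranks an unlabeled pool by predictive entropy and queries the examples the current model is least sure about.

```python
import numpy as np

def entropy(probs):
    """Predictive entropy per example; higher means more uncertain."""
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

# Hypothetical class-probability predictions for an unlabeled pool.
pool_probs = np.array([
    [0.98, 0.01, 0.01],   # confident prediction
    [0.34, 0.33, 0.33],   # nearly uniform: very uncertain
    [0.70, 0.20, 0.10],
    [0.50, 0.49, 0.01],
])

# Spend the labeling budget on the most uncertain examples.
budget = 2
query_ids = np.argsort(entropy(pool_probs))[::-1][:budget]
print(f"examples to send to the annotator: {sorted(query_ids.tolist())}")
```

Repeating this loop (train, rank, query, retrain) concentrates the scarce labeling effort where it changes the model most, which is the core appeal of active learning in small-data settings.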
What's included
1 video, 8 readings, 1 assignment
1 video • Total 7 minutes
- What is Weak Supervision? • 7 minutes
8 readings • Total 54 minutes
- Types of Weak Supervision • 6 minutes
- Semi-Supervised Learning • 10 minutes
- Self-Supervised Learning • 15 minutes
- Active Learning • 6 minutes
- Applications of Weak Supervision • 2 minutes
- Case Study: Medical Imaging • 5 minutes
- Case Study: Autonomous Driving • 5 minutes
- Case Study: Natural Language Processing • 5 minutes
1 assignment • Total 30 minutes
- Module 5 Quiz • 30 minutes
In this module, you'll explore how Zero-Shot Learning (ZSL) enables models to recognize new categories without having seen any examples of those categories during training. This is achieved by leveraging intermediate semantic descriptions, such as attributes, shared between seen and unseen classes. You'll also learn about the importance of regularization in preventing overfitting and improving generalization, as well as how generative models like GANs and VAEs enhance ZSL by synthesizing unseen class data. Additionally, we'll examine Generalized Zero-Shot Learning (GZSL), which tests models on both seen and unseen classes, making the task more challenging and realistic. By the end of this module, you'll have a solid understanding of how ZSL and its extensions can be applied to various machine learning tasks.
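The attribute-based ZSL decision rule described above can be sketched directly: predict attribute scores for an input, then assign the class whose attribute signature is nearest, even if that class had no training images. The class names and attribute vectors below are invented for illustration.

```python
import numpy as np

# Per-class attribute signatures (hypothetical attributes:
# [has_stripes, has_hooves, is_large]).
class_attributes = {
    "zebra":  np.array([1.0, 1.0, 1.0]),   # unseen during training
    "horse":  np.array([0.0, 1.0, 1.0]),   # seen
    "tiger":  np.array([1.0, 0.0, 1.0]),   # seen
}

def predict_class(attribute_scores, signatures):
    """ZSL decision rule: nearest attribute signature wins."""
    names = list(signatures)
    dists = [np.linalg.norm(attribute_scores - signatures[n]) for n in names]
    return names[int(np.argmin(dists))]

# Attribute scores a trained attribute predictor might emit for a zebra
# image: stripes and hooves both detected, despite zero zebra training data.
scores = np.array([0.9, 0.8, 0.9])
print(predict_class(scores, class_attributes))  # → "zebra"
```

The attribute predictor itself is trained only on seen classes; the shared semantic space is what lets knowledge transfer to unseen ones.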
What's included
1 video, 9 readings, 1 assignment
1 video • Total 5 minutes
- Generalized Zero-Shot Learning • 5 minutes
9 readings • Total 71 minutes
- Introduction to Zero-Shot Learning • 3 minutes
- ZSL: Notation and Problem Setup • 3 minutes
- Learning a Linear Predictor for Seen Classes • 10 minutes
- Problem Extension for ZSL: From Seen to Unseen Classes • 15 minutes
- An Embarrassingly Simple Approach to ZSL • 10 minutes
- ZSL with Generative Models • 10 minutes
- Generalized Zero-Shot Learning (GZSL) • 10 minutes
- Zero-Shot Learning: Semantic Autoencoders • 5 minutes
- Generalized ZSL With Generative Models • 5 minutes
1 assignment • Total 30 minutes
- Module 6 Quiz • 30 minutes
This module focuses on Few-Shot Learning (FSL), a critical paradigm in machine learning that enables models to classify new examples with only a small number of labeled instances. Unlike traditional deep learning models that require vast amounts of labeled data, FSL mimics the human ability to generalize from limited examples, making it highly useful for tasks like image classification, object detection, and natural language processing (NLP). The lecture introduces Matching Networks, a metric-based learning approach designed to solve one-shot learning problems by learning a similarity function that maps new examples to previously seen labeled instances. Students will gain an in-depth understanding of how nearest-neighbor approaches, differentiable embedding functions, and attention mechanisms help in optimizing few-shot learning models. Through discussions, theoretical formulations, and real-world applications, this lecture equips students with practical insights into how AI can function effectively in data-scarce environments.
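The core of Matching Networks, classifying a query via softmax attention over a labeled support set, reduces to a few lines once embeddings are given. The sketch below uses cosine similarity over hypothetical 2-D embeddings; a real Matching Network would learn the embedding function end to end.

```python
import numpy as np

def matching_predict(query, support_x, support_y, n_classes):
    """One-shot prediction via softmax attention over the support set,
    using cosine similarity as in Matching Networks."""
    def cosine(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    sims = np.array([cosine(query, s) for s in support_x])
    attention = np.exp(sims) / np.exp(sims).sum()        # softmax weights
    # Class probabilities: attention-weighted sum of one-hot support labels.
    probs = np.zeros(n_classes)
    for weight, label in zip(attention, support_y):
        probs[label] += weight
    return probs

# Hypothetical embeddings: one labeled example per class (3-way, 1-shot).
support_x = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
support_y = [0, 1, 2]
query = np.array([0.9, 0.1])

probs = matching_predict(query, support_x, support_y, n_classes=3)
print(probs.argmax())   # the query is nearest the class-0 support example
```

Because the prediction is a differentiable function of the embeddings, the whole pipeline can be trained with gradient descent on episodes that mimic the one-shot test condition.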
What's included
1 video, 7 readings, 1 assignment
1 video • Total 6 minutes
- Introduction to Few-Shot Learning • 6 minutes
7 readings • Total 46 minutes
- What is Few-Shot Learning? • 10 minutes
- Introduction to One-Shot Learning • 2 minutes
- Matching Networks: An Approach to One-Shot Learning • 10 minutes
- Training Matching Networks • 3 minutes
- Improving Few-Shot Visual Classification • 10 minutes
- Enhancing Few-Shot Image Classification With Unlabeled Examples • 10 minutes
- Congratulations • 1 minute
1 assignment • Total 30 minutes
- Module 7 Quiz • 30 minutes
Instructor

Offered by

Founded in 1898, Northeastern is a global research university with a distinctive, experience-driven approach to education and discovery. The university is a leader in experiential learning, powered by the world’s most far-reaching cooperative education program. The spirit of collaboration guides a use-inspired research enterprise focused on solving global challenges in health, security, and sustainability.
Frequently asked questions
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
When you purchase a Certificate you get access to all course materials, including graded assignments. Upon completing the course, your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can't afford the enrollment fee. If financial aid or a scholarship is available for your selected learning program, you'll find a link to apply on the description page.
More questions
Financial aid available

