Packt

Cutting-Edge Topics in Deep Reinforcement Learning

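Included with Coursera Plus

Gain insight into a topic and learn the fundamentals.
Advanced level

Recommended experience

7 hours to complete
Flexible schedule
Learn at your own pace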

What you'll learn

  • Understand continuous action spaces and their applications in deep reinforcement learning

  • Master trust region methods for stable policy optimization in RL

  • Explore black-box optimization techniques to solve complex RL problems

Details to know

Shareable certificate

Add to your LinkedIn profile

Recently updated!

April 2026

Assessments

8 assignments

Taught in English

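Build your subject-matter expertise

This course is part of the Deep Reinforcement Learning Hands-On Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 8 modules in this course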

This module introduces advanced reinforcement learning techniques for environments with continuous action spaces. Learners will explore the A2C method, analyze its performance, and implement practical solutions for training agents in such domains. Hands-on coding examples and experimental results will deepen understanding of policy gradient methods in continuous settings.

What's included

1 video, 5 readings, 1 assignment
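
To give a feel for the continuous-action module above, here is a minimal sketch of how a policy gradient method such as A2C can handle real-valued actions. It assumes the policy outputs the mean and standard deviation of a Gaussian for a single scalar action; the function names and the clipping range are illustrative and are not taken from the course materials.

```python
import math
import random


def gaussian_log_prob(action, mean, std):
    """Log-density of a scalar action under N(mean, std^2)."""
    var = std * std
    return (
        -((action - mean) ** 2) / (2.0 * var)
        - math.log(std)
        - 0.5 * math.log(2.0 * math.pi)
    )


def sample_action(mean, std, low, high, rng):
    """Sample from the Gaussian policy, then clip to the valid action range."""
    raw = rng.gauss(mean, std)
    return max(low, min(high, raw))


def policy_loss(log_prob, advantage):
    """Policy gradient loss for one step: minimizing it raises the
    log-probability of actions with positive advantage."""
    return -log_prob * advantage


rng = random.Random(0)  # fixed seed so the sketch is reproducible
action = sample_action(mean=0.0, std=1.0, low=-1.0, high=1.0, rng=rng)
loss = policy_loss(gaussian_log_prob(action, 0.0, 1.0), advantage=2.0)
```

In a full agent the mean and standard deviation would come from a neural network and the advantage from a learned value estimate; both are fixed numbers here to keep the example self-contained.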

This module explores advanced techniques for stabilizing policy gradient methods in deep reinforcement learning. Learners will compare and contrast Proximal Policy Optimization (PPO), Trust Region Policy Optimization (TRPO), and ACKTR, examining their theoretical foundations and practical performance. By the end, you will understand how these methods improve training stability and efficiency.

What's included

1 video, 4 readings, 1 assignment
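
As an illustration of the stabilization idea behind the trust region module above, the sketch below computes the clipped surrogate objective commonly associated with PPO for a single sample. The function name and the default clipping value of 0.2 are assumptions made for this example, not details from the course.

```python
import math


def ppo_clipped_objective(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Clipped surrogate objective for one (state, action) sample.

    The probability ratio between the new and old policy is clipped to
    [1 - clip_eps, 1 + clip_eps]; taking the minimum removes any incentive
    to move the policy far from the one that collected the data.
    """
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


# The new policy makes the action twice as likely, but the gain is capped.
capped = ppo_clipped_objective(math.log(2.0), 0.0, advantage=1.0)
```

In training, this quantity would be averaged over a batch and maximized. TRPO and ACKTR constrain the policy update in different ways that this sketch does not cover.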

This module introduces black-box optimization techniques in reinforcement learning, highlighting their principles and recent applications to complex environments. Learners will explore practical implementations using evolutionary strategies and genetic algorithms, and analyze performance results on benchmark tasks such as CartPole and HalfCheetah.

What's included

1 video, 4 readings, 1 assignment
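
The black-box approach in the module above can be sketched with a simple evolution strategy. This example assumes a toy quadratic fitness function in place of an RL environment such as CartPole or HalfCheetah, and the function names and hyperparameters are hypothetical choices for illustration only.

```python
import random


def evolution_strategy(fitness, theta, sigma=0.1, lr=0.05,
                       population=50, iterations=200, seed=0):
    """Maximize `fitness` using only function evaluations (no gradients).

    Each iteration perturbs the parameters with Gaussian noise, scores every
    perturbed candidate, and moves the parameters toward the noise directions
    that scored above the population average.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(iterations):
        noises, scores = [], []
        for _ in range(population):
            eps = [rng.gauss(0.0, 1.0) for _ in theta]
            candidate = [t + sigma * e for t, e in zip(theta, eps)]
            noises.append(eps)
            scores.append(fitness(candidate))
        baseline = sum(scores) / population  # reduces estimator variance
        for i in range(len(theta)):
            grad_est = sum(
                (s - baseline) * n[i] for s, n in zip(scores, noises)
            ) / (population * sigma)
            theta[i] += lr * grad_est
    return theta


def toy_fitness(params):
    """Stand-in for an episode return; the maximum is at (3, -1)."""
    x, y = params
    return -((x - 3.0) ** 2) - ((y + 1.0) ** 2)


best = evolution_strategy(toy_fitness, [0.0, 0.0])
```

For an actual RL task, `fitness` would run one or more episodes with the candidate policy parameters and return the total reward.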

This module delves into advanced exploration strategies in reinforcement learning, highlighting the exploration/exploitation dilemma and presenting alternative methods such as random exploration, noisy networks, and network distillation. Learners will experiment with these techniques in the MountainCar environment and compare their effectiveness using both DQN and PPO algorithms.

What's included

1 video, 6 readings, 1 assignment

This module introduces reinforcement learning with human feedback (RLHF), a technique for training agents when explicit reward functions are difficult to define. Learners will explore the RLHF pipeline, including data labeling, reward model training, and integration with reinforcement learning algorithms. Real-world applications, such as training large language models, are also discussed.

What's included

1 video, 6 readings, 1 assignment
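
One common way to train the reward model mentioned in the RLHF module above is from pairwise preference labels. The sketch below assumes that setup (a labeler marks one of two trajectories as preferred) and uses the standard Bradley-Terry style loss; the course may use a different formulation, and the function name is illustrative.

```python
import math


def preference_loss(reward_preferred, reward_rejected):
    """Pairwise preference loss: -log(sigmoid(r_preferred - r_rejected)).

    Computed as softplus(-diff) in a numerically stable form, so large
    reward gaps do not overflow math.exp.
    """
    diff = reward_preferred - reward_rejected
    return math.log1p(math.exp(-abs(diff))) + max(-diff, 0.0)


# The loss is small when the model already ranks the preferred item higher,
# and large when it ranks the pair the wrong way round.
agrees = preference_loss(2.0, -1.0)
disagrees = preference_loss(-1.0, 2.0)
```

Minimizing this loss over many labeled pairs pushes the reward model to score preferred behavior higher, and the trained model can then supply rewards to a standard RL algorithm.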

This module explores advanced model-based reinforcement learning techniques through the lens of AlphaGo Zero and MuZero. Learners will examine Monte Carlo Tree Search (MCTS), neural network architectures, and the process of training agents for board games like Connect 4. Practical implementation details and evaluation strategies are also covered.

What's included

1 video, 11 readings, 1 assignment
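
The tree search in the module above depends on a rule for choosing which move to explore next. A minimal sketch of the PUCT-style selection rule used in AlphaGo Zero style MCTS follows; the dictionary layout, function names, and the `c_puct` default are assumptions made for this example rather than the course's implementation.

```python
import math


def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.0):
    """Mean action value plus an exploration bonus.

    The bonus grows with the network's prior for the move and shrinks as
    the move accumulates visits.
    """
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + bonus


def select_child(children, c_puct=1.0):
    """Return the action with the highest PUCT score.

    `children` maps each action to a dict with keys "q", "prior", "visits".
    """
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            children[a]["q"], children[a]["prior"],
            parent_visits, children[a]["visits"], c_puct,
        ),
    )


# Two columns in a Connect 4 position: one well explored, one untried
# but strongly favored by the policy prior.
children = {
    0: {"q": 0.5, "prior": 0.2, "visits": 10},
    1: {"q": 0.0, "prior": 0.8, "visits": 0},
}
chosen = select_child(children)
```

Here the untried move wins the selection because its exploration bonus outweighs the known value of the visited move. A complete search would also expand nodes, evaluate them with the network, and back up values along the path.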

This module explores how deep reinforcement learning techniques can be applied to discrete optimization problems, using the example of solving cubes. Learners will examine neural network architectures, training processes, and experimental results, gaining insight into both implementation and evaluation of RL-based solvers.

What's included

1 video, 5 readings, 1 assignment

This module introduces the fundamentals of multi-agent reinforcement learning (MARL), exploring how multiple agents interact and learn within shared environments. Learners will examine the application of deep Q-networks to groups of agents and analyze the resulting behaviors. Practical examples illustrate how agent strategies evolve in multi-agent scenarios.

What's included

1 video, 2 readings, 1 assignment

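Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Packt - Course Instructors
Packt
1,749 courses, 492,078 learners

Offered by

Packt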
