As large language models revolutionize business operations, sophisticated attackers exploit AI systems through prompt injection, jailbreaking, and content manipulation—vulnerabilities that traditional security tools cannot detect. This intensive course empowers AI developers, cybersecurity professionals, and IT managers to systematically identify and mitigate LLM-specific threats before deployment. Master red-teaming methodologies using industry-standard tools like PyRIT, NVIDIA Garak, and Promptfoo to uncover hidden vulnerabilities through adversarial testing. Learn to design and implement multi-layered content-safety filters that block sophisticated bypass attempts while maintaining system functionality. Through hands-on labs, you'll establish resilience baselines, implement continuous monitoring systems, and create adaptive defenses that strengthen over time.

您将学到什么
Design red-teaming scenarios to identify vulnerabilities and attack vectors in large language models using structured adversarial testing.
Implement content-safety filters to detect and mitigate harmful outputs while maintaining model performance and user experience.
Evaluate and enhance LLM resilience by analyzing adversarial inputs and developing defense strategies to strengthen overall AI system security.
您将获得的技能
- Cyber Security Assessment
- Scenario Testing
- Prompt Engineering
- AI Security
- Vulnerability Scanning
- AI Personalization
- LLM Application
- Security Testing
- Continuous Monitoring
- Penetration Testing
- Responsible AI
- System Implementation
- Threat Modeling
- Security Strategy
- Vulnerability Assessments
- Security Controls
- Large Language Modeling
- 技能部分已折叠。显示 9 项技能,共 17 项。
要了解的详细信息
了解顶级公司的员工如何掌握热门技能

积累特定领域的专业知识
- 向行业专家学习新概念
- 获得对主题或工具的基础理解
- 通过实践项目培养工作相关技能
- 获得可共享的职业证书

该课程共有3个模块
This module introduces participants to the systematic creation and execution of red-teaming scenarios targeting large language models. Students learn to identify common vulnerability categories including prompt injection, jailbreaking, and data extraction attacks. The module demonstrates how to design realistic adversarial scenarios that mirror real-world attack patterns, using structured methodologies to probe LLM weaknesses. Hands-on demonstrations show how red-teamers simulate malicious user behavior to uncover security gaps before deployment.
涵盖的内容
4个视频2篇阅读材料1次同伴评审
This module covers the design, implementation, and evaluation of content-safety filters for LLM applications. Participants explore multi-layered defense strategies including input sanitization, output filtering, and behavioral monitoring systems. The module demonstrates how to configure safety mechanisms that balance security with functionality, and shows practical testing methods to validate filter effectiveness against sophisticated bypass attempts. Real-world examples illustrate the challenges of maintaining robust content filtering while preserving user experience.
涵盖的内容
3个视频1篇阅读材料1次同伴评审
This module focuses on comprehensive resilience testing and systematic improvement of AI system robustness. Students learn to conduct thorough security assessments that measure LLM resistance to adversarial inputs, evaluate defense mechanism effectiveness, and identify areas for improvement. The module demonstrates how to establish baseline security metrics, implement iterative hardening processes, and validate improvements through continuous testing. Participants gain skills in developing robust AI systems that maintain integrity under real-world adversarial conditions.
涵盖的内容
4个视频1篇阅读材料1个作业2次同伴评审
获得职业证书
将此证书添加到您的 LinkedIn 个人资料、简历或履历中。在社交媒体和绩效考核中分享。
提供方
人们为什么选择 Coursera 来帮助自己实现职业发展

Felipe M.

Jennifer J.

Larry W.

Chaitanya A.
从 Computer Science 浏览更多内容
¹ 本课程的部分作业采用 AI 评分。对于这些作业,将根据 Coursera 隐私声明使用您的数据。








