Secure AI: Red-Teaming & Safety Filters

kurs ist nicht verfügbar in Deutsch (Deutschland)

Wir übersetzen es in weitere Sprachen.

Secure AI: Red-Teaming & Safety Filters

Dieser Kurs ist Teil von Spezialisierung „AI Security: Security in the Age of Artificial Intelligence“

Dozenten: Brian Newman

Bei enthalten

Mehr erfahren

3 Module

Verschaffen Sie sich einen Einblick in ein Thema und lernen Sie die Grundlagen.

Stufe Mittel

Empfohlene Erfahrung

4 Stunden zu vervollständigen

Flexibler Zeitplan

In Ihrem eigenen Lerntempo lernen

3 Module

Verschaffen Sie sich einen Einblick in ein Thema und lernen Sie die Grundlagen.

Stufe Mittel

Empfohlene Erfahrung

4 Stunden zu vervollständigen

Flexibler Zeitplan

In Ihrem eigenen Lerntempo lernen

Was Sie lernen werden

Design red-teaming scenarios to identify vulnerabilities and attack vectors in large language models using structured adversarial testing.
Implement content-safety filters to detect and mitigate harmful outputs while maintaining model performance and user experience.
Evaluate and enhance LLM resilience by analyzing adversarial inputs and developing defense strategies to strengthen overall AI system security.

Kompetenzen, die Sie erwerben

Kategorie: Security Testing
Kategorie: Vulnerability Scanning
Kategorie: AI Personalization
Kategorie: Continuous Monitoring
Kategorie: Responsible AI
Kategorie: System Implementation
Kategorie: LLM Application
Kategorie: Exploitation techniques
Kategorie: Large Language Modeling
Kategorie: Security Controls
Kategorie: AI Security
Kategorie: Cyber Security Assessment
Kategorie: Vulnerability Assessments
Kategorie: Threat Modeling
Kategorie: Security Strategy

Werkzeuge, die Sie lernen werden

Kategorie: Prompt Engineering

Wichtige Details

Zertifikat zur Vorlage

Zu Ihrem LinkedIn-Profil hinzufügen

Kürzlich aktualisiert!

Dezember 2025

Bewertungen

1 Zuweisung¹

KI-bewertet siehe Haftungsausschluss

Unterrichtet in Englisch

Erfahren Sie, wie Mitarbeiter führender Unternehmen gefragte Kompetenzen erwerben.

Weitere Informationen zu Coursera für Unternehmen

Logos von Petrobras, TATA, Danone, Capgemini, P&G und L'Oreal

Erweitern Sie Ihre Fachkenntnisse

Dieser Kurs ist Teil der Spezialisierung Spezialisierung „AI Security: Security in the Age of Artificial Intelligence“

Wenn Sie sich für diesen Kurs anmelden, werden Sie auch für diese Spezialisierung angemeldet.

Lernen Sie neue Konzepte von Branchenexperten
Gewinnen Sie ein Grundverständnis bestimmter Themen oder Tools
Erwerben Sie berufsrelevante Kompetenzen durch praktische Projekte
Erwerben Sie ein Berufszertifikat zur Vorlage

In diesem Kurs gibt es 3 Module

As large language models revolutionize business operations, sophisticated attackers exploit AI systems through prompt injection, jailbreaking, and content manipulation—vulnerabilities that traditional security tools cannot detect. This intensive course empowers AI developers, cybersecurity professionals, and IT managers to systematically identify and mitigate LLM-specific threats before deployment. Master red-teaming methodologies using industry-standard tools like PyRIT, NVIDIA Garak, and Promptfoo to uncover hidden vulnerabilities through adversarial testing. Learn to design and implement multi-layered content-safety filters that block sophisticated bypass attempts while maintaining system functionality. Through hands-on labs, you'll establish resilience baselines, implement continuous monitoring systems, and create adaptive defenses that strengthen over time.

This course is designed for AI engineers, security professionals, data scientists, and developers interested in ensuring the safety and robustness of AI models. It’s also ideal for technology leaders seeking to implement secure, responsible AI frameworks within their organizations. Learners should have a basic understanding of machine learning, AI model architecture, and programming concepts. No prior experience with AI red-teaming or safety systems is required. By end of this course, you'll confidently conduct professional AI security assessments, deploy robust safety mechanisms, and protect LLM applications from evolving attack vectors in production environments.

This module introduces participants to the systematic creation and execution of red-teaming scenarios targeting large language models. Students learn to identify common vulnerability categories including prompt injection, jailbreaking, and data extraction attacks. The module demonstrates how to design realistic adversarial scenarios that mirror real-world attack patterns, using structured methodologies to probe LLM weaknesses. Hands-on demonstrations show how red-teamers simulate malicious user behavior to uncover security gaps before deployment.

Das ist alles enthalten

4 Videos2 Lektüren1 peer review

4 VideosInsgesamt 27 Minuten

Welcome to Secure AI Red-Teaming & Safety Filters3 Minuten
Understanding AI Attack Vectors and Vulnerability Categories5 Minuten
Designing Effective Red-Teaming Scenarios7 Minuten
Hands-On Vulnerability Discovery with Automated Tools13 Minuten

2 LektürenInsgesamt 10 Minuten

Welcome to the Course: Course Overview5 Minuten
LLM Red Teaming Guide (Open Source): Systematically Testing Large Language Models for Vulnerabilities5 Minuten

1 peer reviewInsgesamt 15 Minuten

Hands-On-Learning: Red-Team Assessment of ChatAssist Customer Service Bot15 Minuten

This module covers the design, implementation, and evaluation of content-safety filters for LLM applications. Participants explore multi-layered defense strategies including input sanitization, output filtering, and behavioral monitoring systems. The module demonstrates how to configure safety mechanisms that balance security with functionality, and shows practical testing methods to validate filter effectiveness against sophisticated bypass attempts. Real-world examples illustrate the challenges of maintaining robust content filtering while preserving user experience.

Das ist alles enthalten

3 Videos1 Lektüre1 peer review

3 VideosInsgesamt 25 Minuten

Multi-Layered Content-Safety Filter Architecture7 Minuten
Implementing and Configuring Safety Filters for Production8 Minuten
Testing Filter Effectiveness Against Bypass Attempts10 Minuten

1 LektüreInsgesamt 5 Minuten

The Landscape of LLM Guardrails: Intervention Levels and Techniques5 Minuten

1 peer reviewInsgesamt 20 Minuten

Hands-On-Learning: Safety Filter Implementation for SecureChat Enterprise Bot20 Minuten

This module focuses on comprehensive resilience testing and systematic improvement of AI system robustness. Students learn to conduct thorough security assessments that measure LLM resistance to adversarial inputs, evaluate defense mechanism effectiveness, and identify areas for improvement. The module demonstrates how to establish baseline security metrics, implement iterative hardening processes, and validate improvements through continuous testing. Participants gain skills in developing robust AI systems that maintain integrity under real-world adversarial conditions.

Das ist alles enthalten

4 Videos1 Lektüre1 Aufgabe2 peer reviews

4 VideosInsgesamt 31 Minuten

Establishing Baseline Security Metrics and Resilience Benchmarks6 Minuten
Continuous Testing and Automated Vulnerability Assessment7 Minuten
Systematic Security Improvement and Adaptive Hardening15 Minuten
Course Wrap-Up3 Minuten

1 LektüreInsgesamt 5 Minuten

10 LLM Security Tools to Know in 20255 Minuten

1 AufgabeInsgesamt 20 Minuten

Secure AI: Red-Teaming & Safety Filters20 Minuten

2 peer reviewsInsgesamt 80 Minuten

Hands-On-Learning: Resilience Assessment and Continuous Hardening of DataSecure AI Assistant20 Minuten
Project: SecureBank AI Chatbot Security Audit & Implementation 60 Minuten

Erwerben Sie ein Karrierezertifikat.

Fügen Sie dieses Zeugnis Ihrem LinkedIn-Profil, Lebenslauf oder CV hinzu. Teilen Sie sie in Social Media und in Ihrer Leistungsbeurteilung.

Dozenten

Brian Newman

Coursera

5 Kurse2.311 Lernende

von

Coursera

Mehr von Computer Security and Networks entdecken

Board Infinity
AI Risk and Compliance: Audit and Governance Foundations
Kurs
Edureka
Generative AI and LLM Security
Kurs
Pearson
Securing Generative AI
Kurs
Macquarie University
Adversarial AI: Attacking, Defending & Governing ML Systems
Kurs

Warum entscheiden sich Menschen für Coursera für ihre Karriere?

Felipe M.

Lernender seit 2018

„Es ist eine großartige Erfahrung, in meinem eigenen Tempo zu lernen. Ich kann lernen, wenn ich Zeit und Nerven dazu habe.“

Jennifer J.

Lernender seit 2020

„Bei einem spannenden neuen Projekt konnte ich die neuen Kenntnisse und Kompetenzen aus den Kursen direkt bei der Arbeit anwenden.“

Larry W.

Lernender seit 2021

„Wenn mir Kurse zu Themen fehlen, die meine Universität nicht anbietet, ist Coursera mit die beste Alternative.“

Chaitanya A.

„Man lernt nicht nur, um bei der Arbeit besser zu werden. Es geht noch um viel mehr. Bei Coursera kann ich ohne Grenzen lernen.“

Häufig gestellte Fragen

To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If fin aid or scholarship is available for your learning program selection, you’ll find a link to apply on the description page.

Weitere Fragen

Besuchen Sie die das Hilfe-Center für Kursteilnehmer.

Finanzielle Unterstützung verfügbar,

¹ Einige Aufgaben in diesem Kurs werden mit AI bewertet. Für diese Aufgaben werden Ihre Daten in Übereinstimmung mit Datenschutzhinweis von Courseraverwendet.