Programming Generative AI: Unit 3


This course is part of the "Programming Generative AI" Specialization

Instructors: Pearson

1 module

Get an overview of a topic and learn the fundamentals.

Intermediate level

Recommended experience

8 hours to complete

Flexible schedule

Learn at your own pace

What you'll learn

Understand and implement multimodal models that integrate images and text for advanced AI applications.
Build and optimize semantic image search engines using contrastive language-image pre-training.
Master the principles and practicalities of latent diffusion and stable diffusion for text-to-image generation.
Adapt, fine-tune, and efficiently evaluate pre-trained generative models for new tasks, styles, and real-time performance.

Skills you'll gain

Multimodal Prompts
Transfer Learning
Embeddings
Performance Tuning
Image Analysis
Generative Model Architectures
Computer Vision
Model Evaluation

Tools you'll learn

Generative AI

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

3 assignments

Taught in English

Build your subject-matter expertise

When you enroll in this course, you are also enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There is 1 module in this course

Unlock the full potential of generative AI with our advanced course module focused on state-of-the-art multimodal models. This course is designed for learners eager to bridge the gap between images and text, and to master the latest techniques in AI-driven content generation. You’ll begin by exploring the foundational concepts behind multimodal models, learning how contrastive language-image pre-training enables seamless integration of visual and textual data. Discover how these models power innovative applications like semantic image search, allowing you to query image content without manual labeling. Dive deeper into the mechanics of latent diffusion models and unravel the inner workings of stable diffusion, gaining the skills to transform text prompts into entirely new, never-before-seen images. The course also covers essential strategies for evaluating generative models and introduces efficient methods for fine-tuning and adapting pre-trained models to new styles and subjects. By the end, you’ll be equipped to build, adapt, and optimize cutting-edge text-to-image systems—ready to innovate in creative, research, or commercial settings.
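
As a rough illustration of the semantic image search idea described above, the sketch below ranks images by cosine similarity between a text-query embedding and image embeddings. It is not taken from the course. The three-dimensional vectors, the file names, and the search helper are hypothetical stand-ins; in practice the embeddings would come from a trained CLIP-style text encoder and image encoder that share one embedding space.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    a, b = normalize(a), normalize(b)
    return sum(x * y for x, y in zip(a, b))


def search(query_embedding, image_embeddings, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; a real system would compute these with an
# image encoder once, offline, and store them in an index.
IMAGE_EMBEDDINGS = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query such as "a sunny shoreline".
QUERY = [0.8, 0.2, 0.1]

results = search(QUERY, IMAGE_EMBEDDINGS)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, beach.jpg ranks first because its embedding points in nearly the same direction as the query. No image needs a manual label, only an embedding.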

This module delves into multimodal generative AI, focusing on models that connect images and text. Learners explore contrastive language-image pre-training for semantic image search and uncover the workings of latent diffusion and stable diffusion for text-to-image generation. The module then covers evaluation of generative models, parameter-efficient fine-tuning, and techniques to teach pre-trained models new styles and subjects. It concludes with methods to optimize diffusion models for faster, near real-time image generation, equipping students with both conceptual understanding and practical skills in advanced multimodal AI systems.
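
To make the phrase "parameter-efficient fine-tuning" more concrete, here is a minimal sketch of the low-rank update used by LoRA, one of the methods listed in the module's videos. It is an illustration under stated assumptions, not the course's code: the helper names and the tiny matrices are hypothetical, and the sketch uses the common formulation in which a frozen weight W is adjusted by a scaled product of two small trainable matrices, W + (alpha / rank) * B A.

```python
def lora_param_counts(d_out, d_in, rank):
    """Compare trainable parameters: full weight matrix vs. LoRA factors."""
    full = d_out * d_in              # every entry of W is trainable
    lora = rank * (d_out + d_in)     # B is d_out x rank, A is rank x d_in
    return full, lora


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Hypothetical layer size, chosen only to show the scale of the savings.
full, lora = lora_param_counts(d_out=768, d_in=768, rank=4)
print(f"full fine-tuning: {full} parameters, LoRA: {lora} parameters")

# Tiny worked example: a 2 x 2 frozen weight with a rank-1 update.
W = [[1, 2], [3, 4]]
A = [[1, 1]]        # rank x d_in
B = [[1], [2]]      # d_out x rank
print(lora_effective_weight(W, A, B, alpha=2, rank=1))
```

For the hypothetical 768 x 768 layer, the rank-4 factors hold 6,144 trainable values against 589,824 for the full matrix, which is why adapters of this kind are cheap to train and to store.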

Included

44 videos, 3 assignments

44 videos, total 408 minutes

Topics 1 minute
Components of a Multimodal Model 5 minutes
Vision-Language Understanding 10 minutes
Contrastive Language-Image Pretraining 6 minutes
Embedding Text and Images with CLIP 14 minutes
Zero-Shot Image Classification with CLIP 4 minutes
Semantic Image Search with CLIP 11 minutes
Conditional Generative Models 5 minutes
Introduction to Latent Diffusion Models 9 minutes
The Latent Diffusion Model Architecture 6 minutes
Failure Modes and Additional Tools 7 minutes
Stable Diffusion Deconstructed 12 minutes
Writing Our Own Stable Diffusion Pipeline 11 minutes
Decoding Images from the Stable Diffusion Latent Space 5 minutes
Improving Generation with Guidance 9 minutes
Playing with Prompts 30 minutes
Topics 1 minute
Methods and Metrics for Evaluating Generative AI 7 minutes
Manual Evaluation of Stable Diffusion with DrawBench 14 minutes
Quantitative Evaluation of Diffusion Models with Human Preference Predictors 20 minutes
Overview of Methods for Fine-Tuning Diffusion Models 10 minutes
Sourcing and Preparing Image Datasets for Fine-Tuning 8 minutes
Generating Automatic Captions with BLIP-2 8 minutes
Parameter Efficient Fine-Tuning with LoRA 12 minutes
Inspecting the Results of Fine-Tuning 5 minutes
Inference with LoRAs for Style-Specific Generation 12 minutes
Conceptual Overview of Textual Inversion 8 minutes
Subject-Specific Personalization with Dreambooth 8 minutes
Dreambooth versus LoRA Fine-Tuning 6 minutes
Dreambooth Fine-Tuning with Hugging Face 14 minutes
Inference with Dreambooth to Create Personalized AI Avatars 14 minutes
Adding Conditional Control to Text-to-Image Diffusion Models 4 minutes
Creating Edge and Depth Maps for Conditioning 16 minutes
Depth and Edge-Guided Stable Diffusion with ControlNet 17 minutes
Understanding and Experimenting with ControlNet Parameters 9 minutes
Generative Text Effects with Font Depth Maps 3 minutes
Few Step Generation with Adversarial Diffusion Distillation (ADD) 7 minutes
Reasons to Distill 6 minutes
Comparing SDXL and SDXL Turbo 12 minutes
Text-Guided Image-to-Image Translation 17 minutes
Video-Driven Frame-by-Frame Generation with SDXL Turbo 13 minutes
Near Real-Time Inference with PyTorch Performance Optimizations 11 minutes
Programming Generative AI: Summary 1 minute
Course Summary 1 minute

3 assignments, total 90 minutes

Connecting Text and Images Quiz 30 minutes
Post-Training Procedures for Diffusion Models Quiz 30 minutes
End of Assessment Quiz 30 minutes

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Pearson

268 courses, 47,305 learners

Offered by

Pearson

Frequently asked questions

To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but you will not be able to purchase a Certificate experience.

When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.

Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.