Building and Deploying Generative AI Models

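This course is part of the Generative AI Fundamentals Specialization

Instructor: Amreen Anbar

Access provided by New York State Department of Labor

3 modules

Gain insight into a topic and learn the fundamentals.

Intermediate level

Recommended experience

1 week to complete

at 10 hours a week

Flexible schedule

Learn at your own pace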

What you'll learn

Construct and evaluate Transformer-based LLMs from scratch using PyTorch and industry metrics like ROUGE and BLEU.
Engineer Retrieval Augmented Generation (RAG) pipelines using LangChain to integrate current, domain-specific knowledge into models.
Deploy autonomous AI Agents to production environments on Google Cloud Platform (Vertex AI) using professional workflows.

Skills you'll gain

Tools you'll learn

Details to know

Shareable certificate

Assessments

3 assignments

Taught in English

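Build your subject-matter expertise

This course is part of the Generative AI Fundamentals Specialization

When you enroll in this course, you'll also be enrolled in this Specialization.

Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate

There are 3 modules in this course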

Transition from theoretical concepts to production-ready engineering in this hands-on course, the final part of the "Fundamentals of Generative AI" specialization. Designed for learners ready to move beyond theory, the course focuses entirely on construction: you won't just learn about Large Language Models (LLMs); you will build, refine, and deploy them.

We start at the foundational level, coding several types of Transformer architecture from scratch in PyTorch. Through high-performance training with Automatic Mixed Precision (AMP) and evaluation with ROUGE and BLEU, you will learn to scale custom components into optimized systems. By working with pre-trained models and weighing performance trade-offs, you will learn to choose the most efficient path to large-scale deployment.

Moving to applied architecture, you will build Retrieval Augmented Generation (RAG) pipelines with LangChain, learning to evaluate them and to apply advanced techniques such as alternative chunking strategies, reranking and contextual compression, and query transformation. You will also cover model selection and the trade-offs between RAG and fine-tuning.

Finally, you will develop autonomous Agents. You will bridge the gap between development and production by setting up a professional workflow with Poetry and deploying a Summarizer AI Agent to Google Cloud Platform (Vertex AI). By the end of the course, you will have a portfolio of working code and a live deployment that demonstrate your ability to engineer robust Generative AI solutions.

In this module, we dive deep into the Transformer architecture, its core mechanics, and its three architecture types (encoder-only, decoder-only, and encoder-decoder). We gain hands-on experience by building and training a complete suite of PyTorch-based models from scratch. The module concludes with strategic deployment skills: when to build a custom model and when to use a pre-trained model for efficiency and state-of-the-art results.
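To make the difference between the architecture types concrete, here is a minimal sketch of scaled dot-product attention, the core mechanic introduced in "Attention Is All You Need". It is written in plain Python rather than PyTorch so that it runs without dependencies; the function names and the toy input are illustrative assumptions, not code from the course notebooks. With `causal=False` every token attends to every other token, as in an encoder; with `causal=True` each token sees only itself and earlier positions, as in a decoder.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats (masked entries may be -inf)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=False):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns (outputs, weights). With causal=True, position i cannot attend
    to positions j > i (decoder-style masking).
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for i, q in enumerate(queries):
        scores = []
        for j, k in enumerate(keys):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: future position
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
        weights = softmax(scores)
        # Each output is the attention-weighted average of the value vectors.
        out = [
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy self-attention: three tokens with 2-dimensional embeddings (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, full_weights = attention(x, x, x)                  # encoder-style
_, masked_weights = attention(x, x, x, causal=True)   # decoder-style

print(full_weights[0])
print(masked_weights[0])
```

In the masked case the first token can attend only to itself, so its weight row puts all of its mass on position 0. A real Transformer layer adds learned query, key, and value projections, multiple heads, and batching on top of this operation.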

What's included

18 videos, 11 readings, 1 assignment

18 videos (113 minutes total)

Course Introduction (4 minutes)
Meet your instructor: Amreen Anbar (1 minute)
Meet your instructor: Anahita Doosti (1 minute)
Meet your instructor: Soroush Razavi (1 minute)
Transformer: Evolution Unveiled (8 minutes)
Transformer: Types (8 minutes)
Transformer: The Components (7 minutes)
Setting The Stage: Environment, Libraries and Data (8 minutes)
Looking beyond theory: Let’s Build a Transformer! (9 minutes)
Looking beyond theory: Training and Text Generation (8 minutes)
Building the Complete Encoder-Decoder Summarizer: Encoder, Decoder, and the Cross-Attention Mechanism (7 minutes)
Building the Complete Encoder-Decoder Summarizer: Teacher Forcing, Loss, and Inference (7 minutes)
Scaling the Architecture: From Character Tokens to BPE and Massive Data (8 minutes)
Scaling the Architecture: High-Performance Optimization (AMP) and ROUGE Evaluation (9 minutes)
Synthesis: Implementation of the Translator Transformer (9 minutes)
Bypass the Training Wall: Powerful LLM Applications Without Massive Compute (5 minutes)
A Resource-Efficient Approach: Using pre-trained models for Summarization (6 minutes)
A Resource-Efficient Approach: Using Pre-trained Models for Translation (8 minutes)

11 readings (290 minutes total)

The original paper, "Attention Is All You Need" (20 minutes)
Interactive Transformer Explainer (30 minutes)
Notebook 1 (40 minutes)
Notebook 2 (40 minutes)
Notebook 3 (40 minutes)
Dataset (cnn_dailymail) (10 minutes)
Notebook 4 (40 minutes)
Dataset (wmt14) (10 minutes)
ROUGE and BLEU Score for NLP Evaluation (20 minutes)
Notebook 5 (20 minutes)
Notebook 6 (20 minutes)

1 assignment (30 minutes total)

Section 1 Quiz (30 minutes)
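
The module evaluates its summarizer with ROUGE. As a rough illustration of what such a score measures, the sketch below computes ROUGE-1 (unigram overlap) between a reference and a candidate summary. It is a simplified stand-in with whitespace tokenization and no stemming, not the evaluation code used in the course, and the function name and example sentences are assumptions.

```python
from collections import Counter


def rouge1(reference, candidate):
    """ROUGE-1 precision, recall, and F1 from clipped unigram overlap."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared token,
    # so a repeated word is credited at most as often as the reference has it.
    overlap = sum((ref_counts & cand_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    if overlap == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1(
    reference="the cat sat on the mat",
    candidate="the cat lay on the mat",
)
print(scores)
```

Five of the six tokens match in both directions here, so precision, recall, and F1 all come out to 5/6. The clipped unigram precision is also the first ingredient of BLEU, which extends it to longer n-grams and adds a brevity penalty.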

Module 2 addresses two limitations of Large Language Models (LLMs), static knowledge and hallucinations, by introducing Retrieval Augmented Generation (RAG). Learners progress from building basic pipelines with Ollama and LangChain to implementing production-ready systems. Along the way they add rigorous RAG evaluation and apply advanced techniques such as custom chunking strategies, vector stores, reranking, and query transformation to improve context retrieval and response generation. The module concludes with an overview of finetuning, another adaptation technique, and a comparison of RAG and finetuning.
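
The retrieval half of a RAG pipeline can be sketched in a few lines: split documents into overlapping chunks, score each chunk against the query, and place the best matches in the prompt. The example below is a dependency-free approximation under stated assumptions. The word-count `embed` function stands in for a real embedding model, the in-memory list stands in for a vector store, and none of the function names come from LangChain or Ollama.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=8, overlap=2):
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, context_chunks):
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


document = (
    "Poetry manages Python dependencies and virtual environments. "
    "Vertex AI hosts deployed agents on Google Cloud. "
    "Reranking reorders retrieved chunks by relevance to the query."
)
query = "What does reranking do to retrieved chunks?"

chunks = chunk_words(document)
top = retrieve(query, chunks)
prompt = build_prompt(query, top)
print(prompt)
```

Chunk size and overlap are the knobs a chunking strategy tunes: larger chunks keep more context together, while overlap reduces the chance that a relevant sentence is cut in half at a boundary. Reranking and query transformation would slot in after `retrieve` and before `embed(query)` respectively.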

What's included

13 videos, 2 readings, 1 assignment

13 videos (85 minutes total)

What is RAG? (6 minutes)
Building a Minimal RAG from Scratch with Ollama (Part 1) (7 minutes)
Building a Minimal RAG from Scratch with Ollama (Part 2) (5 minutes)
An Improved RAG Pipeline with LangChain (7 minutes)
RAG Evaluation and Metrics (7 minutes)
Implementing RAG Evaluation (7 minutes)
Document Loaders and Chunking Strategies (6 minutes)
Vector Stores and Indexing (6 minutes)
Reranking and Contextual Compression (7 minutes)
Query Transformation (7 minutes)
Pick the Right Models for your RAG (7 minutes)
What is Finetuning? (5 minutes)
RAG vs. Finetuning: Which one to choose? (7 minutes)

2 readings (140 minutes total)

Coding Notebooks (20 minutes)
Final RAG Results (120 minutes)

1 assignment (30 minutes total)

Section 2 Quiz (30 minutes)

Module 3 marks a pivotal transition from passive information retrieval to the dynamic realm of autonomous AI Agents, anchored by the "Understand, Think, Take Action" conceptual framework. Students will critically evaluate development ecosystems before applying these concepts to build a functional Summarizer Agent. The module emphasizes professional engineering standards, guiding learners through a complete lifecycle that includes environment management with Poetry, deployment to the Vertex AI Engine, and the implementation of robust performance monitoring using Google Cloud Platform’s logging and tracing tools.