This course is for ML engineers, solutions architects, and senior developers who build robust infrastructure powering large language models. This course teaches you how to design, deploy, and maintain the complex, interconnected systems required for scalable, resilient, and cost-effective LLM applications in the real world.

Designing Production LLM Architectures
What you will learn
Compare synchronous and asynchronous architectures and apply 12-factor principles and container orchestration to deploy scalable microservices.
Analyze multi-region deployments, pinpoint latency bottlenecks, and design resilient architecture improvements via fault analysis.
Create Airflow DAGs to automate data workflows and analyze the impact of schema evolution on downstream processes and tests.
Analyze trade-offs between self-hosting models vs. managed APIs and evaluate proposed infrastructure for fault tolerance and cost.
Skills you will gain
Details to know

Build subject-matter expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills through hands-on projects
- Earn a shareable career certificate

There are 5 modules in this course
This module empowers engineers and architects to master the "build vs. buy" decision for LLM applications through a structured, strategic lens. You will learn to design complex system architectures using sequence diagrams to evaluate synchronous and asynchronous processing, while comparing the trade-offs of self-hosted open-source models against managed APIs. By focusing on critical metrics like Total Cost of Ownership (TCO), latency, and data privacy, you will develop the expertise to justify architectural choices. Ultimately, you'll gain the confidence to document and defend high-performance, business-aligned AI solutions to any stakeholder.
What's included
4 videos, 2 readings, 3 assignments
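The synchronous-versus-asynchronous trade-off this module opens with can be sketched in a few lines. The snippet below is a minimal illustration, not course material: `call_llm` is a hypothetical stand-in for a network call to a model endpoint, with latency simulated by `asyncio.sleep`.

```python
import asyncio

async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a network round-trip to an LLM endpoint.
    await asyncio.sleep(0.1)  # simulated latency
    return f"response to {prompt!r}"

def serve_sync(prompts):
    # Synchronous processing: calls run one after another, so total
    # latency grows linearly with the number of requests.
    return [asyncio.run(call_llm(p)) for p in prompts]

async def serve_async(prompts):
    # Asynchronous processing: calls overlap on the event loop, so total
    # latency stays close to that of a single call.
    return await asyncio.gather(*(call_llm(p) for p in prompts))
```

With three prompts, the synchronous path takes roughly three times the simulated latency while the asynchronous path takes roughly one; the same reasoning drives the architecture decisions covered in this module.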
This module explores building resilient, scalable architectures for LLM applications. You will apply 12-factor app methodology to design portable, cloud-native microservices, mastering stateless design and dependency management. The curriculum bridges theory and practice by evaluating multi-region deployment strategies for fault tolerance and high availability. You'll learn to analyze failover mechanisms and mitigate architectural risks before production. By the end, you’ll be equipped to document reliable, future-proof AI systems. Prerequisites include a foundational understanding of cloud concepts (regions/zones) and microservice basics (containers/APIs).
What's included
1 video, 1 reading, 3 assignments
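One concrete 12-factor practice mentioned above is storing configuration in the environment rather than in code, so the same container image runs unchanged in every region. A minimal sketch, with hypothetical variable names and defaults:

```python
import os

def load_config(env=None):
    # Factor III of the 12-factor methodology: read config from the
    # environment so the artifact stays identical across deployments.
    # Variable names and defaults here are illustrative, not prescriptive.
    env = os.environ if env is None else env
    return {
        "model_endpoint": env.get("MODEL_ENDPOINT", "http://localhost:8000"),
        "request_timeout_s": float(env.get("REQUEST_TIMEOUT_S", "30")),
        "region": env.get("DEPLOY_REGION", "us-east-1"),
    }
```

Passing a mapping instead of reading `os.environ` directly also keeps the function easy to unit-test, which is itself a stateless-design habit this module encourages.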
This module teaches how to transition LLM prototypes into production-grade services. You will learn to analyze multi-stage architectures like RAG to identify and quantify performance bottlenecks using evidence-based metrics. The curriculum focuses on mastering Kubernetes deployment through declarative Helm charts and implementing Horizontal Pod Autoscaling (HPA) to manage unpredictable traffic. By studying deployment lifecycles, including controlled rollouts and rapid rollbacks, you will gain the skills to transform fragile prototypes into resilient, scalable, and reliable production systems capable of handling real-world loads.
What's included
5 videos, 5 readings, 6 assignments
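The Horizontal Pod Autoscaling behavior this module covers follows a simple published rule: desired replicas equal the current replica count scaled by the ratio of the observed metric to its target, rounded up and clamped to configured bounds. A sketch of that arithmetic:

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    # Core HPA scaling rule: desired = ceil(current * observed / target),
    # then clamped into [min_replicas, max_replicas].
    desired = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 pods averaging 90% CPU against a 60% target yields ceil(4 × 90 / 60) = 6 replicas; real controllers add stabilization windows and tolerance bands on top of this core formula.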
In today's dynamic data landscape, pipelines often break when source data structures change unexpectedly—a problem known as schema drift. This module tackles that challenge head-on, teaching you how to design and automate data pipelines that can gracefully handle schema evolution using Apache Airflow. By the end, you will be equipped to create resilient, scalable, and fully automated data pipelines that are built to withstand the complexities of real-world data environments.
What's included
5 videos, 5 readings, 7 assignments
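The schema-drift problem this module addresses reduces to comparing an expected column-to-type mapping against what a batch actually delivers. The sketch below uses plain Python rather than Airflow operators, purely to show the shape of the check a DAG task might run:

```python
def detect_schema_drift(expected: dict, incoming: dict) -> dict:
    # Compare an expected column -> type mapping against the schema of an
    # incoming batch, classifying drift so a pipeline task can fail fast,
    # alert, or adapt downstream processing.
    shared = set(expected) & set(incoming)
    return {
        "missing": sorted(set(expected) - set(incoming)),
        "unexpected": sorted(set(incoming) - set(expected)),
        "type_changed": sorted(c for c in shared if expected[c] != incoming[c]),
    }
```

In an Airflow pipeline, a validation task running a check like this early in the DAG lets schema drift surface as a controlled failure instead of corrupting downstream tables and tests.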
In this module, you will step into the high-stakes role of a senior systems engineer tasked with diagnosing a failing AI service. A critical Retrieval-Augmented Generation (RAG) system is plagued by high latency and intermittent outages, and you must get to the root of the problem. Using architectural diagrams, system logs, and performance metrics, you will analyze the system's design to identify the primary performance bottleneck and the most significant single point of failure. Your analysis will culminate in a concise, two-paragraph report for stakeholders, pinpointing the critical issues and recommending targeted fixes to restore stability and performance.
What's included
2 readings, 1 assignment
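The bottleneck analysis in this capstone ultimately comes down to attributing end-to-end latency to pipeline stages. A minimal sketch, assuming you already have per-stage latency measurements (the stage names below are illustrative):

```python
def find_bottleneck(stage_latencies_ms: dict) -> tuple:
    # Given per-stage latencies for a RAG pipeline (e.g. embed ->
    # retrieve -> rerank -> generate), report the dominant stage and
    # its share of total end-to-end latency.
    total = sum(stage_latencies_ms.values())
    stage = max(stage_latencies_ms, key=stage_latencies_ms.get)
    return stage, stage_latencies_ms[stage] / total
```

If retrieval accounts for well over half of end-to-end latency, that is where the two-paragraph report should point its recommendations; a single point of failure is then often the same component that dominates the latency budget.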
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Instructor

Offered by
Explore more from Design and Product
Frequently asked questions
This course assumes hands-on experience with cloud concepts and containers. If you are new to cloud platforms or Kubernetes, first complete a foundational cloud or container course to gain the most from these modules.
You will work with sequence diagrams, Kubernetes manifests and Helm charts, Airflow DAGs, and cloud deployment patterns. Labs use common cloud and orchestration tooling; no proprietary vendor lock-in is required.
The course provides structured analysis and decision criteria—latency, total cost of ownership, data privacy, and operational complexity—so you can compare options and make informed architecture choices for your use case.
Financial aid available.
¹ Some assignments in this course are graded by AI. For those assignments, your data will be used in accordance with the Coursera Privacy Notice.




