

Inference techniques for local and cloud LLM deployment

What you'll learn
- The principles of LLM inference and prompt pipelines for real-world tasks.
- Running small and medium LLMs locally with Ollama and deploying larger models in the cloud using Python (see the sketch after this list).
- Building and documenting LLM-powered tools ready for real-world use.
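
To make the second point concrete, here is a minimal sketch of a local prompt pipeline, assuming Ollama is installed, the llama3.1:8b tag has already been pulled (`ollama pull llama3.1:8b`), and the official `ollama` Python package is available (`pip install ollama`). The model tag and the summarization task are illustrative choices, not taken from the course.

```python
import ollama  # official Ollama Python client; pip install ollama

# A simple pipeline: build the prompt -> call the local model -> parse the reply.
PROMPT_TEMPLATE = "Summarize the following text in one sentence:\n\n{text}"

def summarize(text: str, model: str = "llama3.1:8b") -> str:
    """Fill the template, send it to the locally served model, and return the reply text."""
    response = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": PROMPT_TEMPLATE.format(text=text)}],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    print(summarize("Ollama serves open-weight models over a local HTTP API."))
```

The same build-call-parse structure carries over to hosted models; only the client and the model identifier change.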

Build expertise in Software Development
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills through hands-on projects
- Earn a shareable career certificate from Meta

Frequently asked questions

Who is this program for?
Developers, entrepreneurs, and technical professionals with 1-2 years of Python experience who want to build AI-enabled assistants. It is ideal for those looking to upskill in generative AI development or to create practical business solutions using Llama models.

What background knowledge is necessary?
- 1-2 years of Python programming experience (if you need to meet this prerequisite, start with the Meta Programming in Python course)
- Familiarity with command-line interfaces
- Understanding of basic software development concepts
- Basic knowledge of REST APIs

How do I access the Llama models used in this program?
In this program, you will be guided to access the Llama 4 Scout 17B and Llama 3.1 8B models via API in Courses 1 and 2. The course content includes examples using one of the API providers, but you are free to choose any provider that offers access to Llama models. Examples of such providers include Together AI, Groq, Hugging Face, and others.
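
As an illustration, several of these providers expose an OpenAI-compatible endpoint, so a hosted Llama model can be called with the standard `openai` Python client. This is a minimal sketch only: the base URL, model identifier, and environment variable below are placeholder values in the style of Together AI, so check your chosen provider's documentation for the exact ones.

```python
import os

from openai import OpenAI  # pip install openai

# Point the standard client at the provider's OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.together.xyz/v1",   # provider-specific base URL (assumed)
    api_key=os.environ["TOGETHER_API_KEY"],   # hypothetical variable name; use your provider's key
)

completion = client.chat.completions.create(
    # Provider-specific model id for Llama 3.1 8B (assumed; naming varies by provider).
    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Explain LLM inference in two sentences."}],
)
print(completion.choices[0].message.content)
```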

In Course 3, you will be guided to use the Llama 3.1 8B model in a local environment. Llama models are available from multiple sources; the best place to start is https://www.llama.com.
Models are also hosted and distributed by partners such as Amazon Web Services, Microsoft Azure, Google Cloud, IBM Watsonx, Oracle Cloud, Snowflake, Databricks, Dell, Hugging Face, Groq, Cerebras, SambaNova, and many others. See the Llama.com FAQ for more information.