The Optimizing Models for Production course prepares learners to make generative AI models more efficient, scalable, and cost-effective for real-world deployment. Learners begin with quantization, applying INT8 and INT4 precision reduction using tools like bitsandbytes while balancing accuracy and efficiency. Next, they explore inference optimization strategies, including batching, KV-cache management, and token-level computation scheduling to reduce latency in interactive applications.
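To make the accuracy-versus-efficiency trade-off in quantization concrete, here is a minimal sketch of symmetric "absmax" quantization in plain Python. It is an illustrative simplification, not the bitsandbytes API or the course's own lab code; the function names and the sample weights are hypothetical.

```python
def quantize_absmax(values, bits=8):
    """Map floats to signed integers of the given bit width (symmetric absmax)."""
    qmax = 2 ** (bits - 1) - 1            # 127 for INT8, 7 for INT4
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in quantized]


def max_abs_error(values, bits):
    """Largest round-trip error after quantizing and dequantizing."""
    quantized, scale = quantize_absmax(values, bits)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical weight values, chosen only for illustration.
weights = [0.12, -0.53, 0.98, -1.27, 0.04, 0.66]
err_int8 = max_abs_error(weights, bits=8)
err_int4 = max_abs_error(weights, bits=4)
print(f"INT8 max error: {err_int8:.6f}")
print(f"INT4 max error: {err_int4:.6f}")
```

Fewer bits mean a coarser step size, so the INT4 round-trip error is larger than the INT8 error on the same weights. Production libraries typically refine this idea, for example by quantizing in small blocks with one scale per block.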

There are 4 modules in this course
Each module includes 1 ungraded lab.




Frequently asked questions
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.




