The Model Evaluation and Benchmarking course is designed for developers, engineers, and technical product builders who are new to generative AI but already have intermediate machine learning knowledge, basic Python proficiency, and familiarity with development environments such as VS Code. It suits learners who want to engineer, customize, and deploy open generative AI solutions while avoiding vendor lock-in.

Model Evaluation and Benchmarking
Details to know
2 assignments
February 2026

There are 3 modules in this course
Learn how to evaluate text models using both automated metrics and human-centered methods. You’ll apply key measures like perplexity, BLEU (Bilingual Evaluation Understudy), ROUGE (Recall-Oriented Understudy for Gisting Evaluation), and BERTScore, and understand when each is most useful. You’ll also design human evaluation protocols and build automated pipelines, giving you a practical way to judge whether your fine-tuned models improve performance.
What's included
4 videos, 2 readings, 1 assignment, 1 ungraded lab
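
To make two of the automated text metrics named in this module concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not course material: the function names `perplexity` and `rouge1_recall` are hypothetical, tokenization is simple whitespace splitting, and the log-probabilities are assumed to be natural logs supplied by whatever model is being evaluated.

```python
import math
from collections import Counter


def perplexity(token_log_probs):
    """Perplexity: exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def rouge1_recall(reference, candidate):
    """Share of reference unigrams also found in the candidate (clipped counts).

    Assumption: lowercase whitespace tokenization; real ROUGE tooling
    usually adds stemming and reports precision and F-measure as well.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    return overlap / total


# A model that assigns probability 0.25 to every token is "as uncertain as"
# a uniform choice among four options.
uniform_ppl = perplexity([math.log(0.25)] * 4)
recall = rouge1_recall("the cat sat on the mat", "the cat is on the mat")
```

Lower perplexity means the model found the text less surprising, while higher ROUGE-1 recall means more of the reference wording was recovered. The two measure different things, which is one reason the module pairs automated metrics with human evaluation.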
Explore how to measure the quality of images produced by diffusion and other generative models. You’ll implement technical metrics like Fréchet Inception Distance (FID), Structural Similarity Index Measure (SSIM), and Contrastive Language–Image Pretraining (CLIP) similarity, and balance them with human perception-based checks for style, accuracy, and consistency. You’ll also automate artifact detection and quality control, equipping you with the skills to catch hidden flaws and ensure your image outputs meet professional standards.
What's included
3 videos, 1 reading, 1 ungraded lab
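
As a rough illustration of one image metric from this module, the sketch below computes a simplified, single-window SSIM over two grayscale images given as flat lists of pixel values. This is an assumption-laden teaching version: the name `global_ssim` is hypothetical, and the standard SSIM averages the same statistic over local sliding windows rather than computing it once over the whole image. The constants follow the commonly used defaults K1 = 0.01 and K2 = 0.03.

```python
def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM computed once over the whole image (no sliding window).

    x, y: equal-length sequences of grayscale pixel values.
    data_range: difference between the max and min possible pixel value.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Small stabilizing constants so the ratio stays defined for flat images.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


original = [0, 50, 100, 150]
inverted = [150, 100, 50, 0]
same_score = global_ssim(original, original)
diff_score = global_ssim(original, inverted)
```

An image compared with itself scores 1, and structurally different images score lower. For production use, an established library implementation with windowing would be the safer choice.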
Learn how to design benchmarks that make model comparisons reliable and reproducible. You’ll create domain-specific evaluation datasets, build dashboards to visualize results, and automate reporting systems for continuous monitoring. These practices help you track improvements, catch performance issues early, and build trust in your work through transparent, repeatable evaluations.
What's included
3 videos, 1 reading, 1 assignment, 1 ungraded lab
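
The idea of a reproducible benchmark can be sketched as a small harness that scores every model on the same fixed examples and records a fingerprint of the dataset in the report. Everything here is hypothetical and for illustration only: the function names, the exact-match scoring rule, the toy examples, and the stand-in "models" (plain Python functions) are assumptions, not the course's tooling.

```python
import hashlib
import json


def dataset_fingerprint(examples):
    """Stable short hash of the evaluation set, so a report states what was scored."""
    payload = json.dumps(examples, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_benchmark(models, examples):
    """Score each model by exact-match accuracy on the same fixed examples."""
    if not examples:
        raise ValueError("benchmark needs at least one example")
    report = {"dataset": dataset_fingerprint(examples), "n": len(examples), "scores": {}}
    for name, predict in models.items():
        correct = sum(predict(ex["input"]).strip() == ex["expected"] for ex in examples)
        report["scores"][name] = correct / len(examples)
    return report


# Hypothetical toy data and stand-in models.
EXAMPLES = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]
MODELS = {
    "baseline": lambda prompt: "4",
    "candidate": lambda prompt: {"2+2": "4", "capital of France": "Paris"}.get(prompt, ""),
}

report = run_benchmark(MODELS, EXAMPLES)
```

Because the report carries the dataset fingerprint alongside the scores, two runs can be compared only when they were scored on identical data, which is one simple way to keep comparisons repeatable. The resulting dictionary could feed a dashboard or a scheduled report.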