Modern cloud-native applications rarely crash outright. Instead, they fail in subtle ways such as latency spikes, partial errors, or noisy dependencies. This course helps you become productive with the open-source trio used across the industry: Prometheus for metrics and PromQL analysis, Grafana for dashboards and alerting, and OpenTelemetry for standard, vendor-neutral instrumentation.

What you'll learn
Explain the roles of metrics, logs, and traces and map Prometheus, Grafana, and OpenTelemetry to each signal in a modern stack.
Deploy a minimal local stack (Docker or native): scrape metrics with Prometheus, route telemetry via OTel Collector, and visualize in Grafana.
Instrument a sample app with OpenTelemetry, confirm traces/metrics flow end-to-end, and build a basic Grafana dashboard.
Details to know

1 assignment
February 2026

There are 3 modules in this course
Get to know the three primary observability signals (metrics, logs, and traces) and how Prometheus, Grafana, and OpenTelemetry map to each. We walk through the full data path, clarifying pull versus push collection and exporters versus receivers. You will then set up a small local environment with Docker Compose that is reused throughout the course. By the end, you will have a working lab in which Prometheus targets show as healthy (green) and data flows end to end.
What's included
4 videos, 2 readings, 1 peer review
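
To make the pull model from this module concrete, here is a minimal sketch of a scrape target: a process that exposes a `/metrics` endpoint in the Prometheus text exposition format and waits for Prometheus to pull from it. It uses only the Python standard library. The metric name `demo_requests_total`, the label values, and port 8000 are assumptions made for illustration and are not taken from the course materials.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter values keyed by (path, status) labels.
REQUESTS = {("/", "200"): 3, ("/checkout", "500"): 1}


def render_metrics(requests):
    """Render counters in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_requests_total Total HTTP requests handled.",
        "# TYPE demo_requests_total counter",
    ]
    for (path, status), value in sorted(requests.items()):
        lines.append(
            f'demo_requests_total{{path="{path}",status="{status}"}} {value}'
        )
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(REQUESTS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block and serve /metrics so a Prometheus scrape job can pull from it."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. In a push setup the direction is reversed: the application sends telemetry to a receiver (for example over OTLP) instead of waiting to be scraped.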
Learn the core pieces of PromQL you need day to day: rate(), sum by(), label filters, and histogram quantiles, while avoiding common pitfalls with counters and gauges. Then turn queries into meaningful signals by building a clear three-panel Grafana dashboard showing requests per second (RPS), error ratio, and 95th-percentile latency, each with appropriate units, legends, and variables. Export the dashboard as JSON and configure a noise-aware alert (error rate >5% over 5 minutes) to practice choosing thresholds relative to time windows. The emphasis is on practical panel layout and queries you can explain clearly.
What's included
3 videos, 1 reading, 1 peer review
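
The PromQL ideas in this module can be approximated in a few lines of Python to show what the queries compute. The sketch below is a simplified model, not Prometheus's implementation: `simple_rate` handles counter resets but skips the window-boundary extrapolation that the real `rate()` performs, and `histogram_quantile` uses linear interpolation inside a bucket for classic histograms. All function names and sample values are hypothetical.

```python
import math


def counter_increase(samples):
    """Total increase of a counter; samples are (timestamp_s, value), oldest first."""
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter reset (e.g. a restart), so count from zero.
        increase += cur if cur < prev else cur - prev
    return increase


def simple_rate(samples):
    """Per-second increase over the window covered by the samples."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


def error_ratio_alert(error_samples, total_samples, threshold=0.05):
    """True when errors / total exceeds the threshold over the sampled window.

    Roughly what a PromQL expression such as
      sum(rate(http_requests_total{status=~"5.."}[5m]))
        / sum(rate(http_requests_total[5m])) > 0.05
    evaluates (the metric name is an assumption).
    """
    total_rate = simple_rate(total_samples)
    if total_rate == 0:
        return False
    return simple_rate(error_samples) / total_rate > threshold


def histogram_quantile(q, buckets):
    """Estimate a quantile (0 < q <= 1) from cumulative buckets.

    buckets: (upper_bound, cumulative_count) pairs sorted by bound,
    ending with the +Inf bucket.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # Quantile falls in +Inf: report the highest finite bound.
                return prev_bound
            # Assume observations are spread evenly inside the bucket.
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return math.nan
```

Working on counter values directly, without the reset handling in `counter_increase`, is the classic pitfall: a restart would appear as a large negative change instead of a fresh start from zero.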
Instrument the demo application with an OpenTelemetry (OTel) SDK, set meaningful resource attributes, and export data over the OpenTelemetry Protocol (OTLP) to a Collector pipeline that you configure (receivers → processors → exporters). You will visualize traces in Grafana/Tempo and learn to jump from a “hot” metric panel directly to the related spans using exemplars. Along the way, you will validate pipeline health, add attributes and batching, and practice root-cause analysis on induced failures. The module closes with next steps: label management, Service Level Objectives (SLOs) and burn rates, and retention/export strategies for production environments.
What's included
4 videos, 1 reading, 1 assignment, 2 peer reviews
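
As a rough illustration of the receivers → processors → exporters shape described in this module, the sketch below builds a Collector configuration as a Python dictionary and checks that every component a pipeline references is actually defined. The endpoint `tempo:4317`, the `lab` environment value, and both function names are assumptions for illustration. The `resource` processor is assumed to be available, which typically means running the contrib distribution of the Collector.

```python
import json


def build_collector_config(tempo_endpoint="tempo:4317"):
    """Return a Collector config dict: receivers -> processors -> exporters."""
    return {
        # Accept OTLP from the instrumented app over gRPC and HTTP.
        "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
        "processors": {
            # Group telemetry into batches before export.
            "batch": {},
            # Stamp every resource with an environment attribute (assumed value).
            "resource": {
                "attributes": [
                    {"key": "deployment.environment", "value": "lab", "action": "upsert"}
                ]
            },
        },
        # Forward traces over OTLP to a hypothetical Tempo endpoint.
        "exporters": {"otlp": {"endpoint": tempo_endpoint, "tls": {"insecure": True}}},
        "service": {
            "pipelines": {
                "traces": {
                    "receivers": ["otlp"],
                    "processors": ["resource", "batch"],
                    "exporters": ["otlp"],
                }
            }
        },
    }


def undefined_components(config):
    """List pipeline references that have no matching top-level definition."""
    missing = []
    for name, pipeline in config["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            for component in pipeline.get(kind, []):
                if component not in config.get(kind, {}):
                    missing.append(f"{name}: {kind}/{component}")
    return missing
```

Running `print(json.dumps(build_collector_config(), indent=2))` prints the configuration. Because JSON is largely a subset of YAML, the output should be usable as a Collector config file, though that is worth verifying against the Collector version in your lab. A pipeline that references an undefined component is a common reason the Collector refuses to start, which is what `undefined_components` is meant to catch early.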







